Docker Best Practices for Full-Stack Applications

From development to production: containerization strategies that ensure consistency, security, and scalability. Lessons from deploying EduFly on AWS.

Harsh Rastogi
Dec 20, 2024 | Updated Jan 10, 2026 | 14 min read
Tags: Docker, DevOps, Deployment, Kubernetes, AWS

Why Docker Changed Everything for EduFly

When I built EduFly — an AI-powered School ERP — the deployment story was messy. My local machine ran Node 18, the staging server had Node 16, and the production VM had a random version of npm that broke our postinstall scripts. Teachers couldn't access the attendance system because a production deploy failed silently due to a missing native dependency.

Docker solved all of that. One Dockerfile, one image, one behavior — everywhere. It helped us achieve 99.9% uptime on AWS and I haven't looked back since.

These containerization patterns have followed me to Asynq.ai and Modelia.ai, where reliability is non-negotiable — our Shopify merchants depend on AI features working 24/7 during their peak sales hours.

Multi-Stage Builds

Multi-stage builds are the single most impactful Docker optimization. A naive Dockerfile copies your entire project (including node_modules, dev dependencies, source files, and tests) into the image. A multi-stage build separates the build phase from the runtime phase:

dockerfile
# === Stage 1: Build ===
FROM node:20-alpine AS builder
WORKDIR /app

# Copy package files first (layer caching)
COPY package.json package-lock.json ./
RUN npm ci

# Copy source and build
COPY tsconfig.json ./
COPY prisma ./prisma/
COPY src ./src/
RUN npx prisma generate
RUN npm run build

# === Stage 2: Production ===
FROM node:20-alpine AS production
WORKDIR /app

# Create non-root user
RUN addgroup -g 1001 appgroup && adduser -u 1001 -G appgroup -s /bin/sh -D appuser

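# Copy runtime artifacts. Note: node_modules comes from the builder, so it still
# includes the devDependencies installed by npm ci.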
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/prisma ./prisma
COPY --from=builder /app/package.json ./

# Switch to non-root user
USER appuser

EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

CMD ["node", "dist/server.js"]

This reduced our EduFly production image from 1.2GB to 180MB — an 85% reduction. Smaller images mean faster deploys, faster autoscaling, and lower storage costs.

Layer Caching Strategy

Docker builds images layer by layer, and each layer is cached. The key insight: order your Dockerfile commands from least-changing to most-changing:

dockerfile
# GOOD order — package files rarely change, so npm ci is cached
COPY package.json package-lock.json ./
RUN npm ci
COPY src ./src/
RUN npm run build

# BAD order — copying src first invalidates npm ci cache on every code change
COPY . .
RUN npm ci
RUN npm run build

At Modelia.ai, this optimization reduced our CI/CD build time from 8 minutes to 2 minutes because npm ci (the slowest step) stays cached unless our dependencies change.
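
To make the ordering rule concrete, here is a minimal TypeScript sketch of the cache behaviour under a simplified model: a layer's cache key is derived from its parent's key, its instruction text, and the content it copies in. The `Layer` type and the `cacheKeys` and `rebuiltLayers` functions are illustrative names, not Docker APIs, and the real builder considers more inputs than this.

```typescript
// Simplified model of Docker layer caching (illustrative, not Docker's implementation).
type Layer = { instruction: string; inputs: string[] }; // inputs: build-context paths the step copies

// A layer's key chains the parent key, the instruction, and the content of its inputs.
function cacheKeys(layers: Layer[], files: Record<string, string>): string[] {
  const keys: string[] = [];
  let parent = "base";
  for (const layer of layers) {
    const content = layer.inputs.map((f) => `${f}=${files[f] ?? ""}`).join(";");
    parent = `${parent}|${layer.instruction}|${content}`;
    keys.push(parent);
  }
  return keys;
}

// Returns the instructions whose cache key differs between two versions of the build context.
function rebuiltLayers(
  layers: Layer[],
  before: Record<string, string>,
  after: Record<string, string>
): string[] {
  const oldKeys = cacheKeys(layers, before);
  const newKeys = cacheKeys(layers, after);
  return layers.filter((_, i) => oldKeys[i] !== newKeys[i]).map((l) => l.instruction);
}

const goodOrder: Layer[] = [
  { instruction: "COPY package.json package-lock.json ./", inputs: ["package.json", "package-lock.json"] },
  { instruction: "RUN npm ci", inputs: [] },
  { instruction: "COPY src ./src/", inputs: ["src"] },
  { instruction: "RUN npm run build", inputs: [] },
];
const badOrder: Layer[] = [
  { instruction: "COPY . .", inputs: ["package.json", "package-lock.json", "src"] },
  { instruction: "RUN npm ci", inputs: [] },
  { instruction: "RUN npm run build", inputs: [] },
];

const before: Record<string, string> = { "package.json": "v1", "package-lock.json": "v1", src: "console.log(1)" };
const after: Record<string, string> = { ...before, src: "console.log(2)" }; // only application code changed

const rebuiltGood = rebuiltLayers(goodOrder, before, after);
const rebuiltBad = rebuiltLayers(badOrder, before, after);
```

In this model a source-only change rebuilds two layers in the good order (the `COPY src` step and the build), while the bad order rebuilds all three, including `RUN npm ci`.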

Docker Compose for Development

A well-structured docker-compose.yml makes developer onboarding take minutes instead of hours. When a new engineer joins Modelia.ai, they run one command (docker compose up) against a file like this:

yaml
# No top-level "version" key: Compose v2 treats it as obsolete and ignores it

services:
  api:
    build:
      context: .
      dockerfile: Dockerfile.dev
    ports:
      - "3000:3000"
    volumes:
      - ./src:/app/src  # Hot reload
      - /app/node_modules  # Don't override node_modules
    environment:
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/modelia
      - REDIS_URL=redis://redis:6379
      - SHOPIFY_API_KEY=test_key
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started

  db:
    image: postgres:16-alpine
    ports:
      - "5432:5432"
    environment:
      POSTGRES_DB: modelia
      POSTGRES_PASSWORD: postgres
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

  mailhog:
    image: mailhog/mailhog
    ports:
      - "1025:1025"  # SMTP
      - "8025:8025"  # Web UI

volumes:
  postgres_data:
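
On the application side, it can help to validate the variables that Compose injects before the API starts, so a misconfigured container fails at startup rather than on the first request. The following is a minimal sketch under that assumption; `AppConfig` and `loadConfig` are hypothetical names, not taken from the Modelia.ai codebase.

```typescript
// Sketch: validate the environment the Compose file provides to the api service.
// `AppConfig` and `loadConfig` are illustrative names, not part of the original project.
type AppConfig = { databaseUrl: string; redisUrl: string; shopifyApiKey: string };
type Env = Record<string, string | undefined>;

function loadConfig(env: Env): AppConfig {
  const required = ["DATABASE_URL", "REDIS_URL", "SHOPIFY_API_KEY"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Fail fast: the container exits immediately with a message naming what is missing.
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    databaseUrl: env.DATABASE_URL as string,
    redisUrl: env.REDIS_URL as string,
    shopifyApiKey: env.SHOPIFY_API_KEY as string,
  };
}

// Same values as the `environment` block of the api service above.
const devEnv: Env = {
  DATABASE_URL: "postgresql://postgres:postgres@db:5432/modelia",
  REDIS_URL: "redis://redis:6379",
  SHOPIFY_API_KEY: "test_key",
};
const config = loadConfig(devEnv);
```

In a real entry point you would pass `process.env` instead of `devEnv`; taking the environment as a parameter keeps the function easy to test.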

Security Best Practices

At Bharat Electronics Limited (BEL), building frontend interfaces for the Indian Air Force taught me deployment discipline that I apply everywhere. In defence, even a frontend bug that exposes wrong data is a serious incident. While commercial systems don't have those stakes, the same rigorous deployment principles produce more reliable software:

1. Never Run as Root

dockerfile
RUN addgroup -g 1001 appgroup && adduser -u 1001 -G appgroup -s /bin/sh -D appuser
USER appuser

2. Pin Image Versions

dockerfile
# BAD — "latest" today might be different tomorrow
FROM node:latest

# GOOD: pinned to a specific Node and Alpine version tag (append @sha256:<digest> to pin exact image content)
FROM node:20.11-alpine3.19
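
A rule like this can be checked automatically. The sketch below classifies a FROM reference by how strictly it is pinned; `classifyImageRef` and the `Pinning` type are hypothetical helpers written for illustration, not part of Docker or any existing linter.

```typescript
// Sketch: classify how strictly an image reference is pinned.
// `classifyImageRef` is an illustrative helper, not a Docker or linter API.
type Pinning = "floating" | "tag" | "digest";

function classifyImageRef(ref: string): Pinning {
  // A digest reference (name@sha256:...) identifies exact image content.
  if (ref.includes("@sha256:")) return "digest";
  // Look for a tag only in the last path segment, so a registry port
  // such as "localhost:5000/node" is not mistaken for a tag.
  const lastSegment = ref.slice(ref.lastIndexOf("/") + 1);
  const colon = lastSegment.indexOf(":");
  const tag = colon === -1 ? "" : lastSegment.slice(colon + 1);
  // With no tag, Docker falls back to "latest", which can change over time.
  if (tag === "" || tag === "latest") return "floating";
  return "tag";
}

const results = {
  latest: classifyImageRef("node:latest"),
  untagged: classifyImageRef("localhost:5000/node"),
  versioned: classifyImageRef("node:20.11-alpine3.19"),
  digest: classifyImageRef("node:20.11-alpine3.19@sha256:<digest>"),
};
```

Keep in mind that a version tag is still mutable on the registry side, so only the digest form guarantees the same bytes on every pull.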

3. Scan Images for Vulnerabilities

We run Trivy in our CI pipeline at Modelia.ai:

yaml
- name: Scan Docker image
  run: |
    trivy image --severity HIGH,CRITICAL --exit-code 1 modelia-api:latest

4. Use .dockerignore

node_modules
.git
.env
.env.*
*.md
tests
coverage
.github

This prevents secrets and unnecessary files from being included in the build context (or accidentally ending up in the image).

5. No Secrets in Images

dockerfile
# BAD — secret is baked into the image layer (visible with docker history)
ENV API_KEY=sk-secret-key-123

# GOOD — secrets injected at runtime
CMD ["node", "dist/server.js"]
# Secrets provided via: docker run -e API_KEY=sk-secret-key-123
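
One way to enforce this in CI is a small check that flags ENV or ARG instructions that appear to hardcode a secret. The sketch below is a heuristic written for illustration: `findHardcodedSecrets` and its name pattern are assumptions, it only handles the `NAME=value` form, and it is not a substitute for a dedicated secret scanner.

```typescript
// Sketch: flag ENV/ARG instructions that appear to bake a secret into the image.
// The name pattern is a heuristic chosen for illustration, not an exhaustive rule set.
const SECRET_NAME = /(SECRET|TOKEN|PASSWORD|API_KEY|PRIVATE_KEY)/i;

function findHardcodedSecrets(dockerfile: string): string[] {
  const findings: string[] = [];
  for (const rawLine of dockerfile.split("\n")) {
    const line = rawLine.trim();
    // Match "ENV NAME=value" or "ARG NAME=value"; a bare "ARG NAME" carries no value.
    const match = /^(ENV|ARG)\s+([A-Za-z_][A-Za-z0-9_]*)=(.+)$/i.exec(line);
    if (match && SECRET_NAME.test(match[2])) {
      // Report the instruction and variable name only, never the value itself.
      findings.push(`${match[1].toUpperCase()} ${match[2]}`);
    }
  }
  return findings;
}

const badDockerfile = 'FROM node:20-alpine\nENV API_KEY=sk-secret-key-123\nCMD ["node", "dist/server.js"]';
const goodDockerfile = 'FROM node:20-alpine\nENV NODE_ENV=production\nCMD ["node", "dist/server.js"]';

const badFindings = findHardcodedSecrets(badDockerfile);
const goodFindings = findHardcodedSecrets(goodDockerfile);
```

A CI step could fail the build whenever the returned list is non-empty.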

Health Checks

Every production container at Modelia.ai and previously at Asynq.ai includes a health check. This isn't optional — without health checks, your orchestrator (ECS, Kubernetes) can't tell if your container is alive but non-functional:

typescript
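// health.ts: a comprehensive health check endpoint.
// Assumed setup (not shown in the original snippet): Express, Prisma Client, and ioredis.
import express from 'express';
import { PrismaClient } from '@prisma/client';
import Redis from 'ioredis';

const app = express();
const prisma = new PrismaClient();
const redis = new Redis(process.env.REDIS_URL ?? 'redis://localhost:6379');

app.get('/health', async (_req, res) => {
  const checks = {
    database: false,
    redis: false,
    uptime: process.uptime(),
    memory: process.memoryUsage(),
  };

  try {
    // Cheapest possible query: proves the connection works end to end
    await prisma.$queryRaw`SELECT 1`;
    checks.database = true;
  } catch (e) { /* database is down */ }

  try {
    await redis.ping();
    checks.redis = true;
  } catch (e) { /* redis is down */ }

  // 503 tells the orchestrator the container is up but cannot serve traffic
  const healthy = checks.database && checks.redis;
  res.status(healthy ? 200 : 503).json(checks);
});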
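
This endpoint is what the HEALTHCHECK instruction in the Dockerfile polls. To show how the flags used there (--start-period and --retries) turn individual probe results into a container status, here is a simplified TypeScript model. The `Probe` type and `healthStatus` function are illustrative names, and the timing is approximated rather than copied from Docker's implementation.

```typescript
// Simplified model of how Docker turns HEALTHCHECK probe results into a container status.
// It follows the documented roles of --start-period and --retries; timing is approximated.
type Probe = { atSeconds: number; ok: boolean };
type Status = "starting" | "healthy" | "unhealthy";

function healthStatus(probes: Probe[], startPeriodSeconds: number, retries: number): Status {
  let status: Status = "starting";
  let consecutiveFailures = 0;
  let hasSucceeded = false;
  for (const probe of probes) {
    if (probe.ok) {
      // Any success marks the container healthy and resets the failure streak.
      hasSucceeded = true;
      consecutiveFailures = 0;
      status = "healthy";
      continue;
    }
    // Failures inside the start period are not counted until the first success.
    const inGracePeriod = probe.atSeconds <= startPeriodSeconds && !hasSucceeded;
    if (inGracePeriod) continue;
    consecutiveFailures += 1;
    if (consecutiveFailures >= retries) status = "unhealthy";
  }
  return status;
}

// Flags from the Dockerfile: --interval=30s --start-period=10s --retries=3
const twoFailures = healthStatus(
  [{ atSeconds: 30, ok: false }, { atSeconds: 60, ok: false }], 10, 3);
const threeFailures = healthStatus(
  [{ atSeconds: 30, ok: false }, { atSeconds: 60, ok: false }, { atSeconds: 90, ok: false }], 10, 3);
const recovered = healthStatus(
  [{ atSeconds: 30, ok: true }, { atSeconds: 60, ok: false }, { atSeconds: 90, ok: true }], 10, 3);
```

With a 30-second interval and three retries, a container that stops responding is marked unhealthy only after the third consecutive failed probe, which is worth keeping in mind when choosing these values.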

Production Deployment with ECS

At Modelia.ai, we deploy to AWS ECS Fargate. The workflow:

  • GitHub Actions builds the Docker image on PR merge
  • Trivy scans for vulnerabilities
  • ECR push — image pushed to Amazon Elastic Container Registry
  • ECS rolling update — new tasks start with the new image, old tasks drain connections and stop
  • Health check verification — ECS waits for the health check to pass before routing traffic

This gives us zero-downtime deployments. A deploy at Modelia.ai takes about 3 minutes from merge to live.
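
The zero-downtime property comes from the ordering of the last two steps: a new task must be healthy before an old one is stopped. The following TypeScript sketch models that ordering for a service with a fixed desired count. It is a simplified illustration with hypothetical names (`Snapshot`, `rollingUpdate`), not a description of the ECS scheduler or of our deployment settings.

```typescript
// Simplified model of a rolling update: a new task becomes healthy first,
// then one old task drains and stops. Illustration only, not ECS's scheduler.
type Snapshot = { step: string; oldTasks: number; newTasks: number };

function rollingUpdate(desiredCount: number): Snapshot[] {
  const timeline: Snapshot[] = [];
  let oldTasks = desiredCount;
  let newTasks = 0;
  while (oldTasks > 0) {
    newTasks += 1; // new task started and passed its health check
    timeline.push({ step: "new task healthy", oldTasks, newTasks });
    oldTasks -= 1; // one old task drained its connections and stopped
    timeline.push({ step: "old task stopped", oldTasks, newTasks });
  }
  return timeline;
}

const timeline = rollingUpdate(3);
// Lowest number of tasks able to serve traffic at any point in the rollout.
const minCapacity = Math.min(...timeline.map((s) => s.oldTasks + s.newTasks));
```

For a desired count of three, capacity in this model alternates between four and three tasks and never drops below three, at the cost of briefly running one extra task.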

Debugging Containers

When things go wrong in production, you need to be able to investigate without SSH. Useful patterns:

bash
# View logs from a running container
docker logs --tail 100 -f container_name

# Execute a shell in a running container (for debugging only!)
docker exec -it container_name sh

# View resource usage
docker stats container_name

# Inspect container configuration
docker inspect container_name | jq '.[0].Config.Env'
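
Be aware that the last command prints every environment variable, including secrets injected at runtime. If that output ends up in a ticket or chat, it helps to mask sensitive values first. The sketch below assumes the `KEY=value` string format that docker inspect reports under Config.Env; `redactEnv` and its name pattern are illustrative, not an existing tool.

```typescript
// Sketch: convert the "KEY=value" strings from `docker inspect` (.Config.Env) into an
// object, masking values whose names look sensitive. The pattern is an illustrative heuristic.
const SENSITIVE = /(SECRET|TOKEN|PASSWORD|KEY)/i;

function redactEnv(entries: string[]): Record<string, string> {
  const result: Record<string, string> = {};
  for (const entry of entries) {
    const eq = entry.indexOf("="); // split on the first "=" only; values may contain "="
    const name = eq === -1 ? entry : entry.slice(0, eq);
    const value = eq === -1 ? "" : entry.slice(eq + 1);
    result[name] = SENSITIVE.test(name) ? "***" : value;
  }
  return result;
}

const sample = ["NODE_ENV=production", "API_KEY=sk-secret-key-123", "REDIS_URL=redis://redis:6379"];
const safe = redactEnv(sample);
```

Name-based masking misses credentials embedded in values, such as the password inside a DATABASE_URL connection string, so those would need their own rule.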

Key Takeaways

  • Multi-stage builds are essential: they cut our EduFly image size by 85% and improve security by excluding build tools from production
  • Layer ordering matters — put rarely-changing layers first for better cache hits
  • Docker Compose for development, ECS/Kubernetes for production — same Dockerfile, different orchestration
  • Security scanning should be automated in CI/CD — Trivy catches vulnerabilities before they reach production
  • Always use health checks — they're the foundation of self-healing infrastructure
  • Never run as root, never hardcode secrets, always pin versions — lessons from Bharat Electronics Limited (BEL) that apply everywhere
  • Use .dockerignore — prevent secrets and node_modules from entering your build context
  • Invest in fast builds: layer caching reduced our CI time from 8 to 2 minutes at Modelia.ai

Harsh Rastogi

Full Stack Engineer building production AI systems at Modelia. Previously at Asynq and Bharat Electronics Limited. Published researcher.
