Why CI/CD Matters
At Asynq.ai, we deployed to production multiple times per day. Our Agentic AI hiring platform was evolving rapidly — new candidate evaluation models, recruiter dashboard features, and Shopify integration updates landing daily. Without a robust CI/CD pipeline, that velocity would be impossible. A manual deployment process that takes 30 minutes and requires SSH access to production servers doesn't scale when you're shipping 5 times a day.
GitHub Actions became our tool of choice for its tight integration with our Git workflow, generous free tier, and first-class Docker support. The same pipeline architecture now powers deployments at Modelia.ai and EduFly.
The Complete Workflow
Here's the production pipeline I've refined across three companies. It runs on every PR and push to main:
```yaml
name: CI/CD Pipeline

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  NODE_VERSION: '20'
  REGISTRY: 123456789.dkr.ecr.ap-south-1.amazonaws.com
  IMAGE_NAME: modelia-api

jobs:
  # ===== Stage 1: Code Quality =====
  lint-and-typecheck:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      - run: npm ci
      - run: npm run lint
      - run: npm run type-check
      - run: npx prisma generate

  # ===== Stage 2: Tests =====
  test:
    runs-on: ubuntu-latest
    needs: lint-and-typecheck
    services:
      postgres:
        image: postgres:16-alpine
        env:
          POSTGRES_DB: test
          POSTGRES_PASSWORD: test
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
      redis:
        image: redis:7-alpine
        ports:
          - 6379:6379
    env:
      DATABASE_URL: postgresql://postgres:test@localhost:5432/test
      REDIS_URL: redis://localhost:6379
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      - run: npm ci
      - run: npx prisma migrate deploy
      - run: npm test -- --coverage --forceExit
      - uses: codecov/codecov-action@v3
        with:
          files: ./coverage/lcov.info

  # ===== Stage 3: Build & Push Docker Image =====
  build:
    runs-on: ubuntu-latest
    needs: test
    if: github.ref == 'refs/heads/main'
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/github-actions-deploy
          aws-region: ap-south-1
      - uses: aws-actions/amazon-ecr-login@v2
      - name: Build and push Docker image
        run: |
          docker build -t $REGISTRY/$IMAGE_NAME:$GITHUB_SHA -t $REGISTRY/$IMAGE_NAME:latest .
          docker push $REGISTRY/$IMAGE_NAME:$GITHUB_SHA
          docker push $REGISTRY/$IMAGE_NAME:latest

  # ===== Stage 4: Security Scan =====
  security-scan:
    runs-on: ubuntu-latest
    needs: build
    permissions:
      id-token: write
      contents: read
    steps:
      # Credentials are needed so Trivy can pull the image from the private ECR registry
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/github-actions-deploy
          aws-region: ap-south-1
      - uses: aws-actions/amazon-ecr-login@v2
      - uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
          severity: 'HIGH,CRITICAL'
          exit-code: '1'

  # ===== Stage 5: Deploy =====
  deploy:
    runs-on: ubuntu-latest
    needs: [build, security-scan]
    if: github.ref == 'refs/heads/main'
    environment: production
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/github-actions-deploy
          aws-region: ap-south-1
      - name: Deploy to ECS
        run: aws ecs update-service --cluster production --service modelia-api --force-new-deployment
      - name: Wait for deployment
        run: aws ecs wait services-stable --cluster production --services modelia-api
```

Testing Strategy
Our testing pyramid ensures confidence at every level:
Unit Tests (70% of tests)
Fast, isolated, testing individual functions and business logic:
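The tests below pin down the pricing behavior. For context, here's a hypothetical sketch of what `calculateSubscription` might look like; the professional price, the plan request limits, and the $0.01 overage rate come from the test assertions, while the starter price and the type names are assumptions:

```typescript
// Hypothetical sketch of the function under test. Values marked
// "assumed" are not stated anywhere in the tests.
type Plan = 'starter' | 'professional';

interface SubscriptionInput {
  plan: Plan;
  productCount: number; // not used in this sketch; reserved for tiering
  aiRequestsPerMonth: number;
}

interface SubscriptionResult {
  monthlyPrice: number;
  aiRequestsIncluded: number;
  overage: number;
  overageCharge: number;
}

const PLANS: Record<Plan, { monthlyPrice: number; aiRequestsIncluded: number }> = {
  starter: { monthlyPrice: 19.99, aiRequestsIncluded: 1000 }, // price assumed
  professional: { monthlyPrice: 49.99, aiRequestsIncluded: 10000 },
};

const OVERAGE_RATE = 0.01; // $0.01 per AI request beyond the plan limit

function calculateSubscription(input: SubscriptionInput): SubscriptionResult {
  const { monthlyPrice, aiRequestsIncluded } = PLANS[input.plan];
  const overage = Math.max(0, input.aiRequestsPerMonth - aiRequestsIncluded);
  return {
    monthlyPrice,
    aiRequestsIncluded,
    overage,
    overageCharge: overage * OVERAGE_RATE,
  };
}
```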
```typescript
// tests/services/pricing.test.ts
describe('PricingService', () => {
  it('calculates Shopify merchant subscription correctly', () => {
    const result = calculateSubscription({
      plan: 'professional',
      productCount: 500,
      aiRequestsPerMonth: 10000,
    });
    expect(result.monthlyPrice).toBe(49.99);
    expect(result.aiRequestsIncluded).toBe(10000);
    expect(result.overage).toBe(0);
  });

  it('applies overage charges when AI requests exceed plan limit', () => {
    const result = calculateSubscription({
      plan: 'starter',
      productCount: 100,
      aiRequestsPerMonth: 5000, // Starter plan includes 1000
    });
    expect(result.overage).toBe(4000);
    expect(result.overageCharge).toBe(40.00); // $0.01 per extra request
  });
});
```

Integration Tests (20% of tests)
Testing database interactions and API endpoints with real services. Note the PostgreSQL and Redis services in the GitHub Actions workflow — we test against real databases, not mocks:
```typescript
// tests/api/candidates.integration.test.ts
describe('POST /api/candidates', () => {
  beforeEach(async () => {
    await prisma.candidate.deleteMany();
  });

  it('creates a candidate and triggers AI evaluation', async () => {
    const response = await request(app)
      .post('/api/candidates')
      .send({
        name: 'Jane Doe',
        email: 'jane@example.com',
        resumeUrl: 'https://s3.amazonaws.com/resumes/jane.pdf',
        jobId: testJob.id,
      })
      .expect(201);

    expect(response.body.id).toBeDefined();
    expect(response.body.stage).toBe('applied');

    // Verify database record
    const candidate = await prisma.candidate.findUnique({
      where: { id: response.body.id },
    });
    expect(candidate).not.toBeNull();
    expect(candidate?.email).toBe('jane@example.com');
  });
});
```

E2E Tests (10% of tests)
Critical user paths tested with Playwright:
```typescript
// e2e/recruiter-flow.spec.ts
test('recruiter can view candidate pipeline and move to interview', async ({ page }) => {
  await page.goto('/dashboard');
  await page.click('[data-testid="candidates-tab"]');
  await expect(page.locator('.candidate-card')).toHaveCount(5);

  await page.click('.candidate-card:first-child');
  await page.click('[data-testid="schedule-interview"]');
  await page.fill('[name="interviewDate"]', '2025-02-15');
  await page.click('[data-testid="confirm-schedule"]');

  await expect(page.locator('.toast-success')).toBeVisible();
});
```

Security in CI/CD
Lessons from working at Bharat Electronics Limited (BEL), where deployment rigour for Air Force projects isn't optional, directly shaped our CI/CD security practices:
1. OIDC Instead of Long-Lived Credentials
Never store AWS access keys as GitHub Secrets. Use OIDC (OpenID Connect) for short-lived, automatically rotated credentials:
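On the AWS side, this requires an IAM role whose trust policy accepts GitHub's OIDC provider. A minimal sketch, where the account ID matches the pipeline above but the repo path is a placeholder you must adapt:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "repo:your-org/your-repo:ref:refs/heads/main"
        }
      }
    }
  ]
}
```

The `sub` condition is what scopes the role: only workflows from that repo and branch can assume it, so a compromised fork or feature branch can't deploy.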
```yaml
- uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::123456789:role/github-actions-deploy
    aws-region: ap-south-1
    # No access key! GitHub proves its identity to AWS via an OIDC token
```

2. Dependency Scanning
Every PR is automatically checked for vulnerable dependencies:
```yaml
dependency-audit:
  runs-on: ubuntu-latest
  permissions:
    security-events: write
    contents: read
  steps:
    - uses: actions/checkout@v4
    - run: npm audit --audit-level=high
    - uses: github/codeql-action/init@v3
      with:
        languages: javascript-typescript
    - uses: github/codeql-action/analyze@v3
```

3. Branch Protection
At Modelia.ai, direct pushes to main are impossible. Every change requires:
- All CI checks passing
- At least one code review approval
- No unresolved conversations
- Linear commit history (squash merge)
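These rules can be codified instead of clicked through in the UI. A hedged sketch using the GitHub CLI, with the repo path and check names as placeholders:

```bash
# Sketch: enforce branch protection on main via the GitHub REST API.
# Requires `gh auth login` with admin rights on the repo.
gh api -X PUT repos/your-org/your-repo/branches/main/protection \
  --input - <<'EOF'
{
  "required_status_checks": {
    "strict": true,
    "contexts": ["lint-and-typecheck", "test"]
  },
  "enforce_admins": true,
  "required_pull_request_reviews": {
    "required_approving_review_count": 1
  },
  "required_conversation_resolution": true,
  "required_linear_history": true,
  "restrictions": null
}
EOF
```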
4. Docker Image Scanning
Trivy runs on every built image before deployment. If a HIGH or CRITICAL vulnerability is found, the pipeline fails and the image is never deployed:
```yaml
- uses: aquasecurity/trivy-action@master
  with:
    image-ref: modelia-api:latest
    severity: 'HIGH,CRITICAL'
    exit-code: '1'        # Fail the pipeline
    ignore-unfixed: true  # Don't fail on vulnerabilities without patches
```

Deployment Strategies
Rolling Updates (used at Modelia.ai)
New tasks start alongside old tasks. As new tasks pass health checks, traffic shifts to them. Old tasks drain connections and shut down:
- Zero downtime
- Gradual rollout — if the new version has issues, only a fraction of traffic is affected
- Automatic rollback on health check failure
- Takes 3-5 minutes for a full fleet rotation
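In ECS terms, this behavior is governed by the service's deployment configuration. A sketch using the cluster and service names from the pipeline above; the percentage thresholds are assumptions to adapt to your capacity headroom:

```bash
# Let ECS start new tasks up to 200% of the desired count while always
# keeping at least 100% healthy; the circuit breaker rolls back
# automatically if new tasks keep failing health checks.
aws ecs update-service \
  --cluster production \
  --service modelia-api \
  --deployment-configuration \
    "maximumPercent=200,minimumHealthyPercent=100,deploymentCircuitBreaker={enable=true,rollback=true}"
```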
Blue/Green (used for EduFly)
Two identical environments: Blue (current) and Green (new). Deploy to Green, run smoke tests, then switch the load balancer:
- Instant switchover
- Easy rollback — just switch back to Blue
- Higher cost (two environments running during deploy)
- Better for major version upgrades where gradual rollout is risky
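With an Application Load Balancer in front, the switchover itself is a single listener update. A hedged sketch; the ARNs are placeholders:

```bash
# Sketch: after smoke tests pass on Green, point the production
# listener at the Green target group. Rollback is the same command
# with the Blue target group ARN.
aws elbv2 modify-listener \
  --listener-arn "$LISTENER_ARN" \
  --default-actions "Type=forward,TargetGroupArn=$GREEN_TG_ARN"
```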
Build Caching
Docker layer caching in GitHub Actions can dramatically speed up builds. At Modelia.ai, this reduced our build step from 8 minutes to 2 minutes:
```yaml
- uses: docker/setup-buildx-action@v3 # Buildx is required for the gha cache backend
- uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
    cache-from: type=gha
    cache-to: type=gha,mode=max
```

Monitoring Post-Deploy
After every deployment at Modelia.ai, we automatically:
- Run smoke tests against production — hit critical endpoints and verify 200 responses
- Check error rates in CloudWatch — compare the error rate in the 5 minutes after deploy vs. the 5 minutes before
- Verify API response times — if p99 latency increases by more than 50%, trigger a rollback alert
- Send a Slack notification with a deployment summary — who deployed, what commit, link to the diff
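The error-rate comparison can be sketched as a small script over CloudWatch metrics. The namespace, metric name, and dimension here are assumptions for an Application Load Balancer setup; adapt them to whatever emits your error metric:

```bash
# Sketch: compare 5xx counts in the 5 minutes before vs. after a deploy.
# $ALB_NAME is a placeholder for your load balancer's dimension value.
now=$(date -u +%Y-%m-%dT%H:%M:%SZ)
before=$(date -u -d '-10 minutes' +%Y-%m-%dT%H:%M:%SZ)
mid=$(date -u -d '-5 minutes' +%Y-%m-%dT%H:%M:%SZ)

errors() {
  aws cloudwatch get-metric-statistics \
    --namespace AWS/ApplicationELB \
    --metric-name HTTPCode_Target_5XX_Count \
    --dimensions Name=LoadBalancer,Value="$ALB_NAME" \
    --start-time "$1" --end-time "$2" \
    --period 300 --statistics Sum \
    --query 'Datapoints[0].Sum' --output text
}

pre=$(errors "$before" "$mid")
post=$(errors "$mid" "$now")
echo "5xx before deploy: $pre, after: $post"
```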
```yaml
post-deploy-verification:
  needs: deploy
  runs-on: ubuntu-latest
  steps:
    - name: Smoke test
      run: |
        for endpoint in /health /api/products /api/recommendations; do
          status=$(curl -s -o /dev/null -w "%{http_code}" "https://api.modelia.ai$endpoint")
          if [ "$status" != "200" ]; then
            echo "Smoke test failed: $endpoint returned $status"
            exit 1
          fi
        done
    - name: Notify Slack
      uses: slackapi/slack-github-action@v1
      env:
        SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
      with:
        payload: |
          {
            "text": "Deployed to production",
            "blocks": [
              {
                "type": "section",
                "text": {
                  "type": "mrkdwn",
                  "text": "*Deployment successful*\nCommit: ${{ github.sha }}\nAuthor: ${{ github.actor }}"
                }
              }
            ]
          }
```

Key Takeaways
- CI/CD should run on every PR, not just main — catch problems before they're merged
- Test against real databases in CI — mocks hide integration bugs (a lesson from Asynq.ai)
- Use OIDC for AWS credentials — never store long-lived access keys, a security lesson from BEL
- Scan Docker images before deploying — Trivy catches vulnerabilities before they reach production
- Use Docker layer caching — it reduced our build time from 8 to 2 minutes at Modelia.ai
- Always have automated rollback capability — if post-deploy metrics degrade, roll back automatically
- Post-deploy verification is not optional — smoke tests and metric comparison after every deployment
- Branch protection enforces process — required reviews, passing checks, and squash merges keep main clean
- Invest in your pipeline early — at EduFly, setting up CI/CD on day one saved hundreds of hours over the project lifetime
