Backend Development

Redis Caching Strategies for High-Performance APIs

Implementing effective caching patterns that reduced our API response times by 90% at Asynq.ai. Learn cache invalidation, TTL strategies, and common pitfalls.

Harsh Rastogi
Nov 15, 2024 · 11 min
Redis · Caching · Performance · Backend · Node.js

The Caching Imperative

At Asynq.ai, our Agentic AI hiring platform processed thousands of candidate evaluations daily. Each evaluation involved fetching candidate profiles, assessment scores, interview feedback, and AI-generated compatibility ratings. Without caching, every dashboard load made 15+ database queries and 3 AI model inference calls. The result: 1.2-second page loads that frustrated recruiters.

Redis caching brought that down to 120ms — a 10x improvement. Here's exactly how we did it, and how I've refined these patterns at Modelia.ai where we cache Shopify product catalogs, AI recommendation results, and merchant configuration.

Cache-Aside (Lazy Loading)

The most common and safest pattern — check cache first, fall back to database, populate cache on miss:

typescript
import Redis from 'ioredis';
import { prisma } from './db'; // wherever your Prisma client instance lives

const redis = new Redis(process.env.REDIS_URL);

async function getCandidateProfile(id: string): Promise<CandidateProfile> {
  const cacheKey = `candidate:${id}`;

  // Step 1: Check cache
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  // Step 2: Cache miss — fetch from database
  const candidate = await prisma.candidate.findUnique({
    where: { id },
    include: {
      assessmentScores: true,
      interviews: { include: { interviewer: true } },
      aiEvaluation: true,
    },
  });

  if (!candidate) throw new NotFoundError('Candidate not found');

  // Step 3: Populate cache with TTL
  await redis.set(cacheKey, JSON.stringify(candidate), 'EX', 3600); // 1 hour

  return candidate;
}

This pattern is safe because a cache miss always falls back to the source of truth. The worst case is a slower response; staleness is bounded by the TTL and never served indefinitely.

Write-Through Cache

For data that must be immediately consistent after writes — like when a recruiter updates a candidate's status at Asynq.ai:

typescript
async function updateCandidateStatus(id: string, status: CandidateStatus): Promise<Candidate> {
  // Step 1: Update database (source of truth)
  const candidate = await prisma.candidate.update({
    where: { id },
    data: { status, updatedAt: new Date() },
    include: { assessmentScores: true, interviews: true },
  });

  // Step 2: Update cache immediately
  const cacheKey = `candidate:${id}`;
  await redis.set(cacheKey, JSON.stringify(candidate), 'EX', 3600);

  // Step 3: Invalidate any list caches that include this candidate.
  // Note: KEYS is O(N) and blocks Redis; fine at small scale, but prefer
  // SCAN (or tracking list keys in a Redis set) in production.
  const listKeys = await redis.keys('candidates:list:*');
  if (listKeys.length > 0) {
    await redis.del(...listKeys);
  }

  return candidate;
}

At Modelia.ai, we use write-through for merchant settings and Shopify product data — when a merchant updates their AI preferences or a Shopify webhook notifies us of a product change, the cache is updated atomically with the database.

TTL Strategy

Getting TTL (Time To Live) right is one of the most impactful caching decisions. Too short and you lose the benefits; too long and you serve stale data. At Modelia.ai, our TTL strategy is based on data volatility and business requirements:

typescript
// TTL configuration — centralized for consistency
const CACHE_TTL = {
  // Rarely changes, safe to cache long
  merchantConfig: 24 * 60 * 60,    // 24 hours
  userProfile: 60 * 60,            // 1 hour
  productCatalog: 30 * 60,         // 30 minutes

  // Changes frequently, short TTL
  aiRecommendations: 5 * 60,       // 5 minutes
  searchResults: 2 * 60,           // 2 minutes
  dashboardStats: 60,              // 1 minute

  // Security-sensitive, moderate TTL
  sessionData: 24 * 60 * 60,       // 24 hours
  rateLimits: 60,                   // 1 minute
  authTokenBlacklist: 7 * 24 * 60 * 60, // 7 days
} as const;

async function cacheSet(key: string, value: unknown, ttlKey: keyof typeof CACHE_TTL) {
  await redis.set(key, JSON.stringify(value), 'EX', CACHE_TTL[ttlKey]);
}
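
One refinement worth layering on top of the table above: a little random jitter on each TTL, so keys written in the same burst don't all expire in the same instant and synchronize their misses. A minimal sketch (the helper name is ours, not from our production code):

```typescript
// Spread a base TTL by ±10% so keys cached together don't expire together,
// which would otherwise line up cache misses into a thundering herd.
function jitteredTtl(baseSec: number, spread = 0.1): number {
  const delta = baseSec * spread;
  const jittered = baseSec + (Math.random() * 2 - 1) * delta;
  return Math.max(1, Math.round(jittered)); // never below 1 second
}

// e.g. await redis.set(key, payload, 'EX', jitteredTtl(CACHE_TTL.productCatalog));
```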

Cache Invalidation

Phil Karlton famously said there are only two hard things in computer science: cache invalidation and naming things. He was right about the first one. Here's how we handle it:

Event-Driven Invalidation

At Asynq.ai, database changes emit events that invalidate relevant cache keys:

typescript
// Event-driven cache invalidation via Prisma middleware
// (note: $use is deprecated in newer Prisma versions in favor of client extensions)
prisma.$use(async (params, next) => {
  const result = await next(params);

  // After any write operation, invalidate related caches
  if (['create', 'update', 'delete', 'updateMany', 'deleteMany'].includes(params.action)) {
    const model = params.model?.toLowerCase();

    if (model === 'candidate') {
      const id = params.args?.where?.id;
      if (id) await redis.del(`candidate:${id}`);
      // Also invalidate list caches (same KEYS caveat as above: prefer SCAN at scale)
      const listKeys = await redis.keys(`candidates:list:*`);
      if (listKeys.length) await redis.del(...listKeys);
    }

    if (model === 'product') {
      const id = params.args?.where?.id;
      if (id) await redis.del(`product:${id}`);
      // Invalidate merchant's product catalog cache
      const merchantId = result?.merchantId || params.args?.data?.merchantId;
      if (merchantId) await redis.del(`catalog:${merchantId}`);
    }
  }

  return result;
});

Versioned Cache Keys

For atomic cache busting when multiple related keys need to change together:

typescript
async function getCatalogVersion(merchantId: string): Promise<number> {
  const version = await redis.get(`catalog-version:${merchantId}`);
  return version ? parseInt(version, 10) : 1;
}

async function invalidateCatalog(merchantId: string): Promise<void> {
  await redis.incr(`catalog-version:${merchantId}`);
}

// Cache key includes version — when version changes, old keys naturally expire
async function getProductCatalog(merchantId: string) {
  const version = await getCatalogVersion(merchantId);
  const cacheKey = `catalog:${merchantId}:v${version}`;

  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached);

  const catalog = await fetchCatalogFromDB(merchantId);
  await redis.set(cacheKey, JSON.stringify(catalog), 'EX', 1800);
  return catalog;
}

Rate Limiting with Redis

At Modelia.ai, our Shopify extension faces the internet — it needs rate limiting to prevent abuse. Redis makes this trivial with its atomic increment and TTL:

typescript
async function rateLimit(identifier: string, limit: number, windowSec: number): Promise<boolean> {
  const key = `ratelimit:${identifier}:${Math.floor(Date.now() / (windowSec * 1000))}`;

  const current = await redis.incr(key);
  if (current === 1) {
    await redis.expire(key, windowSec);
  }

  return current <= limit;
}

// Usage in middleware
async function rateLimitMiddleware(req: Request, res: Response, next: NextFunction) {
  const merchantId = req.headers['x-merchant-id'] as string;
  const allowed = await rateLimit(merchantId, 100, 60); // 100 requests per minute

  if (!allowed) {
    return res.status(429).json({ error: 'Rate limit exceeded. Try again in 60 seconds.' });
  }

  next();
}

Session Management

We use Redis for session storage at both Asynq.ai and Modelia.ai. It's faster than database lookups and automatically handles expiration:

typescript
async function createSession(userId: string, metadata: SessionMetadata): Promise<string> {
  const sessionId = crypto.randomUUID();
  const session = {
    userId,
    createdAt: Date.now(),
    ...metadata,
  };

  await redis.set(
    `session:${sessionId}`,
    JSON.stringify(session),
    'EX',
    86400 // 24 hours
  );

  return sessionId;
}

async function getSession(sessionId: string): Promise<Session | null> {
  const data = await redis.get(`session:${sessionId}`);
  if (!data) return null;

  // Extend session on activity (sliding expiration)
  await redis.expire(`session:${sessionId}`, 86400);

  return JSON.parse(data);
}

Monitoring Cache Performance

You can't optimize what you don't measure. We track cache hit rates, miss rates, and latency:

typescript
class CacheMonitor {
  private hits = 0;
  private misses = 0;

  async get(key: string): Promise<string | null> {
    const value = await redis.get(key);
    if (value) {
      this.hits++;
    } else {
      this.misses++;
    }

    // Log stats every 1000 operations
    if ((this.hits + this.misses) % 1000 === 0) {
      const hitRate = (this.hits / (this.hits + this.misses) * 100).toFixed(1);
      console.log(`Cache hit rate: ${hitRate}% (hits: ${this.hits}, misses: ${this.misses})`);
    }

    return value;
  }
}
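
Client-side counters like the class above only see one process. Redis itself tracks `keyspace_hits` and `keyspace_misses` server-wide, readable via `INFO stats`; here is a small parser sketch (the field names are real Redis stats fields, the helper name is ours):

```typescript
// Parse keyspace_hits / keyspace_misses out of a Redis `INFO stats` payload
// (the raw string that ioredis returns from `redis.info('stats')`).
function parseHitRate(infoStats: string): number | null {
  const stats: Record<string, number> = {};
  for (const line of infoStats.split('\n')) {
    const [key, value] = line.trim().split(':');
    if (key === 'keyspace_hits' || key === 'keyspace_misses') {
      stats[key] = Number(value);
    }
  }
  const hits = stats['keyspace_hits'] ?? 0;
  const misses = stats['keyspace_misses'] ?? 0;
  const total = hits + misses;
  return total === 0 ? null : (hits / total) * 100;
}

// Usage with ioredis (assumed):
// const hitRate = parseHitRate(await redis.info('stats'));
// if (hitRate !== null && hitRate < 80) console.warn('cache hit rate below target');
```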

Results at Scale

After implementing Redis caching across both companies:

At Asynq.ai:

  • API response time: 1.2s to 120ms (90% reduction)
  • Database load: Reduced by 45%
  • Cache hit rate: 87% across all endpoints
  • Dashboard load time: Under 200ms consistently

At Modelia.ai:

  • Product catalog API: 400ms to 15ms (96% reduction)
  • AI recommendation serving: 2.5s to 180ms (cache warm)
  • Shopify webhook processing: 800ms to 120ms with cached merchant config
  • Monthly database costs reduced by 30% due to lower query volume

Common Pitfalls

From experience at Bharat Electronics Limited (BEL), Asynq.ai, and Modelia.ai, here are the caching mistakes I've learned to avoid:

  • Cache stampede — When a popular cache key expires, hundreds of requests simultaneously hit the database. Use mutex locks or stale-while-revalidate patterns.
  • Caching errors — If a database query fails, don't cache the error response. Future requests will get the cached error instead of retrying.
  • Over-caching — Not everything needs caching. If a query takes 5ms and runs 10 times per minute, caching adds complexity without meaningful benefit.
  • Unbounded cache growth — Always set TTLs. A cache without TTL is a memory leak.
  • Inconsistent invalidation — If you update data in one place but forget to invalidate the cache, users see stale data. Event-driven invalidation via Prisma middleware solves this.
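
The first pitfall above can be handled with a short-lived Redis lock: on a miss, only the caller that wins a `SET ... NX` lock rebuilds the value, while the rest briefly poll the cache. This is a sketch under assumptions (the `RedisLike` interface and function names are ours; an ioredis client matches this shape at runtime), not our exact production code:

```typescript
// Minimal client shape so the sketch stays dependency-free.
interface RedisLike {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ...args: (string | number)[]): Promise<string | null>;
  del(key: string): Promise<number>;
}

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

// Cache-aside with stampede protection: on a miss, only the caller that
// wins the NX lock rebuilds; everyone else polls the cache briefly.
async function getWithStampedeLock<T>(
  redis: RedisLike,
  key: string,
  loader: () => Promise<T>,
  ttlSec: number,
): Promise<T> {
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  // SET key value EX 10 NX: lock auto-expires if the holder crashes.
  const gotLock = await redis.set(`lock:${key}`, '1', 'EX', 10, 'NX');
  if (gotLock === 'OK') {
    try {
      const value = await loader();
      await redis.set(key, JSON.stringify(value), 'EX', ttlSec);
      return value;
    } finally {
      await redis.del(`lock:${key}`);
    }
  }

  // Lost the race: wait for the winner to populate the cache.
  for (let i = 0; i < 50; i++) {
    await sleep(100);
    const value = await redis.get(key);
    if (value) return JSON.parse(value);
  }
  return loader(); // give up waiting; fall back to the source of truth
}
```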

Key Takeaways

  • Start with cache-aside — it's the simplest and safest pattern. You can always add write-through later.
  • Set TTLs based on data volatility, not arbitrary values. Centralize TTL configuration.
  • Monitor cache hit rates — below 80% means your strategy needs tuning. Above 90% is excellent.
  • Event-driven invalidation via Prisma middleware prevents stale data across your entire application.
  • Use Redis for more than caching — rate limiting, session management, and pub/sub are equally valuable.
  • Version your cache keys for atomic invalidation of related data.
  • Never cache errors — only cache successful responses.
  • Use Redis Cluster for horizontal scaling when single-node memory limits are reached.
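
On the pub/sub point: when several API instances each keep a small in-process cache, a Redis channel can broadcast invalidations so every instance drops its local copy. A sketch, with the channel name and message shape as illustrative assumptions rather than our actual setup:

```typescript
// Invalidation message broadcast on the channel (illustrative shape).
interface InvalidationMsg {
  model: string;
  id: string;
}

const CHANNEL = 'cache:invalidate';

function encodeInvalidation(model: string, id: string): string {
  const msg: InvalidationMsg = { model, id };
  return JSON.stringify(msg);
}

function decodeInvalidation(raw: string): InvalidationMsg | null {
  try {
    const msg = JSON.parse(raw);
    return typeof msg.model === 'string' && typeof msg.id === 'string' ? msg : null;
  } catch {
    return null; // malformed payloads are ignored, never crash the subscriber
  }
}

// Wiring with ioredis (pub/sub needs a dedicated subscriber connection):
// const pub = new Redis(process.env.REDIS_URL);
// const sub = new Redis(process.env.REDIS_URL);
// const localCache = new Map<string, unknown>();
//
// await sub.subscribe(CHANNEL);
// sub.on('message', (_channel, raw) => {
//   const msg = decodeInvalidation(raw);
//   if (msg) localCache.delete(`${msg.model}:${msg.id}`);
// });
//
// // After a write, broadcast so every instance drops its local copy:
// await pub.publish(CHANNEL, encodeInvalidation('candidate', 'abc-123'));
```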

Harsh Rastogi

Full Stack Engineer building production AI systems at Modelia. Previously at Asynq and Bharat Electronics Limited. Published researcher.