AI & Machine Learning

Building Agentic AI Systems: Real-World Patterns from Production

How we built autonomous AI agents at Asynq.ai and Modelia.ai that reason, plan, and execute multi-step tasks. Practical patterns for tool use, memory, orchestration, and guardrails in production agentic systems.

Harsh Rastogi
Mar 15, 2026 · 14 min
Agentic AI · AI Systems · TypeScript · Node.js · LLM

What Makes AI "Agentic"?

The term "Agentic AI" gets thrown around a lot, but after building agentic systems at both Asynq.ai and Modelia.ai, I've developed a practical definition: an agentic system is one that can autonomously decompose a goal into steps, use tools to execute those steps, observe results, and adapt its plan accordingly — all without a human in the loop for each decision.

At Asynq.ai, we built AI agents that could autonomously evaluate job candidates — reading resumes, generating assessment questions tailored to the role, conducting multi-turn interviews, and producing structured evaluation reports. At Modelia.ai, our agents orchestrate complex fashion workflows — selecting models, generating AI images, running quality checks, and iterating until the output meets brand guidelines.

The gap between a chatbot and an agent is enormous. A chatbot responds to a single prompt. An agent pursues a goal across multiple steps, potentially calling dozens of tools, handling errors gracefully, and maintaining context across a long-running task.

The Core Architecture

Every agentic system I've built follows this fundamental loop:

┌─────────────┐
│   Observe    │ ← Gather context (user input, tool results, memory)
└──────┬──────┘
       ▼
┌─────────────┐
│    Think     │ ← LLM reasons about what to do next
└──────┬──────┘
       ▼
┌─────────────┐
│     Act      │ ← Execute a tool, API call, or generate output
└──────┬──────┘
       ▼
┌─────────────┐
│   Evaluate   │ ← Check if goal is met or if replanning is needed
└──────┬──────┘
       ▼
       └──── Loop back to Observe (or exit if done)

This is often called the ReAct (Reasoning + Acting) pattern. Here's how we implement it in TypeScript:

```typescript
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

interface AgentState {
  goal: string;
  context: Message[];
  toolResults: ToolResult[];
  plan: string[];
  currentStep: number;
  maxIterations: number;
}

interface Tool {
  name: string;
  description: string;
  parameters: z.ZodSchema;
  execute: (params: unknown) => Promise<ToolResult>;
}

async function agentLoop(state: AgentState, tools: Tool[]): Promise<AgentResult> {
  let iteration = 0;

  while (iteration < state.maxIterations) {
    // 1. Observe — build the prompt with current context
    const prompt = buildPrompt(state);

    // 2. Think — ask the LLM what to do next
    const response = await llm.complete({
      messages: prompt,
      tools: tools.map(t => ({
        name: t.name,
        description: t.description,
        parameters: zodToJsonSchema(t.parameters),
      })),
    });

    // 3. Act — execute tool calls or return final answer
    if (response.toolCalls.length > 0) {
      // Record the assistant's tool-call message first; most chat APIs
      // require it in the history before the matching tool results
      state.context.push({
        role: 'assistant',
        content: response.content,
        toolCalls: response.toolCalls,
      });

      for (const call of response.toolCalls) {
        const tool = tools.find(t => t.name === call.name);
        if (!tool) throw new AgentError(`Unknown tool: ${call.name}`);

        const result = await executeTool(tool, call.arguments);
        state.toolResults.push(result);
        state.context.push({
          role: 'tool',
          content: JSON.stringify(result),
          toolCallId: call.id,
        });
      }
    } else {
      // No tool calls — agent is done
      return { success: true, output: response.content, iterations: iteration };
    }

    // 4. Evaluate — check for completion or errors
    if (shouldTerminate(state)) {
      return { success: true, output: summarize(state), iterations: iteration };
    }

    iteration++;
  }

  return { success: false, output: 'Max iterations reached', iterations: iteration };
}
```
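The `shouldTerminate` check in the loop is left abstract. One minimal sketch, under my own assumptions (the consecutive-failure threshold and the plan-exhaustion heuristic are illustrative, not a prescribed implementation):

```typescript
// Hypothetical termination check: stop when the plan is exhausted, or bail
// out early if the last few tool calls all failed (the agent is likely stuck).
interface ToolResult {
  success: boolean;
  data?: unknown;
  error?: string;
}

interface AgentState {
  toolResults: ToolResult[];
  plan: string[];
  currentStep: number;
}

const MAX_CONSECUTIVE_FAILURES = 3; // assumed threshold

function shouldTerminate(state: AgentState): boolean {
  // Done if every planned step has been executed
  if (state.plan.length > 0 && state.currentStep >= state.plan.length) {
    return true;
  }
  // Bail out if the agent keeps failing rather than making progress
  const recent = state.toolResults.slice(-MAX_CONSECUTIVE_FAILURES);
  return (
    recent.length === MAX_CONSECUTIVE_FAILURES &&
    recent.every(r => !r.success)
  );
}
```

Whatever heuristic you pick, keep it cheap: it runs on every iteration of the loop.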

Tool Design: The Make-or-Break Factor

The quality of your tools determines the quality of your agent. At Asynq.ai, we learned this the hard way — our first agent had 20+ tools and the LLM constantly picked the wrong one. We refactored to 7 well-designed tools and accuracy jumped from 60% to 92%.

Principles for Good Tool Design

1. Clear, non-overlapping descriptions — If two tools sound similar, the LLM will confuse them. Each tool should have a unique, unambiguous purpose.

```typescript
// Bad: overlapping tools
const tools = [
  { name: 'search_candidates', description: 'Search for candidates' },
  { name: 'find_candidates', description: 'Find candidates in the database' },
];

// Good: distinct, specific tools
const tools = [
  { name: 'search_candidates', description: 'Full-text search across candidate profiles by skills, experience, or keywords. Returns ranked results.' },
  { name: 'get_candidate_by_id', description: 'Retrieve a specific candidate record by their unique ID. Use when you already know who you want.' },
];
```

2. Atomic operations — Each tool should do one thing well. Don't create a "do everything" tool.
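To make the atomicity principle concrete, here is a hypothetical contrast (the tool names and descriptions are illustrative, not Asynq.ai's actual API):

```typescript
interface ToolSpec {
  name: string;
  description: string;
}

// Bad: one tool with a mode switch forces the LLM to learn your internal API
const monolithicTool: ToolSpec = {
  name: 'manage_candidate',
  description: 'Search, fetch, update, or score a candidate depending on the "action" parameter',
};

// Good: each operation is its own tool with an obvious, single purpose
const atomicTools: ToolSpec[] = [
  { name: 'search_candidates', description: 'Full-text search across candidate profiles. Returns ranked results.' },
  { name: 'get_candidate_by_id', description: 'Retrieve one candidate record by its unique ID.' },
  { name: 'update_candidate_stage', description: 'Move a candidate to a new pipeline stage (e.g. "interview").' },
];
```

Mode-switch parameters like `action` are a common smell: the model has to reason about two decisions (which tool, which mode) instead of one.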

3. Rich error messages — When a tool fails, tell the agent *why* so it can recover:

```typescript
async function executeToolSafely(tool: Tool, params: unknown): Promise<ToolResult> {
  try {
    const validated = tool.parameters.parse(params);
    const result = await tool.execute(validated);
    return { success: true, data: result };
  } catch (error) {
    if (error instanceof z.ZodError) {
      return {
        success: false,
        error: `Invalid parameters: ${error.errors.map(e => `${e.path.join('.')}: ${e.message}`).join(', ')}`,
      };
    }
    // `error` is `unknown` in a TypeScript catch clause — narrow before reading .message
    const message = error instanceof Error ? error.message : String(error);
    return {
      success: false,
      error: `Tool execution failed: ${message}. Try a different approach.`,
    };
  }
}
```

Memory: Short-Term and Long-Term

Agents need memory. Without it, they repeat mistakes, forget context, and can't learn from past interactions.

Short-Term Memory (Conversation Context)

This is the simplest form — the message history within a single agent run. The challenge is context window management. At Modelia.ai, our fashion workflow agents can run for 30+ tool calls. Naive message accumulation blows past token limits.

```typescript
async function manageContext(messages: Message[], maxTokens: number): Promise<Message[]> {
  const systemMessage = messages[0]; // Always keep the system prompt
  const recentMessages = messages.slice(-10); // Always keep recent context

  // Summarize older messages if we're approaching the limit
  const estimatedTokens = estimateTokenCount(messages);
  if (estimatedTokens > maxTokens * 0.8) {
    const oldMessages = messages.slice(1, -10);
    const summary = await summarizeMessages(oldMessages); // async summarization, hence the async signature
    return [systemMessage, { role: 'system', content: `Previous context summary: ${summary}` }, ...recentMessages];
  }

  return messages;
}
```

Long-Term Memory (Cross-Session)

For agents that interact with the same users or data repeatedly, we store embeddings of past interactions in a vector database:

```typescript
// Store interaction summary after each agent run
await vectorStore.upsert({
  id: `interaction-${sessionId}`,
  embedding: await embed(interactionSummary),
  metadata: {
    userId,
    timestamp: new Date().toISOString(),
    outcome: result.success ? 'success' : 'failure',
    toolsUsed: result.toolsUsed,
  },
});

// Retrieve relevant past interactions at the start of a new run
const relevantMemories = await vectorStore.query({
  embedding: await embed(currentGoal),
  topK: 5,
  filter: { userId },
});
```
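Retrieved memories still need to be turned into something the LLM can use. A sketch of that glue step, with an assumed memory shape and similarity threshold (field names mirror the metadata stored above but are my assumptions):

```typescript
interface Memory {
  score: number; // similarity score returned by the vector store
  metadata: { timestamp: string; outcome: string; toolsUsed: string[] };
  summary: string;
}

// Build a system-prompt preamble from relevant memories; drop weak matches
// so marginal results don't pollute the context window.
function buildMemoryPreamble(memories: Memory[], minScore = 0.75): string {
  const relevant = memories.filter(m => m.score >= minScore);
  if (relevant.length === 0) return '';
  const lines = relevant.map(
    m => `- [${m.metadata.outcome}] ${m.summary} (tools: ${m.metadata.toolsUsed.join(', ')})`,
  );
  return `Relevant past interactions:\n${lines.join('\n')}`;
}
```

The `minScore` cutoff matters: injecting loosely related memories tends to do more harm than injecting none.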

Guardrails: Keeping Agents Safe

An autonomous agent without guardrails is a liability. Here's how we prevent our agents from going off the rails:

1. Output Validation

Every agent output passes through a validation layer before reaching the user or triggering side effects:

```typescript
const outputSchema = z.object({
  candidateScore: z.number().min(0).max(100),
  recommendation: z.enum(['strong_yes', 'yes', 'maybe', 'no', 'strong_no']),
  reasoning: z.string().min(50).max(2000),
  flaggedConcerns: z.array(z.string()).optional(),
});

function validateAgentOutput(raw: string): ValidatedOutput {
  const parsed = JSON.parse(raw);
  return outputSchema.parse(parsed); // Throws if invalid
}
```

2. Cost and Rate Limiting

At Asynq.ai, an early bug caused an agent to loop indefinitely, racking up API costs. Now every agent has hard limits:

```typescript
const AGENT_LIMITS = {
  maxIterations: 25,
  maxTokensPerRun: 100_000,
  maxToolCallsPerMinute: 30,
  maxCostPerRun: 2.00, // USD
  timeoutMs: 300_000, // 5 minutes
};
```
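A limits object only helps if something enforces it on every iteration. One way to wire that in, sketched with assumed accounting fields (the class and error type are illustrative, not our exact code):

```typescript
class BudgetExceededError extends Error {}

// Tracks cumulative spend for one agent run; call record() after every
// LLM response so the loop aborts before the next expensive step.
class BudgetTracker {
  private tokensUsed = 0;
  private costUsd = 0;

  constructor(
    private readonly maxTokens: number,
    private readonly maxCostUsd: number,
  ) {}

  record(tokens: number, costUsd: number): void {
    this.tokensUsed += tokens;
    this.costUsd += costUsd;
    if (this.tokensUsed > this.maxTokens) {
      throw new BudgetExceededError(`Token budget exceeded: ${this.tokensUsed}/${this.maxTokens}`);
    }
    if (this.costUsd > this.maxCostUsd) {
      throw new BudgetExceededError(`Cost budget exceeded: $${this.costUsd.toFixed(2)}`);
    }
  }
}
```

Throwing a dedicated error type lets the caller distinguish "ran out of budget" from "the agent failed" when deciding whether to retry.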

3. Human-in-the-Loop for High-Stakes Actions

Some actions are too consequential for full autonomy. We use a simple approval pattern:

```typescript
async function executeWithApproval(action: HighStakesAction): Promise<ActionResult> {
  if (action.requiresApproval) {
    const approval = await requestHumanApproval({
      action: action.description,
      context: action.reasoning,
      timeout: 60_000, // 1 minute to respond
    });

    if (!approval.approved) {
      return { status: 'blocked', reason: approval.reason };
    }
  }

  return action.execute();
}
```

Multi-Agent Orchestration

At Modelia.ai, our most complex workflows involve multiple agents collaborating. A fashion shoot workflow might involve:

  • Planner Agent — Decomposes the creative brief into specific tasks
  • Image Generation Agent — Generates AI fashion images with the right model, pose, and outfit
  • Quality Check Agent — Evaluates generated images against brand guidelines
  • Iteration Agent — Takes feedback from the QC agent and refines the generation

We orchestrate these using an event-driven architecture:

```typescript
class AgentOrchestrator {
  private agents: Map<string, Agent> = new Map();
  private eventBus: EventEmitter = new EventEmitter();

  registerAgent(name: string, agent: Agent) {
    this.agents.set(name, agent);
    agent.on('complete', (result) => {
      this.eventBus.emit(`${name}:complete`, result);
    });
  }

  async executeWorkflow(workflow: WorkflowDefinition) {
    for (const step of workflow.steps) {
      const agent = this.agents.get(step.agentName);
      if (!agent) throw new Error(`No agent registered for step: ${step.agentName}`);
      const input = this.resolveInputs(step.inputs, workflow.context);

      const result = await agent.run(input);
      workflow.context[step.outputKey] = result;

      // Check if we need to branch or retry
      if (step.onFailure === 'retry' && !result.success) {
        await this.retryWithBackoff(agent, input, step.maxRetries);
      }
    }
  }
}
```
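The retry-with-backoff helper can be sketched generically as a free function (the signature and doubling schedule are my assumptions, not Modelia.ai's exact implementation):

```typescript
const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

// Retry an async operation with exponential backoff: delays double on each
// failed attempt (base, 2x, 4x, ...). Rethrows the last error if all attempts fail.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxRetries: number,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < maxRetries) {
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

In practice you would also add jitter to the delay so that many agents retrying at once don't hammer a downstream API in lockstep.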

Key Takeaways

  • Start simple — A single agent with 5-7 well-designed tools beats a complex multi-agent system with poor tool design
  • Tools are everything — Invest 80% of your effort in tool quality, descriptions, and error handling
  • Always set hard limits — Max iterations, cost caps, and timeouts are non-negotiable in production
  • Memory makes agents smart — Short-term context management and long-term vector storage dramatically improve agent quality
  • Human-in-the-loop is not a weakness — For high-stakes decisions, approval workflows build trust and prevent disasters
  • Test with adversarial inputs — At Asynq.ai, we learned that candidates sometimes try to manipulate the AI interviewer. Anticipate misuse.
  • Observability is critical — Log every tool call, LLM response, and state transition. When an agent makes a bad decision, you need to understand the full chain of reasoning.
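On that last point, even a minimal structured tracer goes a long way. A sketch (the event shape is an illustrative assumption; in production you would likely ship these events to your logging or tracing backend):

```typescript
interface TraceEvent {
  runId: string;
  step: number;
  type: 'llm_response' | 'tool_call' | 'tool_result' | 'state_transition';
  payload: unknown;
  timestamp: string;
}

// Records every step of a run in order, so a bad decision can be
// reconstructed as a full chain of reasoning after the fact.
class AgentTracer {
  private events: TraceEvent[] = [];
  private step = 0;

  constructor(private readonly runId: string) {}

  log(type: TraceEvent['type'], payload: unknown): void {
    this.events.push({
      runId: this.runId,
      step: this.step++,
      type,
      payload,
      timestamp: new Date().toISOString(),
    });
  }

  export(): TraceEvent[] {
    return [...this.events];
  }
}
```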

Harsh Rastogi

Full Stack Engineer

Full Stack Engineer building production AI systems at Modelia. Previously at Asynq and Bharat Electronics Limited. Published researcher.