Build Your First AI Agent in TypeScript

Everyone talks about “AI agents,” but strip away the marketing and an agent is just a loop: the model thinks, optionally calls a tool, reads the result, and repeats until it’s done. That’s it. In this post we’ll build a working agent in plain TypeScript — no LangChain, no abstractions you can’t see through.

By the end you’ll understand the three moving parts every agent framework is really wrapping: the tool schema, the agent loop, and the termination condition.

The mental model

A non-agentic LLM call is one round trip: prompt in, text out. An agent turns that into a conversation the model has with itself and your tools:

1. Send the model the user’s goal plus a list of tools it may use.
2. The model replies, either with a final answer or with a request to call a tool.
3. If it called a tool, you run it and feed the result back.
4. Go to step 2.

The loop ends when the model stops asking for tools and returns prose.
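The four steps above can be sketched without any API at all. The version below is a toy under stated assumptions: `fakeModel`, `toyTools`, `runToyAgent`, and the `Turn` and `ModelReply` types are all hypothetical stand-ins invented for illustration, and the "model" is a scripted function that asks for one tool call and then answers. A real model call would be asynchronous and far less predictable, but the control flow has the same shape.

```typescript
// Hypothetical shapes for this sketch only; a real SDK defines its own types.
type ModelReply =
  | { kind: "answer"; text: string }
  | { kind: "tool_call"; tool: string; input: string };

type Turn = { from: "user" | "model" | "tool"; text: string };

// Stand-in for the LLM: asks for one tool call, then answers with its result.
function fakeModel(history: Turn[]): ModelReply {
  const lastToolTurn = [...history].reverse().find((t) => t.from === "tool");
  if (lastToolTurn === undefined) {
    return { kind: "tool_call", tool: "shout", input: "hello" };
  }
  return { kind: "answer", text: `The tool said ${lastToolTurn.text}` };
}

// Stand-in tool registry with a single toy tool.
const toyTools: Record<string, (input: string) => string> = {
  shout: (input) => input.toUpperCase(),
};

function runToyAgent(goal: string, maxSteps = 5): string {
  // Step 1: start the conversation with the user's goal.
  const history: Turn[] = [{ from: "user", text: goal }];

  for (let step = 0; step < maxSteps; step++) {
    // Step 2: the model replies with an answer or a tool request.
    const reply = fakeModel(history);
    if (reply.kind === "answer") {
      return reply.text; // The loop ends when the model returns prose.
    }
    history.push({ from: "model", text: `call ${reply.tool}(${reply.input})` });

    // Step 3: run the tool and feed the result back.
    const tool = toyTools[reply.tool];
    const result = tool ? tool(reply.input) : `unknown tool ${reply.tool}`;
    history.push({ from: "tool", text: result });
    // Step 4: go around again.
  }
  throw new Error("step budget exceeded");
}
```

Calling `runToyAgent("say hello")` goes around the loop twice: once to request the tool, once to answer with its output.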

Defining a tool

Tools are just functions plus a JSON schema describing how to call them. Here’s a trivially simple one — a calculator — so we can focus on the wiring rather than the tool itself.

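// Typed against the SDK's Tool shape (the Anthropic import appears with the
// agent loop). A plain `as const` would make the array readonly, which the
// SDK's mutable `tools` parameter rejects at compile time.
const tools: Anthropic.Tool[] = [
  {
    name: "calculate",
    description: "Evaluate a basic arithmetic expression.",
    input_schema: {
      type: "object",
      properties: {
        expression: { type: "string", description: "e.g. '42 * (7 + 1)'" },
      },
      required: ["expression"],
    },
  },
];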

function runTool(name: string, input: { expression: string }): string {
  if (name === "calculate") {
    // In real code, never eval untrusted input. Use a math parser.
    return String(Function(`"use strict";return (${input.expression})`)());
  }
  throw new Error(`Unknown tool: ${name}`);
}

The description fields are not decoration — they’re the model’s only documentation. Treat them like API docs you’re writing for a junior engineer.
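The calculator above evaluates its input with `Function`, and its own comment recommends a math parser instead. As one possible shape for that, here is a minimal sketch of a recursive-descent evaluator. The name `evaluateArithmetic` is hypothetical, and the grammar is an assumption: it accepts only non-negative decimal literals, `+ - * /`, unary minus, and parentheses, and rejects everything else.

```typescript
// Hypothetical replacement for the Function()-based evaluator: a small
// recursive-descent parser that accepts only numbers, + - * /, and parentheses.
function evaluateArithmetic(expression: string): number {
  // Numbers, operators, and parentheses become tokens; any other non-space
  // character becomes a single-character token that the parser will reject.
  const tokens: string[] = expression.match(/\d+(?:\.\d+)?|[-+*\/()]|\S/g) ?? [];
  let pos = 0;

  const peek = (): string | undefined => tokens[pos];
  const next = (): string | undefined => tokens[pos++];

  // expr := term (("+" | "-") term)*
  function parseExpr(): number {
    let value = parseTerm();
    while (peek() === "+" || peek() === "-") {
      const op = next();
      const right = parseTerm();
      value = op === "+" ? value + right : value - right;
    }
    return value;
  }

  // term := factor (("*" | "/") factor)*
  function parseTerm(): number {
    let value = parseFactor();
    while (peek() === "*" || peek() === "/") {
      const op = next();
      const right = parseFactor();
      value = op === "*" ? value * right : value / right;
    }
    return value;
  }

  // factor := number | "-" factor | "(" expr ")"
  function parseFactor(): number {
    const token = next();
    if (token === undefined) throw new Error("Unexpected end of expression");
    if (token === "-") return -parseFactor();
    if (token === "(") {
      const value = parseExpr();
      if (next() !== ")") throw new Error("Expected ')'");
      return value;
    }
    if (/^\d/.test(token)) return Number(token);
    throw new Error(`Unexpected token: ${token}`);
  }

  const result = parseExpr();
  if (pos < tokens.length) throw new Error(`Unexpected token: ${tokens[pos]}`);
  return result;
}
```

Inside `runTool`, the `Function(...)` line could then become `return String(evaluateArithmetic(input.expression));`. Anything that isn't arithmetic, such as an identifier or a function call, throws instead of executing.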

The agent loop

Now the heart of it. We keep a running messages array and loop until the model returns an answer with no tool calls.

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

async function runAgent(goal: string): Promise<string> {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: goal },
  ];

  // Bound the loop so a confused model can't spin forever.
  for (let step = 0; step < 10; step++) {
    const response = await client.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 1024,
      tools,
      messages,
    });

    messages.push({ role: "assistant", content: response.content });

    // No tool requested → the model is done.
    // The explicit type guard lets b.text type-check on any TypeScript version.
    if (response.stop_reason !== "tool_use") {
      return response.content
        .filter((b): b is Anthropic.TextBlock => b.type === "text")
        .map((b) => b.text)
        .join("");
    }

    // Run every tool the model asked for and return the results.
    // Each result carries the id of the request it answers.
    const toolResults = response.content
      .filter((b): b is Anthropic.ToolUseBlock => b.type === "tool_use")
      .map((b) => ({
        type: "tool_result" as const,
        tool_use_id: b.id,
        content: runTool(b.name, b.input as { expression: string }),
      }));

    messages.push({ role: "user", content: toolResults });
  }

  throw new Error("Agent exceeded step budget");
}

Three things to notice:

1. stop_reason is the termination signal. When it isn’t "tool_use", the model is no longer asking to act, and that’s your exit. Usually the value is "end_turn"; "max_tokens" means the reply was cut off by the token limit, so a production loop should handle that case separately rather than treat it as a finished answer.
2. We append the assistant turn before running tools. The conversation must stay coherent: the model needs to see its own tool request alongside the result.
3. The step budget is non-negotiable. Without it, a single bad reasoning chain can rack up real money. Always cap the loop.

What this cost me — Josh: My first agent didn’t have that bounded loop. A tool kept returning an error the model couldn’t recover from, so it just retried the same call over and over — 30-odd iterations before I noticed and killed the process. That single run cost more than the rest of the day’s development combined. Now the for loop bound and a tool_result that actually explains the failure go in before I write anything else.

Why the loop matters more than the framework

Once you’ve written this, every agent library suddenly looks familiar. LangChain “agents,” the OpenAI Assistants API, CrewAI — they’re all variations on this loop with extra ergonomics bolted on: memory, retries, parallel tool calls, streaming. Useful, but not magic.

When you debug a misbehaving agent, you’ll come back to these same questions:

Did the tool description tell the model what it needed?
Did the tool result give the model something it could act on?
Is the loop terminating for the right reason?
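One low-tech way to answer those questions is to log a line per step. The sketch below is an assumption about how that might look, not part of any SDK: `describeStep`, `StepBlock`, and `StepResponse` are hypothetical names, and the types are simplified local copies of the response fields the loop in this post reads.

```typescript
// Hypothetical, simplified view of one model response; only the fields the
// trace needs. The field names follow the loop in this post, the types are local.
type StepBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

type StepResponse = { stop_reason: string | null; content: StepBlock[] };

// Build a one-line summary of a step: why it stopped and which tools it requested.
function describeStep(step: number, response: StepResponse): string {
  const calls = response.content
    .filter(
      (b): b is Extract<StepBlock, { type: "tool_use" }> => b.type === "tool_use",
    )
    .map((b) => `${b.name}(${JSON.stringify(b.input)})`);
  const requested = calls.length > 0 ? calls.join(", ") : "none";
  return `step ${step}: stop_reason=${response.stop_reason} tools=${requested}`;
}
```

Printing `describeStep(step, response)` at the top of each iteration makes a stuck agent obvious: the same tool call with the same input repeats line after line.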

Where to go next

Add a second tool and watch the model choose between them. Then try giving it a tool that can fail, and handle the error by feeding the failure back as a tool_result — a robust agent recovers from tool errors rather than crashing.
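Feeding a failure back might look like the sketch below. It assumes a hypothetical helper, `toToolResult`, and a local `ToolResultBlock` type so the example runs without the SDK installed. The field names, including `is_error`, follow the tool_result block of the Anthropic Messages API, but check the type against the SDK version you use.

```typescript
// Local stand-in for the SDK's tool_result block shape, so the sketch runs
// without installing anything.
type ToolResultBlock = {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
};

// Hypothetical wrapper: runs a tool and converts any thrown error into a
// tool_result the model can read, instead of letting it crash the loop.
function toToolResult(toolUseId: string, run: () => string): ToolResultBlock {
  try {
    return { type: "tool_result", tool_use_id: toolUseId, content: run() };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return {
      type: "tool_result",
      tool_use_id: toolUseId,
      // Say what went wrong and what to try next; a bare "error" invites a blind retry.
      content: `Tool failed: ${reason}. Fix the input or answer without this tool.`,
      is_error: true,
    };
  }
}
```

In the agent loop, the body of the `.map` callback could become `toToolResult(b.id, () => runTool(b.name, b.input as { expression: string }))`. An unknown tool or a malformed expression then reaches the model as a readable failure, and the step budget still bounds how many times it can retry.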

If you want to go deeper on making tool outputs reliable, the same discipline applies to prompts: see Prompt Engineering Patterns That Survive Production. And when your agent needs to reason over your own documents, you’ll want retrieval.
