Key Takeaway
By the end of this blueprint you will have a production-ready multi-agent system built on LangGraph where a Supervisor agent routes tasks to specialized Research, Code, and Review agents — each with their own tools, memory, and failure boundaries — all observable through LangSmith and deployable behind a FastAPI service.
Prerequisites
- Python 3.11+ with a working knowledge of async/await and type hints
- Familiarity with LLM APIs (Anthropic or OpenAI) and the concept of tool/function calling
- Basic understanding of directed graphs (nodes, edges, cycles)
- A LangSmith account (free tier works) for tracing and evaluation
- Docker installed locally for the PostgreSQL checkpointer
- An Anthropic API key or OpenAI API key for LLM calls
Why Multi-Agent Architecture?
Single-agent systems hit a wall once the task surface area grows beyond what a single system prompt can handle cleanly. A monolithic agent that researches, writes code, reviews that code, runs tests, and summarizes results ends up with a sprawling prompt that confuses the LLM, burns excessive tokens re-reading irrelevant instructions, and makes failures hard to isolate. When a code-generation step fails, you have no clean way to retry just that step — you restart the entire chain.
Multi-agent architectures solve this by decomposing the workflow into specialized agents, each with a focused prompt and its own tool set. A Supervisor agent acts as the orchestrator: it reads the user request, decides which specialist to invoke next, and inspects each specialist's output before routing to the next step. This gives you three immediate wins: narrower prompts that produce better outputs, isolated failure domains with targeted retries, and the ability to swap or upgrade individual agents without touching the rest of the graph.
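To make the Supervisor's role concrete, here is a minimal sketch of a supervisor node. This is a hypothetical helper, not the blueprint's final implementation: the LLM call is injected as a plain callable (`decide`) so the routing logic itself can be exercised without network access; a real version would wrap a structured-output LLM call there.

```python
"""Sketch of a supervisor node with the LLM decision injected as a callable."""
from typing import Callable

# The routes the supervisor is allowed to choose (matches the state schema below)
VALID_ROUTES = {"research", "code", "review", "FINISH"}


def make_supervisor_node(decide: Callable[[dict], str]):
    """Build a graph node that asks `decide` (an LLM wrapper) where to go next."""

    def supervisor_node(state: dict) -> dict:
        choice = decide(state)
        # Fall back to FINISH on an unparseable answer rather than crashing the graph
        if choice not in VALID_ROUTES:
            choice = "FINISH"
        # Return a partial state update: routing decision plus loop counter
        return {
            "next_agent": choice,
            "iteration_count": state.get("iteration_count", 0) + 1,
        }

    return supervisor_node
```

Injecting the decision function also makes the node trivial to unit-test: pass a lambda that returns a fixed route and assert on the state update.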
Multi-agent is not always the right call. If your task is linear and well-scoped (e.g., summarize a document, classify a ticket), a single agent or a simple chain is simpler and cheaper. Reach for multi-agent when you have branching logic, multiple tool domains, or steps that benefit from different system prompts.
Consider multi-agent when you see these signals: your single prompt is over 2,000 tokens of instructions; you need different tools for different phases of the workflow; you want human review at specific checkpoints; or you need to parallelize independent sub-tasks for latency reduction. This blueprint addresses all four scenarios.
Architecture Overview
The system is built on a LangGraph StateGraph where each node is an agent or a utility function, and edges define the routing logic between them. At the center sits the Supervisor — a lightweight LLM call whose sole job is to read the current state and decide which agent should act next (or whether the task is complete). Each specialist agent (Research, Code, Review) runs its own tool-calling loop, writes results back to shared state, and returns control to the Supervisor. A PostgreSQL-backed checkpointer persists graph state at every step so long-running workflows can survive process restarts.
Core Concepts
Agent State Management
LangGraph models state as a Python TypedDict that flows through every node in the graph. Each node receives the current state, performs its work, and returns a partial update that gets merged back. The key design decision is choosing your state schema carefully — it is the contract between all agents. Fields that use the Annotated type with a reducer function (like operator.add for lists) allow multiple agents to append to the same field without overwriting each other's results. This is how you accumulate messages, research findings, and code artifacts across the graph.
"""Agent state schema — the shared contract between all agents."""
from __future__ import annotations
import operator
from typing import Annotated, Literal, TypedDict
from langchain_core.messages import BaseMessage
class AgentState(TypedDict):
"""Shared state that flows through the entire graph.
Fields using Annotated[..., operator.add] are append-only —
multiple agents can write to them without overwriting each other.
"""
# The original user request (immutable after entry)
task: str
# Conversation history — appended to by every agent
messages: Annotated[list[BaseMessage], operator.add]
# Which agent should act next (set by supervisor)
next_agent: Literal["research", "code", "review", "FINISH"]
# Accumulated research findings
research_notes: Annotated[list[str], operator.add]
# Generated code artifacts
code_artifacts: Annotated[list[dict], operator.add]
# Review feedback items
review_comments: Annotated[list[str], operator.add]
# Iteration counter to prevent infinite loops
iteration_count: int
# Final synthesized response
final_response: strUse `Annotated[list[...], operator.add]` for any field that multiple agents write to. This prevents the classic bug where Agent B overwrites Agent A's output. The reducer merges updates by concatenation rather than replacement.
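The merge semantics are worth internalizing. The following is not LangGraph's actual implementation, just a plain-Python illustration of the behavior: reducer-annotated fields are combined with their reducer, while unannotated fields are simply replaced by the newest value.

```python
"""Plain-Python illustration of how reducer fields merge (not LangGraph internals)."""
import operator

# Fields from the AgentState schema that carry an operator.add reducer
REDUCERS = {
    "messages": operator.add,
    "research_notes": operator.add,
    "code_artifacts": operator.add,
    "review_comments": operator.add,
}


def merge(state: dict, update: dict) -> dict:
    """Apply a node's partial update to the current state."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        if reducer and key in merged:
            merged[key] = reducer(merged[key], value)  # concatenate, don't replace
        else:
            merged[key] = value  # plain fields are overwritten
    return merged


state = {"research_notes": ["from agent A"], "iteration_count": 1}
state = merge(state, {"research_notes": ["from agent B"], "iteration_count": 2})
# research_notes now holds both entries; iteration_count was replaced
```

This is exactly why `iteration_count` can be a plain `int` (each write should win) while `research_notes` needs the reducer (each write should accumulate).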
Message Passing Patterns
Agents communicate through the shared state, not by calling each other directly. When the Research Agent finishes, it appends its findings to research_notes and adds an AIMessage to messages summarizing what it found. The Supervisor reads these updates on its next invocation and decides whether to send the task to the Code Agent, ask for more research, or finalize the response. This decoupled message-passing pattern means you can add, remove, or reorder agents without changing any agent's internal logic — they only need to read from and write to agreed-upon state fields.
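A specialist node following this pattern can be sketched as below. The helper is hypothetical, and plain strings stand in for `AIMessage` objects to keep the sketch dependency-free; the point is that the node reads only agreed-upon fields and returns only the fields it owns.

```python
"""Sketch of a specialist node that communicates purely through shared state."""


def research_node(state: dict) -> dict:
    # A real implementation would run a tool-calling loop (e.g. web search) here.
    findings = [f"Searched for: {state['task']}"]
    summary = f"Research agent: recorded {len(findings)} finding(s)"

    # Return ONLY the fields this agent writes; the reducers on
    # research_notes and messages append these to the shared state.
    return {
        "research_notes": findings,
        "messages": [summary],
    }
```

Because the node never calls another agent directly, swapping the downstream consumer (Code Agent today, something else tomorrow) requires no change here.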
Conditional Edges and Routing
LangGraph's add_conditional_edges method is how you implement the Supervisor's routing decisions. After the Supervisor node runs, a routing function inspects the state's next_agent field and returns the name of the node to invoke next. If the Supervisor sets next_agent to "FINISH", the routing function returns the special END constant which terminates the graph. This gives you deterministic, inspectable routing — unlike chain-of-thought routing where the LLM implicitly decides what to do next inside a single prompt.
"""Routing logic for the supervisor's conditional edges."""
from langgraph.graph import END
from state import AgentState
def route_supervisor(state: AgentState) -> str:
"""Read the supervisor's routing decision from state.
Returns the node name to invoke next, or END to terminate.
"""
next_agent = state["next_agent"]
if next_agent == "FINISH":
return END
# Guard against infinite loops
if state.get("iteration_count", 0) >= 10:
return END
return next_agentStep-by-Step Implementation
Step 1: Project Setup
Start by creating a clean project with pinned dependencies. We will use langgraph for the orchestration graph, langchain-anthropic for Claude LLM calls (you can swap in langchain-openai if you prefer GPT-4), langsmith for tracing, and pydantic for structured tool outputs. The psycopg driver is needed for the PostgreSQL checkpointer that makes your graph resumable.
# Create project directory
mkdir multi-agent-system && cd multi-agent-system
python -m venv .venv && source .venv/bin/activate
# Install core dependencies
pip install \
  "langgraph>=0.2.60" \
  "langchain-core>=0.3.30" \
  "langchain-anthropic>=0.3.12" \
  "langsmith>=0.2.10" \
  "pydantic>=2.10" \
  "psycopg[binary]>=3.2" \
  "python-dotenv>=1.0" \
  "tavily-python>=0.5"
# Create .env file
cat <<'DOTENV' > .env
ANTHROPIC_API_KEY=sk-ant-...
LANGSMITH_API_KEY=lsv2_...
LANGSMITH_PROJECT=multi-agent-blueprint
LANGSMITH_TRACING=true
TAVILY_API_KEY=tvly-...
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/agents
DOTENV

# Start PostgreSQL for the checkpointer (Docker)
docker run -d \
--name agent-postgres \
-e POSTGRES_PASSWORD=postgres \
-e POSTGRES_DB=agents \
-p 5432:5432 \
postgres:16-alpine

Pin your langchain ecosystem packages to compatible minor versions. The LangChain ecosystem moves fast — a mismatch between langchain-core and langgraph versions is the most common source of import errors in new projects.
Step 2: Define the Agent State Schema