L3: Multi-Agent Orchestration

Multi-Agent Orchestration Using Hub-and-Spoke Architecture

As tasks become more complex, a single agent quickly hits a performance ceiling. If you give one agent 20 different tools and a massive set of conflicting system instructions (e.g., "be highly creative" but also "strictly adhere to compliance formatting"), the model experiences context degradation, routing failures, and hallucinations. The architectural solution is Multi-Agent Orchestration.

1. What is Hub-and-Spoke Architecture?

Hub-and-spoke (also known as a centralized topology) is the most reliable multi-agent design pattern for enterprise applications. It avoids the chaos of peer-to-peer agent communication by enforcing a strict hierarchy.

The Coordinator (The Hub): This is the primary agent. It is the only entity that interacts directly with the user. Its system prompt is focused entirely on task delegation, planning, and synthesis. It does not do the heavy lifting.
The Subagents (The Spokes): These are highly specialized worker agents. They are completely isolated from the user. Each subagent has a very narrow system prompt (e.g., "You are an SQL expert" or "You are a copy editor") and access only to the specific tools it needs.

2. The Architectural Workflow (Data Flow)

In a hub-and-spoke model, subagents never talk directly to each other. All data must flow back through the center.

Ingestion: The user asks the Coordinator a complex question (e.g., "Analyze our Q3 sales data and draft an executive summary").
Delegation (Handoff 1): The Coordinator realizes it needs data. It triggers a tool called ask_data_analyst_agent.
Subagent Execution: Your application code catches this tool call, spins up a completely new Claude API session (the Data Analyst subagent), passes it the database tools, and lets it run its own ReAct loop until it extracts the data.
Return to Hub: The Data Analyst subagent returns a clean data summary to the Coordinator. The Coordinator's context window is updated with this summary.
Delegation (Handoff 2): The Coordinator now triggers ask_writer_agent, passing it the data summary and asking for a drafted email.
Final Synthesis: The Writer returns the draft, and the Coordinator presents the final output to the user.

3. Implementing Subagents as Tools

From an API perspective, how does one agent "call" another?

In the Claude ecosystem, a subagent is simply a tool to the Coordinator. When you define the JSON schema for the Coordinator's tools, instead of defining a tool that runs a Python script, you define a tool that takes a string (task_description) and triggers a new LLM execution block. The Coordinator does not need to know that the "tool" it is using is actually another LLM.

4. Advantages of Hub-and-Spoke

Architects default to this pattern for three primary reasons:

Context Optimization (Token Efficiency): The user's massive conversation history stays strictly with the Coordinator. When the Coordinator calls a subagent, it only passes the specific instructions needed for that micro-task. This prevents the subagents from getting confused by irrelevant chat history and drastically reduces token costs.
Predictable Control Flow: Because all subagents return their results to the Hub, you eliminate infinite loops where two agents argue with each other. It ensures a deterministic, step-by-step progression.
Centralized Guardrails: You only need to enforce strict safety, PII filtering, and tone compliance on the Coordinator, acting as the final gatekeeper before outputting to the user.

5. Architectural Drawbacks

While highly reliable, this pattern introduces a Single Point of Failure. The Coordinator becomes a major bottleneck. If the Coordinator misunderstands the user's initial prompt, it will route the request to the wrong subagent, guaranteeing a failed outcome regardless of how smart the subagents are. Therefore, the Coordinator requires the most robust prompt engineering and the most capable model (typically Claude 3.5 Sonnet or Claude 3 Opus).