Agent Design Patterns: From Chain-of-Thought to Orchestrator-Workers
The architectural blueprints that determine how AI agents think, plan, and collaborate
When Anthropic published their Building Effective Agents guide in late 2024, one observation stood out above the rest: "the most successful implementations weren't using complex frameworks or specialized libraries — they were building with simple, composable patterns." That finding came from working directly with dozens of teams deploying agents in production across industries.
The insight runs counter to the instinct most engineering teams have when they first encounter agentic AI. The temptation is to reach for the most sophisticated architecture immediately — multi-agent orchestration, complex state machines, recursive planning loops. But sophistication without purpose creates systems that are harder to debug, more expensive to run, and more likely to fail in ways nobody anticipated.
The reality is a spectrum. At one end sits the augmented Large Language Model (LLM) — a single model enhanced with retrieval, tools, and memory, but still operating within a single inference call. At the other end sits a fully autonomous multi-agent system where specialized agents coordinate, negotiate, and self-correct across complex workflows. Between those poles lie a set of well-defined design patterns, each with specific strengths, costs, and failure modes.
The key principle is worth stating explicitly: start with the simplest solution that could work, and add complexity only when you have evidence that simpler approaches fall short. A well-implemented prompt chain will outperform a poorly implemented multi-agent system every time — and it will cost a fraction as much to build, run, and maintain.
This article maps the full pattern landscape across three categories: reasoning patterns that govern how individual agents think, workflow patterns that orchestrate predetermined processing pipelines, and multi-agent orchestration patterns that coordinate autonomous agent collaboration. Understanding these patterns is the foundation for every architectural decision you will make when choosing a framework or designing an agent system.
"There's an important distinction between workflows and agents." Workflows are systems where LLMs and tools are orchestrated through predefined code paths. Agents are systems where LLMs dynamically direct their own processes and tool usage. Both can be built from the same composable patterns. — Anthropic, Building Effective Agents (2024)
Reasoning patterns operate at the cognitive level of a single agent. They determine how an AI agent processes information, decomposes problems, and arrives at decisions. These are the building blocks that everything else is constructed from.
Chain-of-Thought (CoT) prompting asks the model to lay out intermediate reasoning steps before committing to an answer (Wei et al., 2022). It is the foundation for all agentic reasoning: fast, inexpensive, and remarkably effective for problems with a clear solution path. The limitation is structural: linear reasoning fails when the problem requires exploring multiple branches or backtracking from dead ends. For those scenarios, you need Tree-of-Thought (ToT).
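To make the pattern concrete, here is a minimal sketch. The `llm()` helper is a hypothetical stand-in for whatever chat-completion client you use, and the `ANSWER:` delimiter is a convention invented for this example, not a standard:

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for your chat-completion client."""
    raise NotImplementedError

COT_TEMPLATE = (
    "Solve the problem below. Think step by step, then give the final "
    "answer on its own line, prefixed with 'ANSWER:'.\n\nProblem: {task}"
)

def chain_of_thought(task: str) -> str:
    response = llm(COT_TEMPLATE.format(task=task))
    # The intermediate steps are scaffolding; keep only the final answer line.
    for line in reversed(response.splitlines()):
        if line.startswith("ANSWER:"):
            return line.removeprefix("ANSWER:").strip()
    return response.strip()  # fall back to the raw response
```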
ToT trades speed and cost for thoroughness. The approach introduces "significantly increased inference complexity" — each branch requires separate LLM calls, and the evaluation step adds another layer. Use ToT when the cost of a wrong answer exceeds the cost of exploring multiple paths: strategic planning, complex code generation, or any domain where backtracking from an early wrong assumption would be catastrophic.
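A minimal beam-search-style sketch of ToT follows. The `propose`/`score` prompts, the 0-10 rating scheme, and the `llm()` stub are illustrative assumptions, not the paper's implementation:

```python
import re

def llm(prompt: str) -> str:
    """Hypothetical stand-in for your chat-completion client."""
    raise NotImplementedError

def propose(task: str, partial: str, k: int = 3) -> list[str]:
    # One call per candidate keeps the sketch simple; batching is possible.
    return [llm(f"Task: {task}\nReasoning so far: {partial}\n"
                "Propose the next reasoning step.") for _ in range(k)]

def score(task: str, partial: str) -> float:
    # Ask the model to rate a partial solution path from 0 to 10.
    reply = llm(f"Task: {task}\nPartial reasoning: {partial}\n"
                "Rate how promising this path is, 0-10. Reply with a number.")
    match = re.search(r"\d+(\.\d+)?", reply)
    return float(match.group()) if match else 0.0

def tree_of_thought(task: str, depth: int = 3, beam: int = 2) -> str:
    frontier = [""]  # each entry is an accumulated reasoning path
    for _ in range(depth):
        candidates = [p + "\n" + s for p in frontier for s in propose(task, p)]
        # Evaluate every branch and keep only the `beam` strongest paths.
        candidates.sort(key=lambda p: score(task, p), reverse=True)
        frontier = candidates[:beam]
    return llm(f"Task: {task}\nReasoning: {frontier[0]}\nGive the final answer.")
```

The cost structure the prose warns about is visible here: each level of the tree makes roughly depth × beam × k proposal calls, plus one scoring call per candidate.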
ReAct (Reasoning and Acting; Yao et al., 2022) grounds reasoning in real-world data by letting the model take actions (search, compute, API calls) and observe results before deciding the next step. This is the pattern powering most production agentic AI loops. The key risks are infinite loops when the agent gets stuck in unproductive cycles, and myopic behavior where the agent optimizes for the immediate next step rather than the overall goal. Both require explicit guardrails: iteration limits, progress checks, and fallback strategies.
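The loop below sketches the idea with the guardrails the paragraph calls for. The tool names, the `Action:`/`Final Answer:` transcript format, and the `llm()` stub are assumptions chosen for illustration:

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for your chat-completion client."""
    raise NotImplementedError

# Hypothetical tool registry; a real system would validate arguments.
TOOLS = {
    "search": lambda q: f"(search results for {q!r})",
    "calculate": lambda e: str(eval(e)),  # demo only: never eval untrusted input
}

MAX_STEPS = 8  # iteration limit: the guardrail against infinite loops

def react(task: str) -> str:
    transcript = f"Task: {task}\n"
    for _ in range(MAX_STEPS):
        step = llm(transcript + "Write a Thought, then either "
                   "'Action: <tool>: <input>' or 'Final Answer: <answer>'.")
        transcript += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        if "Action:" in step:
            spec = step.split("Action:", 1)[1].strip()
            tool, _, arg = spec.partition(":")
            fn = TOOLS.get(tool.strip(), lambda _a: "unknown tool")
            transcript += f"Observation: {fn(arg.strip())}\n"  # ground the next step
    return "Stopped: hit MAX_STEPS without a final answer."  # fallback strategy
```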
Reflection is essential for robust autonomous operation. Without it, agents propagate early errors through their entire reasoning chain. The pattern adds latency and cost (at minimum doubling LLM calls), but the quality improvement is substantial for tasks where precision matters — code generation, research synthesis, compliance analysis. The Reflexion framework demonstrated that iterative self-reflection can meaningfully improve agent task completion rates, with the Reflexion paper reporting improvements from 80% to 91% on HumanEval (Shinn et al., 2023).
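A minimal reflection loop might look like the sketch below, with a hypothetical `llm()` stub and a `LOOKS GOOD` stop token chosen for illustration. Note that each round adds two calls, consistent with the cost caveat above:

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for your chat-completion client."""
    raise NotImplementedError

def reflect_and_refine(task: str, max_rounds: int = 2) -> str:
    draft = llm(f"Complete the task:\n{task}")
    for _ in range(max_rounds):
        critique = llm(f"Task: {task}\nDraft: {draft}\n"
                       "List concrete problems, or reply 'LOOKS GOOD'.")
        if "LOOKS GOOD" in critique:
            break  # the self-critique found nothing to fix
        draft = llm(f"Task: {task}\nDraft: {draft}\nCritique: {critique}\n"
                    "Rewrite the draft, addressing every point.")
    return draft
```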
Anthropic's Building Effective Agents guide identifies five workflow patterns where LLM calls are orchestrated through predetermined code paths. These are not autonomous agents — they are structured pipelines that use LLMs as processing stages. The distinction matters because workflow patterns are more predictable, easier to test, and simpler to debug than fully autonomous agents.
Prompt chaining: sequential LLM calls where each step's output becomes the next step's input. Gate checks between steps validate intermediate results before proceeding. Best for fixed subtask pipelines: generate draft, review for compliance, format output. The simplest workflow pattern, and often the right one.
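A sketch of the pipeline, assuming a hypothetical `llm()` stub, toy prompt templates, and a trivial gate check:

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for your chat-completion client."""
    raise NotImplementedError

def gate_nonempty(text: str) -> bool:
    # Gates can be regex checks, schema validation, or a cheap LLM judge.
    return bool(text.strip())

# (prompt template, gate check) pairs; {x} receives the previous step's output.
CHAIN = [
    ("Write a first-draft announcement for: {x}", gate_nonempty),
    ("Review this draft for compliance issues and fix them:\n{x}", gate_nonempty),
    ("Format the following as a customer-facing email:\n{x}", gate_nonempty),
]

def run_chain(initial_input: str) -> str:
    x = initial_input
    for template, gate in CHAIN:
        x = llm(template.format(x=x))
        if not gate(x):  # stop before a bad intermediate result propagates
            raise ValueError(f"gate check failed after step: {template!r}")
    return x
```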
Routing: classify input, then direct it to a specialized handler. A front-end triage mechanism that separates concerns. Customer service routing (billing versus technical versus account), model selection based on query complexity, or directing requests to domain-specific prompt templates. The router itself is typically a lightweight LLM call or classifier.
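A sketch of the triage mechanism, with hypothetical handler prompts and the usual `llm()` stub:

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for your chat-completion client."""
    raise NotImplementedError

# Hypothetical domain handlers: one specialized prompt per category.
HANDLERS = {
    "billing":   lambda q: llm(f"You are a billing specialist. Answer: {q}"),
    "technical": lambda q: llm(f"You are a support engineer. Answer: {q}"),
    "account":   lambda q: llm(f"You are an account manager. Answer: {q}"),
}

def route(query: str) -> str:
    label = llm("Classify this support query as exactly one of "
                f"billing, technical, or account:\n{query}").strip().lower()
    # Fall back to a default handler if the classifier goes off-script.
    handler = HANDLERS.get(label, HANDLERS["technical"])
    return handler(query)
```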
Parallelization: run subtasks simultaneously, then aggregate results. Two variants: sectioning splits a task into independent pieces (write intro, body, and conclusion in parallel), and voting runs identical prompts multiple times to get diverse responses for ensemble reasoning. Used for guardrail checks, parallel code reviews, and multi-perspective analysis.
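Both variants in one sketch, using threads for concurrency and the hypothetical `llm()` stub (voting only yields diverse answers when sampling temperature is above zero):

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def llm(prompt: str) -> str:
    """Hypothetical stand-in for your chat-completion client."""
    raise NotImplementedError

def sectioning(topic: str) -> str:
    sections = ["introduction", "body", "conclusion"]
    with ThreadPoolExecutor() as pool:  # independent pieces run concurrently
        parts = list(pool.map(
            lambda s: llm(f"Write the {s} of an article about {topic}."),
            sections))
    return "\n\n".join(parts)

def voting(question: str, n: int = 5) -> str:
    with ThreadPoolExecutor() as pool:  # same prompt, n sampled answers
        answers = list(pool.map(lambda _: llm(question).strip(), range(n)))
    return Counter(answers).most_common(1)[0][0]  # majority answer wins
```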
Orchestrator-workers: a central LLM dynamically decomposes tasks and delegates to worker LLMs. Unlike prompt chaining, the orchestrator decides at runtime how many workers to spawn and what subtasks to create. Effective for coding products where the orchestrator analyzes the codebase and assigns file-level changes to workers. This pattern mirrors human team structure: a lead breaks down the problem, specialists execute, and the lead integrates results.
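A sketch of the runtime decomposition, assuming the orchestrator can be coaxed into returning a bare JSON array (a fragile assumption; production code would validate and retry):

```python
import json
from concurrent.futures import ThreadPoolExecutor

def llm(prompt: str) -> str:
    """Hypothetical stand-in for your chat-completion client."""
    raise NotImplementedError

def orchestrate(task: str) -> str:
    # The orchestrator decides the decomposition at runtime.
    plan = llm("Break this task into independent subtasks. Reply with a "
               f"JSON array of strings, nothing else.\nTask: {task}")
    subtasks = json.loads(plan)  # production code would validate this
    with ThreadPoolExecutor() as pool:  # workers execute in parallel
        results = list(pool.map(
            lambda s: llm(f"Overall goal: {task}\nYour subtask: {s}\nDo it."),
            subtasks))
    combined = "\n\n".join(f"Subtask: {s}\nResult: {r}"
                           for s, r in zip(subtasks, results))
    # The lead integrates the specialists' output.
    return llm(f"Goal: {task}\nWorker results:\n{combined}\n"
               "Integrate these into a single coherent deliverable.")
```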
Evaluator-optimizer: one LLM generates, another evaluates and provides iterative feedback. The generator refines its output based on the evaluator's critique until quality criteria are met. Particularly effective for literary translation (where nuance matters), multi-round search refinement, and any task with clear evaluation criteria but subjective output quality. This is the quality improvement loop — the Reflection pattern applied at the workflow level.
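A sketch of the loop, with an invented 0-10 scoring convention and the hypothetical `llm()` stub:

```python
import re

def llm(prompt: str) -> str:
    """Hypothetical stand-in for your chat-completion client."""
    raise NotImplementedError

def evaluator_optimizer(task: str, threshold: int = 8, max_rounds: int = 3) -> str:
    output = llm(f"Complete the task:\n{task}")  # generator's first attempt
    for _ in range(max_rounds):
        verdict = llm(f"Task: {task}\nOutput: {output}\n"
                      "Score 0-10 on the first line, then actionable feedback.")
        first_line, _, feedback = verdict.partition("\n")
        match = re.search(r"\d+", first_line)
        if match and int(match.group()) >= threshold:
            break  # quality criteria met; stop iterating
        output = llm(f"Task: {task}\nPrevious output: {output}\n"
                     f"Feedback: {feedback}\nProduce an improved version.")
    return output
```

In practice the generator and evaluator would use different prompts (or different models) so the critique is not just the generator grading its own work.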
When a single agent's capabilities are insufficient, you move to multi-agent systems — multiple specialized agents working together on a shared objective. The orchestration pattern you choose determines how those agents communicate, who has authority, and how conflicts are resolved. These patterns have direct analogs in organizational design, which is not a coincidence. The challenges of coordinating autonomous entities are the same whether those entities are humans or software agents.
Guardian agents: dedicated evaluator agents monitor operational agents in real time. The pattern is not optional for high-stakes environments; it is mandatory. Guardian agents check outputs against safety constraints, compliance rules, and behavioral boundaries before results reach end users. This pattern maps directly to the Cloud Security Alliance (CSA) MAESTRO taxonomy's agent security layer and to the Open Worldwide Application Security Project (OWASP) excessive-agency risk controls. In multi-tool agentic systems, guardian agents are the last line of defense against compositional risk.
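A minimal fail-closed guardian sketch; the policy text, the PASS/BLOCK verdict format, and the `llm()` stub are placeholders:

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for your chat-completion client."""
    raise NotImplementedError

# Hypothetical policy; in practice this comes from compliance and security teams.
POLICY = "No personal data, no financial advice, no instructions for harm."

def guardian_check(candidate: str) -> tuple[bool, str]:
    verdict = llm(f"Policy: {POLICY}\nCandidate output: {candidate}\n"
                  "Reply 'PASS' or 'BLOCK: <reason>'.")
    return verdict.startswith("PASS"), verdict

def guarded_respond(agent_output: str) -> str:
    ok, verdict = guardian_check(agent_output)
    if not ok:
        # Fail closed: never ship an output the guardian rejected.
        return "This response was withheld by a safety check."
    return agent_output
```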
Selecting the right pattern is a function of four variables: task predictability, error tolerance, latency budget, and scale. No single pattern dominates across all four. The art is matching your constraints to the pattern that satisfies them with the minimum viable complexity.
One practical heuristic from teams in production: if you can draw the workflow on a whiteboard before building it, use a workflow pattern. If the task decomposition depends on the input, use an agent pattern. The first category gives you predictability and testability. The second gives you flexibility at the cost of observability.
Many production systems combine patterns across layers. A routing workflow at the front end directs queries to specialized agents, each of which uses ReAct internally, with a guardian agent monitoring all outputs. This layered approach gives you the benefits of structured workflows (predictability at the system level) with the flexibility of agentic patterns (adaptability at the task level). Our Blueprint Quest walks you through exactly this kind of multi-layer architecture decision.
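One way such a layered system might compose, shown schematically with stub functions, each of which abbreviates a fuller sketch from earlier in the article:

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for your chat-completion client."""
    raise NotImplementedError

def route(query: str) -> str:
    """Front end, workflow layer: pick a specialist (see the routing sketch)."""
    return llm(f"Classify as billing, technical, or account: {query}").strip().lower()

def specialist_agent(label: str, query: str) -> str:
    """Task layer: each specialist could run a ReAct loop internally."""
    return llm(f"You are the {label} specialist. Resolve: {query}")

def guardian(candidate: str) -> bool:
    """System layer: the guardian screens every output before release."""
    return llm(f"Does this violate policy? Reply YES or NO.\n{candidate}").strip() == "NO"

def handle(query: str) -> str:
    answer = specialist_agent(route(query), query)
    return answer if guardian(answer) else "Response withheld by safety check."
```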
Each major agent framework takes a different approach to implementing these patterns. The framework choice constrains which patterns are easy, which are possible, and which require custom engineering. Here is how the leading frameworks map to the pattern landscape as of early 2026. For a deep-dive comparison, see LangChain vs. LangGraph vs. LlamaIndex: Choosing Your Agent Framework.
| Framework | Architecture Model | Best Pattern Fit | Production Readiness |
|---|---|---|---|
| LangChain / LangGraph | Graph-based, centralized control with explicit state management | All workflow patterns, hierarchical multi-agent, ReAct | High — production-grade with LangSmith observability |
| AutoGen (Microsoft) | Event-driven, decentralized agent choreography | Group chat / decentralized, concurrent, evaluator-optimizer | High — enterprise backing, v0.4 rewrite improved stability |
| CrewAI | Role-based crews with sequential or hierarchical processes | Hierarchical, sequential pipeline, orchestrator-workers | Medium — rapid prototyping, growing production adoption |
| LlamaIndex | Document-centric with Retrieval-Augmented Generation (RAG) optimization | Prompt chaining, ReAct with retrieval, sequential | High — strongest for knowledge-intensive agent tasks |
| Akka (Lightbend) | Actor-based message passing, enterprise-grade stateful agents | Concurrent, hierarchical, guardian (supervisor hierarchy built-in) | High — battle-tested in distributed systems, newer to AI agents |
The framework ecosystem is evolving rapidly. Model Context Protocol (MCP) is emerging as a standardized tool integration layer across frameworks, and Google's Agent-to-Agent (A2A) protocol aims to enable cross-framework agent communication. Both reduce the lock-in risk of any single framework choice, which is the most important trend to watch in this space. See Cloud Agent Platforms for how the hyperscalers are implementing these patterns at scale.
- Anthropic, "Building Effective Agents" (2024)
- Wei et al., "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" (2022)
- Yao et al., "Tree of Thoughts: Deliberate Problem Solving with Large Language Models" (2023)
- Yao et al., "ReAct: Synergizing Reasoning and Acting in Language Models" (2022)
- Shinn et al., "Reflexion: Language Agents with Verbal Reinforcement Learning" (2023)
- Cloud Security Alliance, "MAESTRO: AI Agent Threat Taxonomy"
- Stanford HAI, "2025 AI Index Report"
Ready to build? Try the Blueprint Quest to design your agent architecture interactively, or explore the full Agentic AI Hub for security, governance, and framework deep-dives.