Tool Misuse, Excessive Agency, and the MCP Compositional Risk
When authorized tools become attack vectors and protocol composability creates compound vulnerabilities
Prompt injection gets the headlines, but excessive agency may be the more dangerous structural problem. The OWASP Top 10 for LLM Applications classifies excessive agency as LLM08 and frames it explicitly: "Agents are purpose-built for agency — they autonomously select tools, execute multi-step plans, and chain actions together. Excessive Agency in agentic systems is not a misconfiguration edge case but a fundamental design tension." That distinction matters. This is not a bug to be patched. It is a tension to be managed across every agent deployment.
Traditional software has static permission boundaries. An API endpoint either has access to a database table or it does not. Agentic systems break this model because the agent dynamically decides which tools to invoke, in what order, and with what parameters. The attack surface is not the tools themselves but the agent's autonomous authority to use them.
The CSA MAESTRO framework reinforces this at Layer 4 (Tool & API Integration), where tool misuse and excessive agency are classified together as the first threat, carrying a Critical severity rating. Across OWASP and MAESTRO, the consensus is clear: how you scope an agent's operational authority is the single most consequential security decision in any agentic deployment. Documenting those authority boundaries in a Behavioral Bill of Materials is the governance counterpart to these security controls.
OWASP identifies three distinct root causes behind excessive agency, each representing a different failure mode. Understanding the distinction is critical because the mitigations differ for each.
- Excessive Functionality: Can you list every tool your agent has access to? Is each one required for its intended operation?
- Excessive Permissions: Does each tool operate with the minimum required privileges? Could a read-only scope replace an admin scope?
- Excessive Autonomy: Are high-impact actions — deletions, financial transactions, external communications — gated by human approval before execution?
These three causes compound. An agent with excessive functionality, operating under excessive permissions, with no human-in-the-loop for high-impact actions is the maximum-risk configuration. And it is exactly the configuration that many development teams deploy during prototyping, then never tighten before production.
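The three audit questions above can be asked mechanically. The sketch below checks a tool manifest against each root cause; the manifest structure and the `REQUIRED_TOOLS` and `HIGH_IMPACT` sets are illustrative assumptions, not a real framework API.

```python
# Hypothetical audit of an agent's tool manifest against the three
# OWASP excessive-agency root causes. All names are illustrative.

REQUIRED_TOOLS = {"search_docs", "create_ticket"}            # what this agent actually needs
HIGH_IMPACT = {"delete_record", "send_email", "transfer_funds"}

def audit_manifest(manifest):
    """Return one finding per root-cause violation."""
    findings = []
    for tool in manifest:
        name, scopes, gated = tool["name"], set(tool["scopes"]), tool["requires_approval"]
        if name not in REQUIRED_TOOLS:                       # excessive functionality
            findings.append(f"{name}: not required for this agent's purpose")
        if "admin" in scopes and "read" in scopes:           # excessive permissions
            findings.append(f"{name}: admin scope where read-only may suffice")
        if name in HIGH_IMPACT and not gated:                # excessive autonomy
            findings.append(f"{name}: high-impact action without human approval gate")
    return findings

manifest = [
    {"name": "search_docs", "scopes": {"read"}, "requires_approval": False},
    {"name": "delete_record", "scopes": {"admin", "read"}, "requires_approval": False},
]
for finding in audit_manifest(manifest):
    print(finding)
```

Running an audit like this in CI, before every deployment, is one way to catch the prototype-era configuration before it reaches production.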
Beyond these three root causes, OWASP identifies two additional attack vectors specific to multi-agent architectures. The confused deputy problem occurs when an agent operating with service-level privileges is tricked into performing unauthorized actions on behalf of a lower-privileged user. Delegation chain escalation occurs in multi-agent systems when each agent-to-agent handoff expands the effective permission scope, allowing privileges to accumulate along the chain. We cover the confused deputy problem in detail below.
OWASP draws a critical distinction between excessive agency and tool misuse. Excessive agency means the agent has too much capability, permission, or autonomy in the first place. Tool misuse means the tools are used within authorized boundaries but for unintended purposes. The OWASP Agentic Threats classification (AGENT-T02) defines it directly: "Attackers manipulate AI agents to abuse their integrated tools through deceptive prompts or commands while operating within the agent's authorized permissions."
This includes agent hijacking, where an agent ingests adversarial data and subsequently executes unintended tool interactions while remaining within its authorized permission scope. NIST identified agent hijacking as a key emerging threat in January 2025. The distinction matters for defense: you cannot prevent tool misuse by restricting permissions alone, because the agent is already operating within its authorized scope. You need behavioral monitoring and parameter validation.
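Parameter validation at the tool boundary can be sketched minimally as below. The schema format, tool name, and field checks are illustrative assumptions; production systems would typically use JSON Schema or Pydantic rather than hand-rolled lambdas.

```python
# Minimal sketch of tool-boundary parameter validation. The tool name
# ("run_query") and its fields are hypothetical examples.

SCHEMAS = {
    "run_query": {
        "table": lambda v: v in {"orders", "products"},    # allowlist, never raw model output
        "limit": lambda v: isinstance(v, int) and 0 < v <= 100,
    }
}

def validate_call(tool, params):
    schema = SCHEMAS.get(tool)
    if schema is None:
        raise PermissionError(f"unknown tool: {tool}")
    for field, check in schema.items():
        if field not in params or not check(params[field]):
            raise ValueError(f"invalid parameter: {field}")
    # Reject extra parameters the model may have hallucinated or an
    # attacker may have injected alongside the legitimate ones.
    extra = set(params) - set(schema)
    if extra:
        raise ValueError(f"unexpected parameters: {sorted(extra)}")
    return True
```

The key design choice is that validation happens in the tool dispatcher, outside the model: a hijacked agent can request anything, but only calls that pass the schema ever execute.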
The agent ingests adversarial data — embedded in a document, email, or web page — that causes unintended tool invocations while remaining within its authorized permission scope. The agent is not exceeding its authority. It is being directed by an attacker who has found a way to inject instructions into the agent's data stream.
Source: OWASP AGENT-T02 • NIST Agent Hijacking Evaluations (Jan 2025)

The attacker manipulates the agent into chaining multiple legitimate tools in a sequence that achieves an unauthorized objective. Each individual tool call is within scope. The composite outcome is not. For example, an agent might retrieve sensitive data via an external API and embed it in a user-visible response through another tool, bypassing intended security controls. No single tool call violates policy, but the chain produces an unauthorized result.
Source: OWASP AGENT-T02 • OWASP ASI T&M v1.0a, pp. 13-14

Attackers exploit the Model Context Protocol tool interface to trick agents into calling tools with maliciously crafted parameters. Because MCP provides a standardized interface for tool discovery and invocation, a compromised or malicious MCP server can serve poisoned tool descriptions that alter how the agent interprets tool functionality.
Source: OWASP AGENT-T02 • CSA MAESTRO L4-T4

The confused deputy is a classic security vulnerability, but agentic AI makes it structurally worse. The OWASP ASI document provides the canonical definition for agentic contexts: "A Confused Deputy vulnerability arises when an AI agent (the 'deputy') has higher privileges than the user but is tricked into performing unauthorized actions on the user's behalf. This typically occurs when an agent lacks proper privilege isolation and cannot distinguish between legitimate user requests and adversarial injected instructions."
Traditional confused deputy attacks require finding a privileged service and tricking it. In agentic AI, agents are designed to be deputies — they act on behalf of users. They typically operate under Non-Human Identities (NHIs) with service-level credentials. And unlike traditional user authentication, NHIs may lack session-based oversight, increasing the risk of privilege misuse or token abuse.
"If an AI agent is allowed to execute database queries but does not properly validate user input, an attacker could trick it into executing high-privilege queries that the attacker themselves would not have direct access to."
OWASP ASI Threats & Mitigations v1.0a, p. 13

The practical impact is severe. Agents can chain multiple tools in unexpected ways, bypassing intended security controls. OWASP documents a specific pattern where an agent retrieves sensitive data via an external API and embeds it in a user-visible response through another tool. The individual operations are authorized. The composite action is a data breach.
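Defending against this pattern requires policy at the level of tool sequences, not individual calls. A minimal sketch, assuming hypothetical tool names and a hand-written forbidden-pair list:

```python
# Sketch of a composite-action check: each call is individually authorized,
# but certain source-to-sink *sequences* are treated as violations.
# Tool names and the forbidden pattern are illustrative assumptions.

FORBIDDEN_CHAINS = [
    # data retrieved from outside must not flow into user-visible output
    ("fetch_external_api", "render_user_response"),
]

def check_chain(history, next_tool):
    """Reject next_tool if it would complete a forbidden sequence."""
    for source, sink in FORBIDDEN_CHAINS:
        if next_tool == sink and source in history:
            return False
    return True

history = ["fetch_external_api"]
check_chain(history, "render_user_response")  # blocked: composite is a breach
check_chain(history, "summarize_text")        # allowed: no forbidden pair
```

Real systems would track tainted data flow rather than bare tool names, but the principle is the same: authorization must consider the chain, not just the call.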
Non-Human Identity (NHI) risks amplify the confused deputy problem. NHIs — machine accounts, service identities, and agent-based API keys — play a key role in agentic AI security. Agents often operate under NHIs when interfacing with cloud services, databases, and external tools. When an agent's NHI tokens are exploited, the attacker gains the agent's full service-level access, not just the invoking user's permissions. OWASP classifies NHI token abuse as the first attack vector under Privilege Compromise (AGENT-T03).
Dynamic permission escalation compounds the risk further. OWASP notes that agentic AI "redefines privilege compromise because it goes beyond predefined actions and will exploit any misconfigurations or gaps in dynamic access." Implicit privilege escalation can occur when AI agents inherit excessive permissions from user sessions or service tokens, leading to unauthorized operations that no single configuration review would catch.
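One mitigation for both the confused deputy and implicit escalation is scope intersection: the effective permission set for any request is the agent's NHI scopes intersected with the invoking user's scopes. A minimal sketch, with illustrative scope names:

```python
# Confused-deputy mitigation sketch: the agent may hold broad service-level
# (NHI) scopes, but a request on a user's behalf executes with no more than
# that user could do directly. Scope names are illustrative.

def effective_scopes(agent_scopes, user_scopes):
    return agent_scopes & user_scopes

AGENT_NHI = {"db:read", "db:write", "db:admin"}
user = {"db:read"}

granted = effective_scopes(AGENT_NHI, user)
# granted == {"db:read"}: db:admin never reaches the request path,
# so an injected "high-privilege query" has nothing to escalate into.
```

The intersection must be computed per request, in the tool layer, so a poisoned prompt cannot widen it.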
The Model Context Protocol (MCP) creates a universal integration surface where tool discovery, invocation, and data exchange occur dynamically between agents and external services. That universality is the value proposition. It is also the attack surface.
CSA MAESTRO classifies MCP Compositional Risk as L4-T4 (High severity): "Attackers can exploit MCP's compositional nature by registering malicious tool servers, poisoning tool descriptions to manipulate agent behavior, or intercepting the standardized protocol to inject unauthorized tool calls into agent workflows."
The CSA Agentic AI Red Teaming Guide specifically calls out cross-server attacks, instructing testers to "assess the ability of the agent to ignore one of its integrated MCP server's instructions to hijack/change control flow for another MCP server connected to the same agent." This is not theoretical. When an agent connects to multiple MCP servers, a compromised server can inject instructions that redirect the agent's interactions with other, legitimate servers.
An attacker registers a tool server that appears legitimate but serves malicious functionality. Because MCP standardizes how agents discover and connect to tool servers, a convincingly named malicious server can intercept tool invocations intended for legitimate services.
Source: CSA MAESTRO L4-T4

The attacker modifies tool descriptions to manipulate how the agent interprets and invokes tools. Since agents rely on natural-language tool descriptions to decide when and how to use tools, poisoned descriptions can cause the agent to invoke tools with parameters the attacker controls or to select a malicious tool over a legitimate one.
Source: CSA MAESTRO L4-T4

The attacker intercepts the standardized MCP protocol to inject unauthorized tool calls into agent workflows. The standardization of MCP means that once an attacker understands the protocol format, they can craft injections that the agent processes as legitimate tool interactions.
Source: CSA MAESTRO L4-T4

The attacker registers tools with names that collide with legitimate tools to hijack invocations. If two MCP servers expose tools with the same name, the agent may route calls to the attacker's server instead of the intended one, especially in environments without strict server-level namespace enforcement.
Source: CSA MAESTRO L4-T4

The compositional nature of MCP means these vectors compound. A tool namespace collision combined with tool description poisoning can redirect an agent's entire workflow to attacker-controlled infrastructure without triggering any single-tool permission violation. This is the fundamental challenge: the security model for individual tools does not account for the emergent behavior of tool composition.
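Two of these vectors can be countered at registration time: rejecting tool-name collisions across servers and pinning tool descriptions to known-good hashes so a poisoned description is refused. A sketch, with hypothetical server and tool names:

```python
# Registration-time MCP defenses (sketch): detect cross-server tool-name
# collisions and verify descriptions against pinned hashes. The server
# names, tool names, and pin store are illustrative assumptions.
import hashlib

def fingerprint(description: str) -> str:
    return hashlib.sha256(description.encode()).hexdigest()

# Known-good manifest, captured when the server was first audited.
PINNED = {("files-server", "read_file"): fingerprint("Read a file by path.")}

def register_tools(servers):
    seen = {}
    for server, tools in servers.items():
        for name, description in tools.items():
            if name in seen:                              # namespace collision
                raise ValueError(f"collision: {name} on {server} and {seen[name]}")
            pin = PINNED.get((server, name))
            if pin and pin != fingerprint(description):   # poisoned description
                raise ValueError(f"description drift for {server}/{name}")
            seen[name] = server
    return seen

servers = {"files-server": {"read_file": "Read a file by path."}}
register_tools(servers)  # passes: no collision, description matches pin
```

Pinning turns description poisoning from a silent behavioral change into a hard registration failure, which is exactly the property the compositional attack relies on not having.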
The OWASP Securing Agentic Applications Guide defines a taxonomy of operational capabilities (KC6) that maps directly to tool misuse risk profiles. Each capability type represents a different attack surface with different consequences when exploited. The risk escalates from parameter pollution at the limited end to catastrophic failure at the critical systems end.
The OWASP framework also flags code execution agents as a distinct high-severity category (AGENT-T10). "Many agent frameworks explicitly enable code generation and execution as a core capability. Data analysis agents execute Python code. DevOps agents run shell commands. Code assistant agents generate and test code. This creates a direct bridge from prompt injection to remote code execution." The prompt-to-RCE pipeline — where a prompt injection leads to malicious code generation, then execution, then system compromise — represents one of the most severe threat chains in agentic security.
A related but often-overlooked vector is hallucinated package installation, where an agent hallucinates non-existent package names and installs attacker-controlled typosquatted packages. This converts a hallucination into a supply chain compromise without any adversarial input required.
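A simple guard is to gate every install request against a vetted allowlist and flag near-miss names as likely typosquats. The package names and cutoff below are illustrative assumptions, not a recommended production policy:

```python
# Sketch of a hallucinated-package guard: the agent's install requests are
# checked against a vetted allowlist; near-misses are flagged as probable
# typosquats. The VETTED set and 0.8 cutoff are illustrative.
import difflib

VETTED = {"requests", "numpy", "pandas"}

def approve_install(package: str):
    if package in VETTED:
        return True
    close = difflib.get_close_matches(package, sorted(VETTED), n=1, cutoff=0.8)
    if close:
        # Likely typosquat of a package the agent meant to install.
        raise ValueError(f"{package!r} rejected; did you mean {close[0]!r}?")
    raise ValueError(f"{package!r} is not on the vetted list")

approve_install("requests")      # allowed: exact match on the vetted list
# approve_install("reqeusts")    # raises: near-miss flagged as typosquat
```

The important property is fail-closed behavior: a hallucinated name that matches nothing is rejected outright rather than forwarded to the package index.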
These are not hypothetical scenarios. They are documented incidents and formalized threat model scenarios from OWASP's reference architectures. Each illustrates a different dimension of tool misuse and excessive agency in practice.
Not all agentic architectures carry the same tool misuse risk. The OWASP ASI document and the Securing Agentic Applications Guide identify specific vulnerability profiles for each common pattern. Your architecture choice is a security decision.
| Pattern | Risk Level | Key Vulnerability |
|---|---|---|
| Tool Use Pattern | Highest | Direct tool invocation controlled by LLM output |
| ReAct (Reason + Act) | High | Interleaved reasoning/action cycles amplify injection-to-execution chains |
| Hierarchical Agent | High | Orchestrator compromise cascades to all sub-agents |
| Collaborative Swarm | Very High | Peer trust assumptions lead to cascading compromise |
| Reflection Pattern | Medium | Self-critique loops can be manipulated to justify tool misuse |
| RAG Pattern | Medium | Poisoned retrieval results can direct tool invocation |
Among multi-agent patterns, the collaborative swarm carries the highest risk profile because peer trust assumptions mean a single compromised agent can cascade across the entire swarm. There is no central orchestrator to enforce policy boundaries. Hierarchical patterns concentrate risk at the orchestrator: if the orchestrator is compromised, every sub-agent inherits the compromise. These are not implementation bugs. They are architectural trade-offs that must be weighed against the operational benefits each pattern provides.
The supply chain dimension adds another layer. MAESTRO L4-T6 identifies that agentic frameworks like LangChain, AutoGen, and CrewAI introduce supply chain risks through compromised dependencies, malicious tool packages, or vulnerable framework versions. As OWASP cautions: "Frameworks are great for proof of concept, but when implemented in production create more dependencies. If you modify a framework to fit your unique needs, then later that framework is patched due to a vulnerability being discovered, you're left needing a rapid update that could have significant impacts on your system."
The defenses converge on a single organizing principle: apply complete mediation — enforce authorization in downstream systems, not in the LLM. OWASP states this explicitly. Security controls must exist at the tool and API level. You cannot rely on the agent to self-police. The model is not a security boundary.
The OWASP Securing Agentic Applications Guide adds a critical design pattern: when an agent executes against an API, database, or entity that has granular user permissions, the agent should assume the permission of the user who invoked it. This "user privilege assumption" pattern enforces granular security controls on the agent, preventing it from returning information the user should not have access to.
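The user privilege assumption pattern can be sketched with a row-level filter: the invoking user's identity is bound server-side, so the agent's query can only touch that user's data regardless of what the model asks for. SQLite and the ticket schema here are illustrative assumptions.

```python
# Sketch of the "user privilege assumption" pattern: the agent's database
# access runs in the invoking user's context. Schema is illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER, owner TEXT, body TEXT)")
conn.executemany("INSERT INTO tickets VALUES (?, ?, ?)",
                 [(1, "alice", "reset password"), (2, "bob", "billing issue")])

def agent_fetch_tickets(user):
    # The user identity is bound as a query parameter by the tool layer;
    # the agent never issues an unscoped SELECT, whatever the model requests.
    return conn.execute(
        "SELECT id, body FROM tickets WHERE owner = ?", (user,)
    ).fetchall()

print(agent_fetch_tickets("alice"))  # only alice's rows are reachable
```

In production the same effect is usually achieved with database-native row-level security or per-user connection credentials rather than an application-layer WHERE clause, but the invariant is identical: the agent cannot return rows the invoking user could not read directly.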
- Minimize tools to only those necessary for the agent's purpose
- Avoid open-ended tools (shell access, unrestricted URL fetching)
- Enforce least-privilege permissions on all tool connections
- Execute operations in the user's security context, not shared service accounts
- Require human approval for high-impact and irreversible actions
- Implement rate limiting on tool invocations
- Enforce strict tool access verification and parameter validation
- Monitor tool usage patterns for anomalous sequences
- Validate agent instructions against expected behavioral baselines
- Set clear operational boundaries with tool-specific rate limits
- Implement execution logs tracking all tool calls for anomaly detection
- MCP server allowlisting with cryptographic identity verification
- Tool description integrity validation against known-good manifests
- Transport-level encryption and authentication for all MCP connections
- Tool capability auditing before granting agent access to new MCP servers
- Granular RBAC and ABAC with dynamic access validation
- Down-scope agent privileges when operating on behalf of users
- Time-based restrictions on privilege elevation with automatic downgrade
- Block cross-agent privilege delegation unless explicitly authorized
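Two of the detective controls above, per-tool rate limiting and execution logging, can be combined in a small gate. The limits, window sizes, and tool names are illustrative assumptions:

```python
# Sketch of per-tool rate limiting with an execution log for anomaly review.
# LIMITS values (5 calls per 60s for send_email) are illustrative.
import time
from collections import defaultdict, deque

LIMITS = {"send_email": (5, 60.0)}       # tool -> (max calls, window seconds)

class ToolGate:
    def __init__(self):
        self.calls = defaultdict(deque)  # tool -> recent invocation timestamps
        self.log = []                    # execution log: (time, tool, decision)

    def allow(self, tool, now=None):
        now = time.monotonic() if now is None else now
        max_calls, window = LIMITS.get(tool, (float("inf"), 0.0))
        recent = self.calls[tool]
        while recent and now - recent[0] > window:   # drop expired timestamps
            recent.popleft()
        if len(recent) >= max_calls:
            self.log.append((now, tool, "RATE_LIMITED"))
            return False
        recent.append(now)
        self.log.append((now, tool, "ALLOWED"))
        return True

gate = ToolGate()
results = [gate.allow("send_email", now=float(i)) for i in range(7)]
# First five calls inside the window pass; the sixth and seventh are blocked.
```

A burst of `RATE_LIMITED` entries in the log is itself a detection signal: an agent suddenly hammering one tool is exactly the anomalous sequence the monitoring controls above are meant to surface.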
The OWASP Securing Agentic Applications Guide provides architecture-specific security guidance. The defenses you need depend on the agentic pattern you deploy. A single-agent system with kill-switch capabilities has a fundamentally different security profile than a swarm with peer-to-peer trust.
For single-agent deployments, the OWASP guide recommends:
- Implement monitoring for anomalies: resource monitoring, IO monitoring, and behavioral monitoring
- Set up alerts for suspicious events that indicate tool misuse patterns
- Build emergency off-switches (kill switches) to immediately revoke access privileges
- Implement input validation and content filtering at the agent boundary
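A kill switch for a single-agent deployment can be as simple as a shared flag checked on every tool dispatch, plus immediate credential revocation. The token store and tool interface below are illustrative assumptions:

```python
# Kill-switch sketch: flipping the switch halts all tool dispatch and
# clears the agent's NHI credentials. Structure is illustrative.
import threading

class Agent:
    def __init__(self, tokens):
        self.tokens = tokens              # NHI credentials the agent holds
        self.killed = threading.Event()   # flipped by a human operator

    def kill(self):
        self.killed.set()
        self.tokens.clear()               # revoke access privileges at once

    def invoke(self, tool, *args):
        if self.killed.is_set():
            raise RuntimeError("agent halted by kill switch")
        return tool(*args)

agent = Agent(tokens={"db": "service-token"})
agent.invoke(lambda x: x + 1, 41)   # normal operation
agent.kill()
# Further invoke() calls now raise, and no tokens remain to abuse.
```

The important property is that the switch lives outside the model's control: it is checked by the dispatcher, so no prompt content can talk the agent out of being halted.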
For hierarchical architectures, the orchestrator is both the control point and the single point of failure:
- Inter-agent communication validation at the orchestrator level
- Orchestrator-level security monitoring with policy enforcement
- Task delegation audit trails tracking every agent-to-agent handoff
- Harden the orchestrator as the highest-priority security target
Among multi-agent architectures, swarm patterns carry the highest tool misuse risk. Without a central authority, security must be distributed:
- Peer-to-peer trust verification before accepting delegated tasks
- Distributed monitoring across all swarm participants
- Consensus-based security decisions for high-impact actions
- Blast radius testing to measure cascade effects when one agent is compromised
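Consensus-based gating for high-impact actions can be sketched as a simple quorum check over peer votes. The voting interface and the 0.66 threshold are illustrative assumptions:

```python
# Sketch of consensus-based security decisions in a swarm: a high-impact
# action executes only if a quorum of peers independently approves.
# The quorum threshold is an illustrative choice.

def quorum_approved(votes, quorum=0.66):
    """votes: mapping of agent id -> bool approval."""
    if not votes:
        return False                      # fail closed with no voters
    approvals = sum(1 for v in votes.values() if v)
    return approvals / len(votes) >= quorum

votes = {"agent-a": True, "agent-b": True, "agent-c": False}
quorum_approved(votes)   # 2 of 3 approve, which meets a 0.66 quorum
```

The security value is that a single compromised agent cannot unilaterally trigger the action; it must also subvert enough peers to reach quorum, which is what blast radius testing measures.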
The CSA Red Teaming Guide adds concrete testing procedures for each architecture type, including authorization and control hijacking tests, permission escalation testing (checking whether agents retain temporary privileges after task completion), role inheritance exploitation testing, least privilege validation, and separation of control plane from execution environment. For organizations deploying agents that interact with critical systems, the guide recommends physical system manipulation testing, IoT device interaction testing, and impact chain and blast radius assessment to measure how far unauthorized actions propagate through connected systems.
For a deeper look at the broader agentic AI threat landscape including the OWASP, MITRE ATLAS, and CSA MAESTRO frameworks that inform these defenses, see our companion article. For the prompt injection threat that drives many tool misuse scenarios, see Prompt Injection in Agentic Systems.
Ready to test your agent security knowledge? The Agent Blueprint Quest walks you through real architecture decisions including tool scoping, permission models, and orchestration security. For the latest threat intelligence, visit the Security News Center. Organizations subject to regulation will find the compliance implications of excessive agency covered in depth at the EU AI Act Hub and the NIST AI RMF Hub. For enterprise-level governance strategies that address tool misuse at the policy layer, see the AI Governance Hub. Or explore the full Secure pillar for the complete threat defense stack.