Tool Misuse, Excessive Agency, and the MCP Compositional Risk
When authorized tools become attack vectors and protocol composability creates compound vulnerabilities
Prompt injection gets the headlines, but excessive agency may be the more dangerous structural problem. The OWASP Top 10 for LLM Applications classifies excessive agency as LLM08 and frames it explicitly: "Agents are purpose-built for agency — they autonomously select tools, execute multi-step plans, and chain actions together. Excessive Agency in agentic systems is not a misconfiguration edge case but a fundamental design tension." That distinction matters. This is not a bug to be patched. It is a tension to be managed across every agent deployment.
Traditional software has static permission boundaries. An API endpoint either has access to a database table or it does not. Agentic systems break this model because the agent dynamically decides which tools to invoke, in what order, and with what parameters. The attack surface is not the tools themselves but the agent's autonomous authority to use them.
The CSA MAESTRO framework reinforces this at Layer 4 (Tool & API Integration), where tool misuse and excessive agency are classified together as the first threat, carrying a Critical severity rating. Across OWASP and MAESTRO, the consensus is clear: how you scope an agent's operational authority is the single most consequential security decision in any agentic deployment. Documenting those authority boundaries in a Behavioral Bill of Materials is the governance counterpart to these security controls.
OWASP identifies three distinct root causes behind excessive agency, each representing a different failure mode. Understanding the distinction is critical because the mitigations differ for each.
- Excessive Functionality: Can you list every tool your agent has access to? Is each one required for its intended operation?
- Excessive Permissions: Does each tool operate with the minimum required privileges? Could a read-only scope replace an admin scope?
- Excessive Autonomy: Are high-impact actions — deletions, financial transactions, external communications — gated by human approval before execution?
These three causes compound. An agent with excessive functionality, operating under excessive permissions, with no human-in-the-loop for high-impact actions is the maximum-risk configuration. And it is exactly the configuration that many development teams deploy during prototyping, then never tighten before production.
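The three audit questions above can be asked mechanically. The sketch below checks a tool manifest against each root cause; the manifest structure and the `REQUIRED_TOOLS` and `HIGH_IMPACT` sets are illustrative assumptions, not a real framework API.

```python
# Hypothetical audit of an agent's tool manifest against the three
# OWASP excessive-agency root causes. All names are illustrative.

REQUIRED_TOOLS = {"search_docs", "create_ticket"}            # what this agent actually needs
HIGH_IMPACT = {"delete_record", "send_email", "transfer_funds"}

def audit_manifest(manifest):
    """Return one finding per root-cause violation."""
    findings = []
    for tool in manifest:
        name, scopes, gated = tool["name"], set(tool["scopes"]), tool["requires_approval"]
        if name not in REQUIRED_TOOLS:                       # excessive functionality
            findings.append(f"{name}: not required for this agent's purpose")
        if "admin" in scopes and "read" in scopes:           # excessive permissions
            findings.append(f"{name}: admin scope where read-only may suffice")
        if name in HIGH_IMPACT and not gated:                # excessive autonomy
            findings.append(f"{name}: high-impact action without human approval gate")
    return findings

manifest = [
    {"name": "search_docs", "scopes": {"read"}, "requires_approval": False},
    {"name": "delete_record", "scopes": {"admin", "read"}, "requires_approval": False},
]
for finding in audit_manifest(manifest):
    print(finding)
```

Running an audit like this in CI, before every deployment, is one way to catch the prototype-era configuration before it reaches production.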
Beyond these three root causes, OWASP identifies two additional attack vectors specific to multi-agent architectures. The confused deputy problem occurs when an agent operating with service-level privileges is tricked into performing unauthorized actions on behalf of a lower-privileged user. Delegation chain escalation occurs in multi-agent systems when each agent-to-agent handoff expands the effective permission scope, allowing privileges to accumulate along the chain. We cover the confused deputy problem in detail below.
OWASP draws a critical distinction between excessive agency and tool misuse. Excessive agency means the agent has too much capability, permission, or autonomy in the first place. Tool misuse means the tools are used within authorized boundaries but for unintended purposes. The OWASP Agentic Threats classification (AGENT-T02) defines it directly: "Attackers manipulate AI agents to abuse their integrated tools through deceptive prompts or commands while operating within the agent's authorized permissions."
This includes agent hijacking, where an agent ingests adversarial data and subsequently executes unintended tool interactions while remaining within its authorized permission scope. NIST identified agent hijacking as a key emerging threat in January 2025. The distinction matters for defense: you cannot prevent tool misuse by restricting permissions alone, because the agent is already operating within its authorized scope. You need behavioral monitoring and parameter validation.
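Parameter validation at the tool boundary can be sketched minimally as below. The schema format, tool name, and field checks are illustrative assumptions; production systems would typically use JSON Schema or Pydantic rather than hand-rolled lambdas.

```python
# Minimal sketch of tool-boundary parameter validation. The tool name
# ("run_query") and its fields are hypothetical examples.

SCHEMAS = {
    "run_query": {
        "table": lambda v: v in {"orders", "products"},    # allowlist, never raw model output
        "limit": lambda v: isinstance(v, int) and 0 < v <= 100,
    }
}

def validate_call(tool, params):
    schema = SCHEMAS.get(tool)
    if schema is None:
        raise PermissionError(f"unknown tool: {tool}")
    for field, check in schema.items():
        if field not in params or not check(params[field]):
            raise ValueError(f"invalid parameter: {field}")
    # Reject extra parameters the model may have hallucinated or an
    # attacker may have injected alongside the legitimate ones.
    extra = set(params) - set(schema)
    if extra:
        raise ValueError(f"unexpected parameters: {sorted(extra)}")
    return True
```

The key design choice is that validation happens in the tool dispatcher, outside the model: a hijacked agent can request anything, but only calls that pass the schema ever execute.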
The agent ingests adversarial data — embedded in a document, email, or web page — that causes unintended tool invocations while remaining within its authorized permission scope. The agent is not exceeding its authority. It is being directed by an attacker who has found a way to inject instructions into the agent's data stream.
Source: OWASP AGENT-T02 • NIST Agent Hijacking Evaluations (Jan 2025)

The attacker manipulates the agent into chaining multiple legitimate tools in a sequence that achieves an unauthorized objective. Each individual tool call is within scope. The composite outcome is not. For example, an agent might retrieve sensitive data via an external API and embed it in a user-visible response through another tool, bypassing intended security controls. No single tool call violates policy, but the chain produces an unauthorized result.
Source: OWASP AGENT-T02 • OWASP ASI T&M v1.0a, pp. 13-14

Attackers exploit the Model Context Protocol tool interface to trick agents into calling tools with maliciously crafted parameters. Because MCP provides a standardized interface for tool discovery and invocation, a compromised or malicious MCP server can serve poisoned tool descriptions that alter how the agent interprets tool functionality.
Source: OWASP AGENT-T02 • CSA MAESTRO L4-T4

The confused deputy is a classic security vulnerability, but agentic AI makes it structurally worse. The OWASP ASI document provides the canonical definition for agentic contexts: "A Confused Deputy vulnerability arises when an AI agent (the 'deputy') has higher privileges than the user but is tricked into performing unauthorized actions on the user's behalf. This typically occurs when an agent lacks proper privilege isolation and cannot distinguish between legitimate user requests and adversarial injected instructions."
Traditional confused deputy attacks require finding a privileged service and tricking it. In agentic AI, agents are designed to be deputies — they act on behalf of users. They typically operate under Non-Human Identities (NHIs) with service-level credentials. And unlike traditional user authentication, NHIs may lack session-based oversight, increasing the risk of privilege misuse or token abuse.
"If an AI agent is allowed to execute database queries but does not properly validate user input, an attacker could trick it into executing high-privilege queries that the attacker themselves would not have direct access to."
OWASP ASI Threats & Mitigations v1.0a, p. 13

The practical impact is severe. Agents can chain multiple tools in unexpected ways, bypassing intended security controls. OWASP documents a specific pattern where an agent retrieves sensitive data via an external API and embeds it in a user-visible response through another tool. The individual operations are authorized. The composite action is a data breach.
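Defending against this pattern requires policy at the level of tool sequences, not individual calls. A minimal sketch, assuming hypothetical tool names and a hand-written forbidden-pair list:

```python
# Sketch of a composite-action check: each call is individually authorized,
# but certain source-to-sink *sequences* are treated as violations.
# Tool names and the forbidden pattern are illustrative assumptions.

FORBIDDEN_CHAINS = [
    # data retrieved from outside must not flow into user-visible output
    ("fetch_external_api", "render_user_response"),
]

def check_chain(history, next_tool):
    """Reject next_tool if it would complete a forbidden sequence."""
    for source, sink in FORBIDDEN_CHAINS:
        if next_tool == sink and source in history:
            return False
    return True

history = ["fetch_external_api"]
check_chain(history, "render_user_response")  # blocked: composite is a breach
check_chain(history, "summarize_text")        # allowed: no forbidden pair
```

Real systems would track tainted data flow rather than bare tool names, but the principle is the same: authorization must consider the chain, not just the call.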
Non-Human Identity (NHI) risks amplify the confused deputy problem. NHIs — machine accounts, service identities, and agent-based API keys — play a key role in agentic AI security. Agents often operate under NHIs when interfacing with cloud services, databases, and external tools. When an agent's NHI tokens are exploited, the attacker gains the agent's full service-level access, not just the invoking user's permissions. OWASP classifies NHI token abuse as the first attack vector under Privilege Compromise (AGENT-T03).
Dynamic permission escalation compounds the risk further. OWASP notes that agentic AI "redefines privilege compromise because it goes beyond predefined actions and will exploit any misconfigurations or gaps in dynamic access." Implicit privilege escalation can occur when AI agents inherit excessive permissions from user sessions or service tokens, leading to unauthorized operations that no single configuration review would catch.
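One mitigation for both the confused deputy and implicit escalation is scope intersection: the effective permission set for any request is the agent's NHI scopes intersected with the invoking user's scopes. A minimal sketch, with illustrative scope names:

```python
# Confused-deputy mitigation sketch: the agent may hold broad service-level
# (NHI) scopes, but a request on a user's behalf executes with no more than
# that user could do directly. Scope names are illustrative.

def effective_scopes(agent_scopes, user_scopes):
    return agent_scopes & user_scopes

AGENT_NHI = {"db:read", "db:write", "db:admin"}
user = {"db:read"}

granted = effective_scopes(AGENT_NHI, user)
# granted == {"db:read"}: db:admin never reaches the request path,
# so an injected "high-privilege query" has nothing to escalate into.
```

The intersection must be computed per request, in the tool layer, so a poisoned prompt cannot widen it.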
The Model Context Protocol (MCP) creates a universal integration surface where tool discovery, invocation, and data exchange occur dynamically between agents and external services. That universality is the value proposition. It is also the attack surface.
CSA MAESTRO classifies MCP Compositional Risk as L4-T4 (High severity): "Attackers can exploit MCP's compositional nature by registering malicious tool servers, poisoning tool descriptions to manipulate agent behavior, or intercepting the standardized protocol to inject unauthorized tool calls into agent workflows."
The CSA Agentic AI Red Teaming Guide specifically calls out cross-server attacks, instructing testers to "assess the ability of the agent to ignore one of its integrated MCP server's instructions to hijack/change control flow for another MCP server connected to the same agent." This is not theoretical. When an agent connects to multiple MCP servers, a compromised server can inject instructions that redirect the agent's interactions with other, legitimate servers.
An attacker registers a tool server that appears legitimate but serves malicious functionality. Because MCP standardizes how agents discover and connect to tool servers, a convincingly named malicious server can intercept tool invocations intended for legitimate services.
Source: CSA MAESTRO L4-T4

The attacker modifies tool descriptions to manipulate how the agent interprets and invokes tools. Since agents rely on natural-language tool descriptions to decide when and how to use tools, poisoned descriptions can cause the agent to invoke tools with parameters the attacker controls or to select a malicious tool over a legitimate one.
Source: CSA MAESTRO L4-T4

The attacker intercepts the standardized MCP protocol to inject unauthorized tool calls into agent workflows. The standardization of MCP means that once an attacker understands the protocol format, they can craft injections that the agent processes as legitimate tool interactions.
Source: CSA MAESTRO L4-T4

The attacker registers tools with names that collide with legitimate tools to hijack invocations. If two MCP servers expose tools with the same name, the agent may route calls to the attacker's server instead of the intended one, especially in environments without strict server-level namespace enforcement.
Source: CSA MAESTRO L4-T4

The compositional nature of MCP means these vectors compound. A tool namespace collision combined with tool description poisoning can redirect an agent's entire workflow to attacker-controlled infrastructure without triggering any single-tool permission violation. This is the fundamental challenge: the security model for individual tools does not account for the emergent behavior of tool composition.
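Two of these vectors can be countered at registration time: rejecting tool-name collisions across servers and pinning tool descriptions to known-good hashes so a poisoned description is refused. A sketch, with hypothetical server and tool names:

```python
# Registration-time MCP defenses (sketch): detect cross-server tool-name
# collisions and verify descriptions against pinned hashes. The server
# names, tool names, and pin store are illustrative assumptions.
import hashlib

def fingerprint(description: str) -> str:
    return hashlib.sha256(description.encode()).hexdigest()

# Known-good manifest, captured when the server was first audited.
PINNED = {("files-server", "read_file"): fingerprint("Read a file by path.")}

def register_tools(servers):
    seen = {}
    for server, tools in servers.items():
        for name, description in tools.items():
            if name in seen:                              # namespace collision
                raise ValueError(f"collision: {name} on {server} and {seen[name]}")
            pin = PINNED.get((server, name))
            if pin and pin != fingerprint(description):   # poisoned description
                raise ValueError(f"description drift for {server}/{name}")
            seen[name] = server
    return seen

servers = {"files-server": {"read_file": "Read a file by path."}}
register_tools(servers)  # passes: no collision, description matches pin
```

Pinning turns description poisoning from a silent behavioral change into a hard registration failure, which is exactly the property the compositional attack relies on not having.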
The OWASP Securing Agentic Applications Guide defines a taxonomy of operational capabilities (KC6) that maps directly to tool misuse risk profiles. Each capability type represents a different attack surface with different consequences when exploited. The risk escalates from parameter pollution at the limited end to catastrophic failure at the critical systems end.
The OWASP framework also flags code execution agents as a distinct high-severity category (AGENT-T10). "Many agent frameworks explicitly enable code generation and execution as a core capability. Data analysis agents execute Python code. DevOps agents run shell commands. Code assistant agents generate and test code. This creates a direct bridge from prompt injection to remote code execution." The prompt-to-RCE pipeline — where a prompt injection leads to malicious code generation, then execution, then system compromise — represents one of the most severe threat chains in agentic security.
A related but often-overlooked vector is hallucinated package installation, where an agent hallucinates non-existent package names and installs attacker-controlled typosquatted packages. This converts a hallucination into a supply chain compromise without any adversarial input required.
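A simple guard is to gate every install request against a vetted allowlist and flag near-miss names as likely typosquats. The package names and cutoff below are illustrative assumptions, not a recommended production policy:

```python
# Sketch of a hallucinated-package guard: the agent's install requests are
# checked against a vetted allowlist; near-misses are flagged as probable
# typosquats. The VETTED set and 0.8 cutoff are illustrative.
import difflib

VETTED = {"requests", "numpy", "pandas"}

def approve_install(package: str):
    if package in VETTED:
        return True
    close = difflib.get_close_matches(package, sorted(VETTED), n=1, cutoff=0.8)
    if close:
        # Likely typosquat of a package the agent meant to install.
        raise ValueError(f"{package!r} rejected; did you mean {close[0]!r}?")
    raise ValueError(f"{package!r} is not on the vetted list")

approve_install("requests")      # allowed: exact match on the vetted list
# approve_install("reqeusts")    # raises: near-miss flagged as typosquat
```

The important property is fail-closed behavior: a hallucinated name that matches nothing is rejected outright rather than forwarded to the package index.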
These are not hypothetical scenarios. They are documented incidents and formalized threat model scenarios from OWASP's reference architectures. Each illustrates a different dimension of tool misuse and excessive agency in practice.
Not all agentic architectures carry the same tool misuse risk. The OWASP ASI document and the Securing Agentic Applications Guide identify specific vulnerability profiles for each common pattern. Your architecture choice is a security decision.
| Pattern | Risk Level | Key Vulnerability |
|---|---|---|
| Tool Use Pattern | Highest | Direct tool invocation controlled by LLM output |
| ReAct (Reason + Act) | High | Interleaved reasoning/action cycles amplify injection-to-execution chains |
| Hierarchical Agent | High | Orchestrator compromise cascades to all sub-agents |
| Collaborative Swarm | Very High | Peer trust assumptions lead to cascading compromise |
| Reflection Pattern | Medium | Self-critique loops can be manipulated to justify tool misuse |
| RAG Pattern | Medium | Poisoned retrieval results can direct tool invocation |
Among multi-agent patterns, the collaborative swarm carries the highest risk profile because peer trust assumptions mean a single compromised agent can cascade across the entire swarm. There is no central orchestrator to enforce policy boundaries. Hierarchical patterns concentrate risk at the orchestrator: if the orchestrator is compromised, every sub-agent inherits the compromise. These are not implementation bugs. They are architectural trade-offs that must be weighed against the operational benefits each pattern provides.
The supply chain dimension adds another layer. MAESTRO L4-T6 identifies that agentic frameworks like LangChain, AutoGen, and CrewAI introduce supply chain risks through compromised dependencies, malicious tool packages, or vulnerable framework versions. As OWASP cautions: "Frameworks are great for proof of concept, but when implemented in production create more dependencies. If you modify a framework to fit your unique needs, then later that framework is patched due to a vulnerability being discovered, you're left needing a rapid update that could have significant impacts on your system."
The defenses converge on a single organizing principle: apply complete mediation — enforce authorization in downstream systems, not in the LLM. OWASP states this explicitly. Security controls must exist at the tool and API level. You cannot rely on the agent to self-police. The model is not a security boundary.
The OWASP Securing Agentic Applications Guide adds a critical design pattern: when an agent executes against an API, database, or entity that has granular user permissions, the agent should assume the permission of the user who invoked it. This "user privilege assumption" pattern enforces granular security controls on the agent, preventing it from returning information the user should not have access to.
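The user privilege assumption pattern can be sketched with a row-level filter: the invoking user's identity is bound server-side, so the agent's query can only touch that user's data regardless of what the model asks for. SQLite and the ticket schema here are illustrative assumptions.

```python
# Sketch of the "user privilege assumption" pattern: the agent's database
# access runs in the invoking user's context. Schema is illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER, owner TEXT, body TEXT)")
conn.executemany("INSERT INTO tickets VALUES (?, ?, ?)",
                 [(1, "alice", "reset password"), (2, "bob", "billing issue")])

def agent_fetch_tickets(user):
    # The user identity is bound as a query parameter by the tool layer;
    # the agent never issues an unscoped SELECT, whatever the model requests.
    return conn.execute(
        "SELECT id, body FROM tickets WHERE owner = ?", (user,)
    ).fetchall()

print(agent_fetch_tickets("alice"))  # only alice's rows are reachable
```

In production the same effect is usually achieved with database-native row-level security or per-user connection credentials rather than an application-layer WHERE clause, but the invariant is identical: the agent cannot return rows the invoking user could not read directly.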
- Minimize tools to only those necessary for the agent's purpose
- Avoid open-ended tools (shell access, unrestricted URL fetching)
- Enforce least-privilege permissions on all tool connections
- Execute operations in the user's security context, not shared service accounts
- Require human approval for high-impact and irreversible actions
- Implement rate limiting on tool invocations
- Enforce strict tool access verification and parameter validation
- Monitor tool usage patterns for anomalous sequences
- Validate agent instructions against expected behavioral baselines
- Set clear operational boundaries with tool-specific rate limits
- Implement execution logs tracking all tool calls for anomaly detection
- MCP server allowlisting with cryptographic identity verification
- Tool description integrity validation against known-good manifests
- Transport-level encryption and authentication for all MCP connections
- Tool capability auditing before granting agent access to new MCP servers
- Granular RBAC and ABAC with dynamic access validation
- Down-scope agent privileges when operating on behalf of users
- Time-based restrictions on privilege elevation with automatic downgrade
- Block cross-agent privilege delegation unless explicitly authorized
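Two of the detective controls above, per-tool rate limiting and execution logging, can be combined in a small gate. The limits, window sizes, and tool names are illustrative assumptions:

```python
# Sketch of per-tool rate limiting with an execution log for anomaly review.
# LIMITS values (5 calls per 60s for send_email) are illustrative.
import time
from collections import defaultdict, deque

LIMITS = {"send_email": (5, 60.0)}       # tool -> (max calls, window seconds)

class ToolGate:
    def __init__(self):
        self.calls = defaultdict(deque)  # tool -> recent invocation timestamps
        self.log = []                    # execution log: (time, tool, decision)

    def allow(self, tool, now=None):
        now = time.monotonic() if now is None else now
        max_calls, window = LIMITS.get(tool, (float("inf"), 0.0))
        recent = self.calls[tool]
        while recent and now - recent[0] > window:   # drop expired timestamps
            recent.popleft()
        if len(recent) >= max_calls:
            self.log.append((now, tool, "RATE_LIMITED"))
            return False
        recent.append(now)
        self.log.append((now, tool, "ALLOWED"))
        return True

gate = ToolGate()
results = [gate.allow("send_email", now=float(i)) for i in range(7)]
# First five calls inside the window pass; the sixth and seventh are blocked.
```

A burst of `RATE_LIMITED` entries in the log is itself a detection signal: an agent suddenly hammering one tool is exactly the anomalous sequence the monitoring controls above are meant to surface.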
The OWASP Securing Agentic Applications Guide provides architecture-specific security guidance. The defenses you need depend on the agentic pattern you deploy. A single-agent system with kill-switch capabilities has a fundamentally different security profile than a swarm with peer-to-peer trust.
For single-agent deployments, the OWASP guide recommends:
- Implement monitoring for anomalies: resource monitoring, IO monitoring, and behavioral monitoring
- Set up alerts for suspicious events that indicate tool misuse patterns
- Build emergency off-switches (kill switches) to immediately revoke access privileges
- Implement input validation and content filtering at the agent boundary
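A kill switch for a single-agent deployment can be as simple as a shared flag checked on every tool dispatch, plus immediate credential revocation. The token store and tool interface below are illustrative assumptions:

```python
# Kill-switch sketch: flipping the switch halts all tool dispatch and
# clears the agent's NHI credentials. Structure is illustrative.
import threading

class Agent:
    def __init__(self, tokens):
        self.tokens = tokens              # NHI credentials the agent holds
        self.killed = threading.Event()   # flipped by a human operator

    def kill(self):
        self.killed.set()
        self.tokens.clear()               # revoke access privileges at once

    def invoke(self, tool, *args):
        if self.killed.is_set():
            raise RuntimeError("agent halted by kill switch")
        return tool(*args)

agent = Agent(tokens={"db": "service-token"})
agent.invoke(lambda x: x + 1, 41)   # normal operation
agent.kill()
# Further invoke() calls now raise, and no tokens remain to abuse.
```

The important property is that the switch lives outside the model's control: it is checked by the dispatcher, so no prompt content can talk the agent out of being halted.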
For hierarchical architectures, the orchestrator is both the control point and the single point of failure:
- Inter-agent communication validation at the orchestrator level
- Orchestrator-level security monitoring with policy enforcement
- Task delegation audit trails tracking every agent-to-agent handoff
- Harden the orchestrator as the highest-priority security target
Among multi-agent architectures, swarm patterns carry the highest tool misuse risk. Without a central authority, security must be distributed:
- Peer-to-peer trust verification before accepting delegated tasks
- Distributed monitoring across all swarm participants
- Consensus-based security decisions for high-impact actions
- Blast radius testing to measure cascade effects when one agent is compromised
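Consensus-based gating for high-impact actions can be sketched as a simple quorum check over peer votes. The voting interface and the 0.66 threshold are illustrative assumptions:

```python
# Sketch of consensus-based security decisions in a swarm: a high-impact
# action executes only if a quorum of peers independently approves.
# The quorum threshold is an illustrative choice.

def quorum_approved(votes, quorum=0.66):
    """votes: mapping of agent id -> bool approval."""
    if not votes:
        return False                      # fail closed with no voters
    approvals = sum(1 for v in votes.values() if v)
    return approvals / len(votes) >= quorum

votes = {"agent-a": True, "agent-b": True, "agent-c": False}
quorum_approved(votes)   # 2 of 3 approve, which meets a 0.66 quorum
```

The security value is that a single compromised agent cannot unilaterally trigger the action; it must also subvert enough peers to reach quorum, which is what blast radius testing measures.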
The CSA Red Teaming Guide adds concrete testing procedures for each architecture type, including authorization and control hijacking tests, permission escalation testing (checking whether agents retain temporary privileges after task completion), role inheritance exploitation testing, least privilege validation, and separation of control plane from execution environment. For organizations deploying agents that interact with critical systems, the guide recommends physical system manipulation testing, IoT device interaction testing, and impact chain and blast radius assessment to measure how far unauthorized actions propagate through connected systems.
For a deeper look at the broader agentic AI threat landscape including the OWASP, MITRE ATLAS, and CSA MAESTRO frameworks that inform these defenses, see our companion article. For the prompt injection threat that drives many tool misuse scenarios, see Prompt Injection in Agentic Systems.
Ready to test your agent security knowledge? The Agent Blueprint Quest walks you through real architecture decisions including tool scoping, permission models, and orchestration security. For the latest threat intelligence, visit the Security News Center. Organizations subject to regulation will find the compliance implications of excessive agency covered in depth at the EU AI Act Hub and the NIST AI RMF Hub. For enterprise-level governance strategies that address tool misuse at the policy layer, see the AI Governance Hub. Or explore the full Secure pillar for the complete threat defense stack.