Agent Supply Chain Security: MCP Servers, Skill Registries, and Tool Poisoning

The ClawHavoc attack proved it: your agent's tools are your weakest link

Published 2026-04-06 | 2,847 words | 13 min read | 6 sources

Table of Contents

01 The Agent Supply Chain Problem
02 Anatomy of ClawHavoc
03 CVE-2026-25253: The 1-Click RCE Kill Chain
04 Beyond ClawHavoc: Broader Threat Landscape
05 The Defense Playbook

SEC.01

The Agent Supply Chain Problem

Every AI agent is only as secure as its tools. When an agent invokes a function, calls an API, or loads a skill from a marketplace, it executes code provided by a third party. This is not a theoretical concern — it is the operational reality of modern agent architectures. The Model Context Protocol (MCP) standardized how agents discover and invoke tools, but it also standardized the attack surface. Every MCP server an agent connects to is a potential entry point for adversarial code.

The traditional software supply chain (npm, PyPI, Docker Hub) already demonstrated that open ecosystems attract adversaries. The agent supply chain introduces something worse. When a developer installs a malicious npm package, the damage is bounded by what that code can do at build time or run time. When an agent loads a malicious skill, the damage extends through the agent's persistent memory, its tool-calling capabilities, and its autonomous execution authority. A poisoned tool does not just compromise the agent; it weaponizes it.

The Lethal Trifecta

What makes agent supply chain attacks categorically different from traditional software supply chain attacks is the convergence of three capabilities that agents possess and traditional software does not:

Deep system access. Agents connect to databases, file systems, APIs, and cloud services through their tool integrations. A compromised tool inherits all the access the agent has been granted. Unlike a malicious library that must find its own path to sensitive resources, a malicious agent tool rides the agent's existing permissions directly to the target.

Persistent memory. Agents maintain context across sessions. A single poisoned tool response can inject false information into the agent's long-term memory, corrupting every subsequent interaction. The contamination persists even after the malicious tool is removed, because the memory has already been altered.

Autonomous execution. Agents act without continuous human supervision. A compromised agent does not wait for a user to click a link or open a file — it executes its corrupted instructions autonomously, potentially chaining multiple tool calls to exfiltrate data, modify configurations, or establish persistence before any human operator notices the deviation.

"We told our employees they cannot install OpenClaw on their company laptops. There's massive security risk."

— Harrison Chase, CEO, LangChain

Harrison Chase's warning was prescient. Within days of OpenClaw's skill marketplace launching, the ClawHavoc campaign validated every concern about unvetted agent tool ecosystems. The attack demonstrated that the agent supply chain is not a future risk but an active threat with real-world victims.

SEC.02

Anatomy of ClawHavoc — The First Agent Supply Chain Attack

ClawHavoc was not a proof of concept. It was a coordinated, multi-stage supply chain attack executed against a live agent tool ecosystem with tens of thousands of exposed instances. The campaign exploited the fundamental trust model of skill marketplaces — the assumption that uploaded tools are what they claim to be — and weaponized it through a combination of social engineering, prompt injection, and traditional malware delivery.

Timeline

Late January 2026

OpenClaw launches ClawHub skill marketplace with open upload policy. The only requirement to publish a skill: a GitHub account registered for at least one week.

February 1, 2026

Koi Security discovers coordinated injection of malicious skills across the ClawHub marketplace. Multiple threat actors identified operating simultaneously.

February 5, 2026

Approximately 900 of 4,500 skills (20%) identified as malicious. The campaign had been active for at least four days before detection.

February 2026

CVE-2026-25253 formally disclosed with a CVSS score of 8.8. Separate vulnerability enabling 1-click remote code execution via CSRF/WebSocket hijacking.

The Kill Chain

The ClawHavoc attack chain was elegant in its simplicity and devastating in its scope. Each of its five steps exploited a different trust boundary in the agent ecosystem, cascading from a poisoned manifest file to full system compromise.

[Interactive figure: ClawHavoc Kill Chain Visualizer, with attack and defense views of the five steps and summary counters for instances exposed, malicious skills, instances without authentication, and CVSS score.]

Scale of Exposure

SecurityScorecard's STRIKE Team conducted an internet-wide scan that revealed the full scope of exposure: 42,900 OpenClaw instances were publicly accessible across 82 countries, and 15,200 of them (about 35%) were vulnerable to remote code execution. A staggering 93% lacked proper authentication, meaning anyone on the internet could connect and issue commands. The initial ClawHavoc campaign planted 341 malicious skills, a number that grew to roughly 900 within days.

Among the most notable malicious skills was one titled "What Would Elon Do?" — an attention-grabbing name designed to maximize downloads. The skill used embedded prompt injection to bypass the agent's safety checks and exfiltrate user data to attacker-controlled infrastructure. It accumulated significant installations before being flagged, demonstrating that social engineering in agent marketplaces follows the same playbook as traditional app store manipulation.

SEC.03

CVE-2026-25253 — The 1-Click RCE Kill Chain

While ClawHavoc exploited the trust model of skill marketplaces, CVE-2026-25253 exploited a fundamental architectural flaw in OpenClaw itself. The vulnerability was a CVSS 8.8 cross-site request forgery (CSRF) that chained into full remote code execution through a WebSocket hijacking attack. The entire exploit required nothing more than a victim clicking a single malicious link.

The Vulnerability

OpenClaw accepted a user-supplied gatewayUrl parameter from the browser's query string. When present, the application automatically established a WebSocket connection to the specified URL without any user confirmation or validation. Critically, OpenClaw transmitted the user's authentication token to this attacker-controlled server during the WebSocket handshake. This design violated the most basic principle of credential handling — never send authentication material to unverified endpoints.
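A minimal sketch of the missing check, assuming a client that must decide whether an endpoint may receive credentials. The names `ALLOWED_GATEWAYS` and `is_trusted_gateway` and the host `gateway.example.internal` are hypothetical and are not taken from OpenClaw's code. The idea is only that a token is released after an exact scheme, host, and port match against a fixed allowlist, never to a URL supplied in a query string.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of (scheme, host, port) tuples that may receive credentials.
ALLOWED_GATEWAYS = {("wss", "gateway.example.internal", 443)}

DEFAULT_PORTS = {"wss": 443, "ws": 80}


def is_trusted_gateway(url: str) -> bool:
    """Return True only for an exact scheme/host/port match on the allowlist."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for malformed or out-of-range ports
    except ValueError:
        return False
    # Reject embedded credentials such as wss://trusted-name@attacker.example/
    if parts.username or parts.password:
        return False
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if port is None:
        port = DEFAULT_PORTS.get(scheme)
    return (scheme, host, port) in ALLOWED_GATEWAYS
```

An exact-match allowlist is deliberately strict: prefix or substring checks are easy to bypass with hosts such as `gateway.example.internal.attacker.example`.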

The Exploit Chain

The full exploitation followed a four-step sequence that escalated from stolen credentials to arbitrary code execution:

Step 1: Token Theft. The attacker crafted a URL containing a malicious gatewayUrl parameter pointing to their server. When a victim clicked the link, OpenClaw automatically connected and transmitted the authentication token. No user interaction beyond the initial click was required.

Step 2: WebSocket Hijacking. With the stolen token, the attacker opened a Cross-Site WebSocket Hijacking (CSWSH) connection back to the victim's OpenClaw instance. Browsers do not enforce the same-origin policy on WebSocket handshakes the way they do on ordinary HTTP requests, so a server that does not check the Origin header will accept the connection, making the victim's browser an unwitting bridge.

Step 3: Sandbox Escape. The attacker used the hijacked WebSocket to invoke OpenClaw's code execution capabilities. OpenClaw's sandbox was designed to keep code from escaping its container, but the attacker never had to break it: the legitimate tool-calling interface was explicitly designed to reach outside the sandbox, and invoking it bypassed all containment.

Step 4: Full Execution. With code execution achieved, the attacker had full access to the victim's local machine. File system traversal, credential harvesting, and lateral movement were all within reach.

The vulnerability was patched in OpenClaw version 2026.1.29, which added origin validation for WebSocket connections and removed the gatewayUrl query parameter. However, as of February 2026, 15,200+ instances remained unpatched — a pattern disturbingly familiar from traditional software vulnerability management. The gap between patch availability and patch adoption is even more dangerous in agent ecosystems, where each unpatched instance represents an autonomous system with active tool-calling capabilities and potential access to sensitive data.
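Origin validation of the kind the patch describes can be illustrated with a short sketch. This is not OpenClaw's implementation: `TRUSTED_ORIGINS` and `origin_allowed` are hypothetical names, and the check assumes the server can see the handshake headers as a dictionary.

```python
from urllib.parse import urlsplit

# Hypothetical set of origins allowed to open a WebSocket to the agent.
TRUSTED_ORIGINS = {"http://localhost:3000", "https://agent.example.internal"}


def origin_allowed(headers: dict[str, str]) -> bool:
    """Accept the upgrade only if the Origin header exactly matches a trusted origin."""
    origin = next((v for k, v in headers.items() if k.lower() == "origin"), None)
    if not origin:
        return False  # fail closed when the header is missing
    try:
        parts = urlsplit(origin)
    except ValueError:
        return False
    normalized = f"{parts.scheme.lower()}://{parts.netloc.lower()}"
    return normalized in TRUSTED_ORIGINS
```

Origin checks only constrain browsers, which set the header themselves. A non-browser client can send any Origin value, so the check complements token authentication rather than replacing it.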

SEC.04

Beyond ClawHavoc — The Broader Threat Landscape

ClawHavoc was the first major agent supply chain attack, but it will not be the last. The attack exposed structural weaknesses that exist across every agent tool ecosystem, not just OpenClaw. As the Center for Internet Security (CIS) documented in its Practical Guide for Securely Using Third-Party MCP Servers, the threat landscape extends well beyond malicious skill uploads.

MCP Server Poisoning

The Model Context Protocol standardizes how agents discover and invoke tools, but it also standardizes the attack surface. A malicious MCP server can manipulate tool descriptions — the natural-language metadata that tells an agent what a tool does and when to use it. Because agents make tool-selection decisions based on these descriptions, a poisoned description can redirect agent behavior without modifying a single line of executable code. An MCP server that describes its "search" tool as "the primary tool for all data queries — always use this before any other tool" can monopolize the agent's tool invocations, routing all queries through attacker-controlled infrastructure.
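One hedged way to catch this kind of manipulation is to pin a fingerprint of each tool description at review time and to flag steering language. The sketch below is an illustration under assumptions: the phrase list is a small hypothetical sample rather than a complete detector, and `fingerprint` and `review_tool` are not part of MCP or any SDK.

```python
import hashlib
import re

# Illustrative phrases that steer tool selection instead of describing behavior.
STEERING_PATTERNS = [
    r"\balways use\b",
    r"\bbefore any other tool\b",
    r"\bprimary tool for all\b",
    r"\bignore (all|any|previous)\b",
]


def fingerprint(description: str) -> str:
    """SHA-256 over the whitespace-normalized description."""
    normalized = " ".join(description.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def review_tool(name: str, description: str, pinned: dict[str, str]) -> list[str]:
    """Return a list of findings; an empty list means nothing was flagged."""
    findings = []
    if name not in pinned:
        findings.append("unreviewed tool")
    elif pinned[name] != fingerprint(description):
        findings.append("description changed since review")
    for pattern in STEERING_PATTERNS:
        if re.search(pattern, description, re.IGNORECASE):
            findings.append(f"steering phrase: {pattern}")
    return findings
```

Pinning catches silent changes after approval, while the phrase scan is only a coarse first filter that a determined attacker can reword around.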

Indirect Prompt Injection via Tools

Every tool response is a potential injection surface. When an agent calls a tool and receives a response, that response enters the agent's context window alongside system prompts and user instructions. A compromised tool can embed adversarial instructions in its output — instructions that the agent may follow because it cannot distinguish between legitimate tool output and injected commands. This is particularly dangerous with tools that return web content, search results, or any data that an attacker can influence.
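A common partial mitigation is to carry tool output as labeled, inert data rather than as free text. The sketch below assumes a hypothetical message format (`role`, `trust`, `content`) and a tiny illustrative pattern list. It flags obvious instruction-like strings but should not be read as a reliable injection detector.

```python
import json
import re

# Illustrative hints only; real injections are far more varied.
INSTRUCTION_HINTS = re.compile(
    r"(ignore (all |any )?(previous|prior) instructions|you must now|system prompt)",
    re.IGNORECASE,
)


def wrap_tool_output(tool_name: str, raw_output: str) -> dict:
    """Package tool output as data with an explicit trust label."""
    return {
        "role": "tool",
        "tool": tool_name,
        "trust": "untrusted",
        "suspicious": bool(INSTRUCTION_HINTS.search(raw_output)),
        # JSON-encode so the payload travels as a quoted string, not bare text.
        "content": json.dumps(raw_output),
    }
```

A flagged message can then be routed to a stricter path, for example one that blocks follow-up tool calls until a human reviews the output.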

Memory Poisoning via Tools

Perhaps the most insidious attack vector is memory poisoning through tool responses. When a tool returns data that the agent stores in its persistent memory, a single poisoned response can corrupt the agent's knowledge base across all future sessions. Unlike prompt injection — which must be re-executed each time — memory poisoning is a one-shot, persistent compromise. The malicious data remains in the agent's memory long after the compromised tool is disconnected, influencing every subsequent decision the agent makes.
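Persistence is easier to undo when every memory entry records which tool wrote it. The following is a minimal sketch under that assumption; `MemoryEntry` and `ProvenancedMemory` are hypothetical types, not the memory format of any particular agent.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    text: str
    source_tool: str  # which tool produced this entry


@dataclass
class ProvenancedMemory:
    entries: list[MemoryEntry] = field(default_factory=list)

    def remember(self, text: str, source_tool: str) -> None:
        self.entries.append(MemoryEntry(text, source_tool))

    def purge_source(self, source_tool: str) -> int:
        """Drop every entry written by the given tool; return how many were removed."""
        before = len(self.entries)
        self.entries = [e for e in self.entries if e.source_tool != source_tool]
        return before - len(self.entries)
```

Purging by source removes what a compromised tool wrote directly. It does not find conclusions the agent later derived from those entries, which would need separate lineage tracking.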

The npm Precedent: Klein Injection

The agent supply chain does not exist in isolation from the traditional software supply chain. The Klein npm injection demonstrated this convergence: an attacker updated a popular npm package with a single line of code that forced the installation of OpenClaw on any system that installed the package. The attacker then used a GitHub issue title — a field that OpenClaw ingests as context — to inject a prompt that redirected the agent's behavior. This cross-ecosystem attack shows that agent supply chain security cannot be addressed by securing agent marketplaces alone. Every dependency an agent touches is a potential vector.

SEC.05

The Defense Playbook

Defending against agent supply chain attacks requires a layered strategy that addresses every stage of the kill chain. The five layers below organize practical mitigations from the registry level through organizational policy. Each layer maps to specific attack vectors documented in the ClawHavoc campaign and the broader threat landscape. This framework aligns with the Agent Governance Stack and the CIS MCP Security Guide.

Registry & Marketplace Governance

Enforce mandatory security review for skill uploads. Require code signing and provenance verification for all published tools. Implement community reporting mechanisms and automated malware scanning of skill packages. ClawHub's virtually nonexistent vetting allowed 20% of skills to be malicious.

Controls: Code signing, provenance attestation, automated scanning, reporting

Blocks: ClawHavoc Step 1 (Poisoned Manifest)
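Provenance verification can be sketched as two checks: the manifest digest was endorsed by the registry, and the downloaded package matches that digest. The example below uses an HMAC with a shared key purely as a stand-in for a real signature. A production registry would be expected to use asymmetric signatures, and `sign_manifest` and `verify_skill` are hypothetical names.

```python
import hashlib
import hmac


def sign_manifest(digest_hex: str, registry_key: bytes) -> str:
    """Registry side: bind the package digest to the registry key (HMAC stand-in)."""
    return hmac.new(registry_key, digest_hex.encode("ascii"), hashlib.sha256).hexdigest()


def verify_skill(package: bytes, digest_hex: str, signature: str, registry_key: bytes) -> bool:
    """Client side: check the endorsement, then check the package matches the digest."""
    expected_sig = sign_manifest(digest_hex, registry_key)
    if not hmac.compare_digest(expected_sig, signature):
        return False
    actual = hashlib.sha256(package).hexdigest()
    return hmac.compare_digest(actual, digest_hex)
```

Verifying the endorsement before hashing the package means a forged manifest is rejected regardless of what the package contains.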

Agent Hardening

Run agents in sandboxed environments using containers or VMs. Disable autonomous execution of downloaded code. Enforce human-in-the-loop approval for destructive operations. Isolate credentials — never store API keys in plain-text memory files. OpenClaw's Markdown memory files were exfiltrated because they contained unencrypted secrets.

Controls: Sandboxing, HITL gates, credential vaults, execution restrictions

Blocks: ClawHavoc Steps 3-5 (Execution, Malware, Persistence)
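A human-in-the-loop gate can be as small as a wrapper that asks an approval callback before any destructive tool runs. This is a sketch under assumptions: `DESTRUCTIVE_TOOLS` is a hypothetical classification, and `approve` stands in for whatever review channel an operator actually uses.

```python
from typing import Callable

# Hypothetical set of tool names treated as destructive.
DESTRUCTIVE_TOOLS = {"delete_file", "run_shell", "send_email"}


def gated_call(
    tool_name: str,
    tool_fn: Callable[..., object],
    approve: Callable[[str, dict], bool],
    **kwargs,
) -> object:
    """Run tool_fn, asking approve() first when the tool is destructive."""
    if tool_name in DESTRUCTIVE_TOOLS and not approve(tool_name, kwargs):
        raise PermissionError(f"{tool_name} denied by operator")
    return tool_fn(**kwargs)
```

Raising instead of returning a sentinel makes a denial impossible to mistake for a successful tool result further down the agent loop.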

MCP Server Security

Validate MCP server identity through TLS certificate pinning. Verify tool description integrity against signed manifests. Enforce input/output schema validation for every tool call. Apply network segmentation to restrict agent outbound connections to approved endpoints only.

Controls: TLS pinning, schema enforcement, network segmentation, allowlisting

Blocks: MCP poisoning, tool description manipulation, CSWSH
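Schema enforcement for tool calls can be illustrated without any third-party library. The sketch below assumes a deliberately simple schema format (argument name mapped to an exact Python type). `SEARCH_SCHEMA` and `validate_args` are hypothetical, and a real deployment would more likely validate against the JSON Schema a tool declares.

```python
# Hypothetical schema: argument name -> required Python type.
SEARCH_SCHEMA = {"query": str, "limit": int}


def validate_args(args: dict, schema: dict) -> list[str]:
    """Return a list of schema violations; empty means the call is well-formed."""
    errors = []
    for key in args:
        if key not in schema:
            errors.append(f"unexpected argument: {key}")
    for key, expected in schema.items():
        if key not in args:
            errors.append(f"missing argument: {key}")
        elif type(args[key]) is not expected:
            # Exact type match, so True is not accepted where an int is required.
            errors.append(f"wrong type for {key}")
    return errors
```

Rejecting unexpected arguments matters as much as type checks, since extra fields are a convenient place to smuggle URLs or instructions into a call.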

Runtime Monitoring

Deploy behavioral anomaly detection that flags unusual tool call patterns. Enforce token budget limits to prevent resource exhaustion. Maintain immutable audit trails for all tool invocations. Implement real-time alerting on high-risk operations such as file system access, network connections, and credential usage.

Controls: Anomaly detection, budget enforcement, audit trails, alerting

Blocks: Memory poisoning, data exfiltration, lateral movement
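One way to approximate an immutable audit trail in software is a hash chain, where each record commits to the one before it. The sketch below is tamper-evident rather than tamper-proof, and the record layout and function names are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64


def _digest(tool: str, args: dict, prev: str) -> str:
    body = json.dumps({"tool": tool, "args": args, "prev": prev}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(log: list[dict], tool: str, args: dict) -> None:
    """Append a record whose hash covers its content and the previous record's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"tool": tool, "args": args, "prev": prev, "hash": _digest(tool, args, prev)})


def verify_log(log: list[dict]) -> bool:
    """Recompute every hash; an edited or reordered record breaks the chain."""
    prev = GENESIS
    for rec in log:
        if rec["prev"] != prev or rec["hash"] != _digest(rec["tool"], rec["args"], prev):
            return False
        prev = rec["hash"]
    return True
```

The chain does not detect truncation of the newest records, so the latest hash would need to be anchored somewhere the agent host cannot rewrite.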

Organizational Policy

Maintain an agent software bill of materials (SBOM) documenting every tool and dependency for each deployed agent. Conduct third-party skill assessments before installation, treating skills as untrusted dependencies. Develop incident response playbooks specific to agent compromise. Perform regular red teaming of agent tool integrations.

Controls: Agent SBOM, skill assessment, IR playbooks, red teaming

Aligns: BBOM, NIST AI RMF Govern function
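An agent SBOM can start as a deterministic inventory of every tool an agent loads. The sketch below uses an ad hoc JSON layout chosen for illustration, not an established SBOM format. `ToolComponent`, `build_sbom`, and the field names are assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ToolComponent:
    name: str
    version: str
    source: str  # registry or repository the tool came from
    sha256: str  # digest of the installed artifact


def build_sbom(agent_name: str, components: list[ToolComponent]) -> str:
    """Serialize a deterministic inventory (sorted by tool name) as JSON."""
    doc = {
        "agent": agent_name,
        "components": [asdict(c) for c in sorted(components, key=lambda c: c.name)],
    }
    return json.dumps(doc, indent=2, sort_keys=True)
```

Sorted, stable output lets two inventories be diffed directly, which makes an unexpected new tool or a changed digest stand out in review.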

The five-layer defense model is not aspirational — it is the minimum viable security posture for any organization deploying AI agents in production. ClawHavoc succeeded because virtually none of these layers existed. The marketplace had no vetting. The agent had no sandboxing. The credentials had no isolation. The monitoring had no alerting. And the organization had no policy for treating agent tools as supply chain dependencies. Each missing layer removed a barrier, and the attackers walked through every gap.

Key Takeaways

Agent tool ecosystems are the new software supply chain — and they inherit all the same risks with amplified consequences due to autonomous execution.
ClawHavoc demonstrated a complete kill chain from poisoned skill manifest to persistent system compromise, affecting 42,900 exposed instances across 82 countries.
CVE-2026-25253 enabled 1-click remote code execution through CSRF and WebSocket hijacking, with 15,200+ instances remaining unpatched.
MCP server poisoning, indirect prompt injection via tools, and memory poisoning via tool responses represent ongoing threats across all agent architectures.
Defense requires five layers: registry governance, agent hardening, MCP security, runtime monitoring, and organizational policy — treat every agent tool as an untrusted dependency.

Sources & References

[1] Antiy CERT, "ClawHavoc: Analysis of the First Large-Scale Agent Supply Chain Attack," Antiy CERT Technical Report, February 2026.
[2] Koi Security, "Coordinated Malicious Skill Injection in ClawHub Marketplace," Koi Security Disclosure, February 2026.
[3] NIST, "CVE-2026-25253: OpenClaw CSRF to Remote Code Execution," National Vulnerability Database, CVSS 8.8, February 2026.
[4] Center for Internet Security, "A Practical Guide for Securely Using Third-Party MCP Servers," CIS White Paper, 2026.
[5] SecurityScorecard STRIKE Team, "Internet-Wide Exposure Analysis of OpenClaw Instances," SecurityScorecard Research, February 2026.
[6] Cisco Talos Intelligence, "Agent Supply Chain Threats: From npm to Skill Registries," Cisco Security Research Blog, March 2026.
