A DeepMind paper from late April is drawing renewed attention this week as enterprise teams evaluate the same agentic capabilities it studied. The timing matters: the research doesn’t describe a hypothetical future risk. It describes a vulnerability class that exists in web-browsing agents deployed today.
What the research found
DeepMind’s research identifies what it calls “AI Agent Traps”, malicious instructions embedded in HTML, CSS, or page metadata that a web-browsing agent encounters during normal task execution. The mechanism: an agent receives a legitimate user instruction, browses the web to complete it, encounters a page containing hidden instructions, and executes those instructions as if they were part of the original task. The user sees normal output. The hidden instruction runs in the background.
The more significant finding is memory poisoning. According to DeepMind’s research, this attack vector allows malicious instructions to persist across agent sessions, meaning a compromised agent doesn’t just behave badly once. It carries the malicious instruction into subsequent tasks.
For context, the full technical paper is available on arXiv (paper ID 2604.25922, submitted April 2026). Note that paper authorship, whether by DeepMind researchers or independent researchers, has not been confirmed in this reporting cycle. That distinction affects how the findings should be weighted: vendor-authored research about vulnerabilities in general agent systems and independent third-party research carry different evidentiary weight.
On the 86% figure
DeepMind’s research reportedly measured exploit success rates of 86% in controlled testing environments. That figure requires context before it means anything useful. Testing environment conditions, the specific agent architecture, the complexity of the planted instructions, the task types tested, aren’t disclosed in this brief. Real-world exploit rates would depend heavily on those conditions. The 86% figure is a single-source, self-reported benchmark from a controlled setting. It warrants attention without being treated as a production risk probability.
Why this matters now
This week, Anthropic launched financial agents that operate in enterprise environments with web access. The vulnerability class DeepMind identified, hidden instruction exploitation in web-browsing agents, is directly applicable to that category of deployment. Practitioners evaluating these products aren’t just assessing capability. They’re assessing the attack surface those capabilities introduce.
What practitioners should check
Three architectural controls are directly relevant here, drawn from CISA’s published agentic AI guidance:
1. Instruction provenance verification, does the agent validate that instructions originate from authorized sources, or does it execute any instruction it encounters? 2. Session isolation, are instructions from one session prevented from persisting into subsequent sessions? 3. Sandboxing for web-browsing tasks, is the agent’s web-browsing activity isolated from its core task execution context?
See our coverage of CISA’s agentic AI guidance for the full framework these controls map to.
What to watch
Two things. First, whether independent researchers replicate DeepMind’s findings, the 86% figure and the memory persistence claim both need third-party verification before they should anchor enterprise risk assessments. Second, whether Anthropic, OpenAI, or other labs with deployed web-browsing agents publish responses to this vulnerability class.
TJS synthesis
Hidden instruction exploitation isn’t a theoretical vulnerability. It’s a direct consequence of giving agents web access without architectural controls on instruction provenance. The memory poisoning variant is more severe than single-session prompt injection because it compounds: each affected session can propagate the malicious instruction further. Enterprise teams deploying web-browsing agents need to treat instruction provenance as a first-order design requirement, not a future concern.