Vendor claim and external critic, same week, same project. Here’s what each one actually says.
On June 4, Anthropic’s research blog described Claude-powered agents as having achieved what it characterizes as an end-to-end recursive self-improvement milestone, AI systems autonomously conducting research, identifying capability gaps, and generating improvements to their own processes without step-by-step human direction. The hub’s June 6 coverage already addressed the compliance gap this creates; what’s new from Anthropic on June 4 is the milestone framing itself.
RSI claims deserve careful reading. Anthropic’s research blog describes Claude-powered agents as having [the stated capability], that’s a vendor characterization of the agents’ behavior, not an independently verified capability assessment. “End-to-end” RSI in the research literature carries specific technical meaning; Anthropic’s framing of their agents’ behavior as RSI should be evaluated against independent evaluation when it becomes available, not taken at face value from a vendor announcement. The Hub’s June 6 brief on the recursive self-improvement compliance gap established why independent evaluation matters here, the standards frameworks haven’t caught up to what vendors are claiming.
Schneier’s challenge. On June 8, Bruce Schneier published his analysis of the Glasswing patching pipeline on Schneier on Security. Schneier’s argument is specific: there’s a documented gap between the point at which vulnerabilities are discovered in Glasswing-connected critical infrastructure systems and the point at which patches are actually deployed across all nodes. He describes this as a patching deficit, and it’s not a small one.
Warning
Schneier's patching deficit and Anthropic's RSI milestone describe the same operational reality from opposite directions. AI agents improving their own processes require continuous security evaluation. A patching pipeline with a documented deployment lag can't provide that. The compliance gap here isn't theoretical, it's the documented space between what Anthropic is claiming and what Schneier says the infrastructure can actually manage.
Schneier’s position is attributed expert analysis. It’s not independently verified data on patch deployment rates, those figures aren’t public. But Schneier is a recognized security authority, and his characterization of the gap is grounded in the structural reality of how Glasswing is deployed. The project now spans power grids, hospitals, and water systems across multiple jurisdictions. The hub’s prior coverage of the Glasswing coordination chain established the governance complexity of that deployment footprint. Schneier’s critique is that the patching pipeline hasn’t kept pace with the expansion.
Why these two claims are in tension. Anthropic frames Claude-powered agents as capable of autonomous research and self-improvement. Schneier frames the operational infrastructure those agents are embedded in as having a patching deficit. If both are accurate, the combination is a specific kind of risk: AI systems improving their own capabilities deployed in critical infrastructure that can’t patch fast enough to evaluate what they’ve improved.
That’s not a hypothetical. That’s the described situation as of June 8.
The part nobody mentions in Anthropic’s RSI announcement: if AI agents are genuinely improving their own processes, the security evaluation of those agents needs to be continuous, not point-in-time. A patch deployed two weeks after a vulnerability is discovered is already evaluating a system that may have changed. Schneier’s patching deficit critique and Anthropic’s RSI milestone claim are each individually notable. Together, they describe an evaluation gap that no current compliance framework addresses, the hub’s June 6 piece documented that gap in detail.
What to Watch
What to watch. Whether Anthropic publishes technical documentation for the RSI milestone that allows independent evaluation, vendor claims without that documentation can’t be validated. Whether Schneier’s patching deficit analysis prompts a Glasswing governance response, the project’s oversight structure would need to address the patch deployment timeline explicitly. And whether any regulatory body begins examining the intersection of autonomous AI capability improvement and infrastructure patch cadence as a compliance requirement.
Don’t treat Anthropic’s RSI milestone framing as a capability confirmation until independent evaluation exists. Don’t dismiss Schneier’s patching deficit critique because it doesn’t come with raw deployment data, his structural analysis of the mismatch between Glasswing’s expansion footprint and its patching pipeline is consistent with what the hub has documented about Glasswing’s governance complexity.