An independent government safety body has published what may be the most specific public evidence to date that a frontier AI model can execute autonomous, multi-step enterprise cyberattacks.
The UK AI Safety Institute, the government-funded independent evaluator that has assessed frontier models since its 2023 launch, reportedly published its evaluation of Anthropic’s Claude Mythos Preview this week. According to press coverage of the report, Mythos completed an average of 22 of 32 steps in a corporate network takeover simulation, and in 3 of 10 attempts it reportedly achieved full network takeover. These specific figures come from press coverage of the AISI report rather than the report itself; the primary source URL is pending resolution, and the numbers should be confirmed against the official evaluation document when it becomes available.
What isn’t in doubt: the UK AISI is an independent government evaluator. Its findings aren’t a vendor benchmark or a press release. They represent Tier 2 evaluation in the benchmark hierarchy (independent reproduction), sitting above self-reported vendor claims. When a government safety institute publishes attack capability findings, the framing of a story changes from “what a company says its model can do” to “what a government body says it confirmed.”
The UK government’s response reportedly escalated quickly. Reports indicate the Business Secretary issued an emergency open letter urging corporate leaders to review and harden their cyber defenses in light of the findings. That public advisory, if confirmed, marks a different posture from the US approach to the same model: earlier reporting established that Anthropic briefed the Trump administration on Mythos privately, and that briefing was not followed by a public advisory.
AISI reportedly characterized frontier model cyber capabilities as doubling every four months. That specific figure carries significant policy weight, but it hasn’t been independently verified: it’s drawn from press coverage of the evaluation and is pending Epoch AI compute tracking data. Treat it as a reported claim, not an established trajectory.
What to watch: Whether the AISI report surfaces publicly with an official government URL, which would let security teams and compliance professionals read the methodology rather than work from press summaries. The transatlantic divergence in how the US and UK governments handle sensitive capability findings is itself a developing story with direct implications for how multinational enterprises manage AI security governance.
The TJS read: Security teams shouldn’t wait for the AISI report URL to plan. The core finding, that a frontier model with restricted access has demonstrated autonomous multi-step attack capability under independent evaluation, is consistent with the direction Mythos-class capabilities have been heading based on prior public reporting. Whether the figures are exactly 22 of 32 or somewhat different, the policy and security planning implications are the same: autonomous cyber AI is past the proof-of-concept stage, and access governance frameworks that assumed these capabilities were theoretical need to be revisited.