
METR says it can barely measure Claude Mythos, Palo Alto Networks warns of autonomous AI attackers


The evaluation organization METR is hitting the limits of its test methodology when measuring Claude Mythos' capabilities. Meanwhile, Palo Alto Networks warns that frontier models like Mythos are fundamentally reshaping the cybersecurity landscape.

METR's testing framework can't keep up with Mythos

METR, which specializes in AI risk assessment, evaluated an early version of Claude Mythos Preview during a limited time window in March 2026. The organization estimates a 50 percent time horizon of at least 16 hours, with a 95 percent confidence interval of 8.5 to 55 hours.

That metric is the task length, measured by how long a human would need, at which the model has a 50 percent chance of completing the task. METR uses a range of reference points for task length, such as training a classifier (around 45 minutes) or training an adversarially robust image model (around four hours).
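To make the metric concrete, here is a minimal sketch of how a 50 percent time horizon can be read off a fitted curve. It assumes a logistic model of success probability against the base-2 logarithm of human task time. The `results` data, the function names, and the fitting settings are all invented for illustration and are not METR's data or code.

```python
import math

# Hypothetical results: (human completion time in minutes, 1 = model succeeded).
# These numbers are invented for illustration and are not METR data.
results = [
    (2, 1), (5, 1), (15, 1), (30, 1), (60, 1), (60, 0), (120, 1),
    (240, 1), (240, 0), (480, 0), (960, 1), (960, 0), (1920, 0), (3840, 0),
]

def fit_logistic(xs, ys, lr=0.1, steps=20000):
    """Fit P(success) = sigmoid(a + b * x) by gradient ascent on the log-likelihood."""
    mean_x = sum(xs) / len(xs)
    centered = [x - mean_x for x in xs]  # centering keeps the updates well-conditioned
    a = b = 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(centered, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += y - p
            grad_b += (y - p) * x
        a += lr * grad_a / len(xs)
        b += lr * grad_b / len(xs)
    return a - b * mean_x, b  # convert the intercept back to uncentered x

def time_horizon_minutes(a, b, p=0.5):
    """Task length (in minutes) at which the fitted success probability equals p."""
    logit = math.log(p / (1.0 - p))
    return 2.0 ** ((logit - a) / b)

xs = [math.log2(minutes) for minutes, _ in results]
ys = [success for _, success in results]
a, b = fit_logistic(xs, ys)
horizon_50 = time_horizon_minutes(a, b)
print(f"50 percent time horizon: about {horizon_50 / 60:.1f} hours")
```

The horizon is simply the point where the fitted curve crosses 50 percent, so it depends heavily on how many tasks sit near that crossing point.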

According to METR, this value for Mythos is "at the upper end of what we can measure without new tasks." Of the 228 tasks in the test suite, only five are classified as 16 hours or longer. That makes measurements in this range "unstable and less meaningful than at ranges with better task coverage." METR therefore doesn't provide precise estimates for models above this threshold.
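A rough back-of-the-envelope illustration of why five tasks give unstable readings: the sketch below assumes independent tasks and a simple binomial model with a true success rate of 50 percent. Both are simplifying assumptions made here, not METR's methodology. Only the task counts (228 and five) come from the article.

```python
import math

def binomial_standard_error(p, n):
    """Standard error of an observed success rate over n independent tasks."""
    return math.sqrt(p * (1.0 - p) / n)

total_tasks = 228  # size of the test suite, as reported
long_tasks = 5     # tasks classified as 16 hours or longer, as reported

share_long = long_tasks / total_tasks

# Assumed true success rate of 50 percent (illustrative only).
se_long = binomial_standard_error(0.5, long_tasks)
se_rest = binomial_standard_error(0.5, total_tasks - long_tasks)

print(f"Share of tasks at 16+ hours: {share_long:.1%}")
print(f"Standard error with {long_tasks} tasks: {se_long:.3f}")
print(f"Standard error with {total_tasks - long_tasks} tasks: {se_rest:.3f}")
```

Under these assumptions, a success rate estimated from five tasks is several times noisier than one estimated from the rest of the suite.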

The time horizons of AI models are increasing exponentially. Mythos Preview is the first model to land in the unreliable measurement range above 16 hours. | Image: METR (CC-BY)

The organization notes that its existing test suite "could still distinguish a much more capable model from current publicly-known state-of-the-art models." But the measurements in this range aren't robust enough for precise quantitative comparisons or extrapolations.

METR is working on updated methods with longer tasks, though these are still in development. The real security risk may be that evaluation methods are advancing more slowly than the models themselves.

Palo Alto Networks calls latest frontier LLMs "a step-change in capability"

Cybersecurity company Palo Alto Networks assessed the risks of frontier models like Claude Mythos from a security perspective. The company says it recently had "early, unbounded access to the latest frontier AI models," including Mythos, OpenAI's GPT-5.5-Cyber, and Claude Opus 4.7.

Palo Alto Networks describes what it observed as "a step-change in capability." The models showed an "intuitive understanding of software vulnerabilities," shifting AI's role from assistant to autonomous agent "capable of discovering and chaining flaws at a scale that most defenders aren’t prepared for."

According to the company's blog post, three weeks of model-based analysis matched an entire year of manual penetration testing, with broader coverage. In some cases, the models combined several individually low-rated vulnerabilities into critical attack paths. The time from initial access to data exfiltration can shrink to as little as 25 minutes in AI-supported scenarios.

Frontier models are crossing the threshold to autonomous operators

Palo Alto Networks puts the coding efficiency improvement of current frontier models over their predecessors at around 50 percent. "That number sounds incremental, but in practice, it's the threshold at which AI crosses from a helpful assistant into an autonomous operator," the company writes.

The company sees an additional risk in the rapidly growing, unmonitored attack surface, since "every desktop is effectively a server" as local AI agents become more common. At the same time, most organizations have no visibility into the code their own employees are generating and deploying.

After the Mythos launch, the company initially predicted a six-month window before attackers would gain access to comparable capabilities. That assessment, Palo Alto Networks says, has "accelerated significantly."

Independent research confirms a higher threat level, but the scope remains unclear

Anthropic's Claude Mythos triggered cybersecurity hype in part because the company described the model as "too dangerous" to release, a PR tactic OpenAI already used with GPT-2.

Previous studies agree that the cybersecurity threat posed by more capable AI models has increased. But the actual scope of that threat is still unclear.

The British AI Security Institute (AISI) found that Claude Mythos Preview could carry out end-to-end network attacks but assumes this will initially affect only weak, unprotected networks. OpenAI's GPT-5.5, which has already shipped, reportedly solves similar multi-stage corporate attack simulations as well, scoring even slightly above Mythos. Smaller AI models are also said to have comparable capabilities.

The models can help with defense, too. Mozilla used Anthropic's Claude Mythos Preview to uncover security vulnerabilities in the Firefox browser. In April 2026 alone, Mozilla fixed a total of 423 security issues, a record, according to the company.
