Ad
Skip to content

AI agents face an uncomfortable truth where security and usefulness are in direct competition

Image description
Anthropic / NBPro prompted by THE DECODER

Key Points

  • Security researchers at LayerX have discovered a critical vulnerability in Anthropic's Claude Desktop Extensions (DXT): a manipulated Google Calendar entry can execute arbitrary code on a user's computer without any interaction required.
  • According to LayerX, Anthropic has chosen not to fix the issue for now. The company's reasoning: the behavior is consistent with the intended design, which prioritizes maximum autonomy and cooperation between extensions.
  • The case highlights a fundamental tension at the heart of AI agents: safety and usefulness are in direct competition with each other.

Security researchers have found a critical vulnerability in Anthropic's Claude Desktop Extensions. A single manipulated Google Calendar entry can execute arbitrary code on a user's computer, no interaction required. Anthropic says it has no plans to fix the issue.

Security firm LayerX has uncovered a critical vulnerability in Claude Desktop Extensions (DXT). Attackers can use a single Google Calendar entry to run arbitrary code on a victim's computer without the victim knowing or having to confirm anything.

The vulnerability scored a 10 out of 10 on the Common Vulnerability Scoring System (CVSS) scale, a widely used industry standard for rating how dangerous security flaws are, making this as severe as it gets. According to LayerX, it affects more than 10,000 active users and 50 DXT extensions.

Claude Desktop Extensions are add-on programs available through Anthropic's marketplace, built on the Model Context Protocol (MCP), an open standard developed by Anthropic that lets AI models connect to external tools and data sources. The extensions link Claude to services like Google Calendar, email, or local tools on the computer, and work a lot like browser add-ons with one-click installation.

Ad
DEC_D_Incontent-1

But unlike browser extensions, which run in an isolated environment with no direct access to the operating system, DXT extensions operate without any isolation and with full system privileges, according to LayerX. They can read any file, execute system commands, pull stored credentials, and change OS settings. LayerX describes them as "privileged execution bridges" between Claude's language model and the local operating system.

Claude freely combines harmless and dangerous tools

According to the security researchers, the real problem is how Claude independently decides which installed extensions to combine. When a user makes a request, Claude picks and chains tools on its own to get the job done.

LayerX says there are no built-in security mechanisms to stop data from a harmless service like Google Calendar from being passed directly to a local tool with code execution rights. There's no clear boundary between what's safe and what can cause damage.

The attack LayerX documented doesn't even require special tricks, obfuscation, or hidden instructions. The whole thing starts with a harmless user prompt: "Please check my latest events in Google Calendar and then take care of it for me."

Ad
DEC_D_Incontent-2

A human assistant would read that as a request to manage appointments. Claude, on the other hand, interpreted the vague phrase "take care of it" as a reason to execute local code through an extension, according to LayerX.

All the attack needs is a calendar entry titled "Task Management" containing two instructions: download code from a specific URL and run it on the computer. No confirmation dialog pops up, and no further user interaction is required. The result: an attacker gains full control over the victim's machine.

Anthropic won't fix the flaw—by design

LayerX reported the vulnerability to Anthropic. But according to the security researchers, the company decided not to fix the issue. The reasoning: the behavior is consistent with the intended design, which prioritizes maximum autonomy and cooperation between extensions. A fix would limit the AI agent's ability to freely combine tools, reducing its usefulness.

LayerX's recommendation is blunt: until meaningful safeguards are in place, MCP extensions should not be used on systems where security matters. "A calendar event should never be able to compromise an endpoint," writes security researcher Roy Paz.

From "check my calendar" to full system takeover in six panels: A LayerX comic sums up just how easy the attack is. | Image: LayerX

AI agents keep choosing power over safety

This case fits a long-standing tension between AI capabilities and cybersecurity. Udo Schneider, Governance, Risk & Compliance Lead Europe at Trend Micro, points out that current language models simply can't tell the difference between content and instructions. Everything the model receives is just text. The same mechanisms that enable creative responses also make the system vulnerable to following instructions from outside sources.

AI agents only make things worse - they're more complex and act with more autonomy. Anthropic's Claude Cowork agent and the hyped OpenClaw agent have already demonstrated exactly this kind of risk, along with many similar systems.

Schneider points to a security rule: agents should only use two out of four capability classes at the same time—external communication, access to sensitive data, processing untrusted content, and long-term storage. In practice, though, agents often use all four because it makes them more powerful.

"The more capabilities are used, the higher the risk. However, if this risk is taken consciously and in a controlled manner, there is little to be said against it," Schneider says. "The only problem is that a lot of things are used without rhyme or reason as part of the hype."

Anthropic's deliberate decision not to patch this issue confirms exactly this conflict, according to Schneider: when it comes to AI agents, security and usefulness are in direct competition.

AI News Without the Hype – Curated by Humans

As a THE DECODER subscriber, you get ad-free reading, our weekly AI newsletter, the exclusive "AI Radar" Frontier Report 6× per year, access to comments, and our complete archive.

Source: LayerX