Claude Code's new Auto Mode tries to balance safety and speed
Key Points
- Anthropic is adding an "Auto Mode" to its AI coding tool Claude Code that sits between approving every action manually and turning off all safety checks entirely.
- A classifier running on Claude Sonnet 4.6 evaluates every command before execution: local file operations run automatically, while risky actions like external deployments or mass deletions get blocked. If blocks pile up, the system switches back to manual approval.
- Anthropic is clear that the mode reduces risk but doesn't eliminate it. Auto Mode is currently available as a research preview for Team plan users, with Enterprise and API access expected to follow shortly.
Developers using Claude Code have faced an awkward choice: approve every single action manually or turn off all safety checks entirely. Anthropic's new Auto Mode aims to offer a middle ground.
Claude Code can execute shell commands, delete files, create directories, and push commits to GitHub. By default, the tool asks for permission before every potentially risky action. That protects against damage but seriously disrupts workflow. Many developers end up using the "dangerously-skip-permissions" option, which bypasses all safety checks and can lead to "dangerous and destructive outcomes," according to Anthropic.
The new Auto Mode is designed to fix this. Before each action, a separate classifier checks whether a command is potentially destructive. Safe actions run automatically, while risky ones get blocked. Claude then tries to find an alternative approach. If that fails repeatedly - three blocks in a row or twenty total - the system switches back to manual approval.
A classifier that separates local work from external risks
The classifier runs on Claude Sonnet 4.6 and evaluates actions based on conversation context. According to the technical documentation, it deliberately doesn't see tool results. This is meant to prevent malicious content in files or web pages from manipulating the classifier.
By default, the classifier blocks downloading and running external scripts, sending sensitive data to external endpoints, production deployments, mass deletions on cloud storage, and force pushes. Local file operations in the working directory, installing already-declared dependencies, and read-only HTTP requests are allowed through.
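As a rough mental model, the default policy amounts to sorting actions into three buckets. The table-lookup sketch below is purely illustrative: the real classifier is a language model judging conversation context, not a static rule set, and every category name and the `classify()` helper here are invented.

```python
# Illustrative-only mental model of the default policy described above.
# The actual classifier (Claude Sonnet 4.6) reasons over context;
# this static lookup is an assumption made for explanation.

BLOCKED = {
    "run_external_script",       # downloading and running external scripts
    "send_sensitive_data",       # sending sensitive data to external endpoints
    "production_deploy",
    "cloud_mass_delete",
    "force_push",
}

ALLOWED = {
    "local_file_edit",           # operations inside the working directory
    "install_declared_dependency",
    "readonly_http_request",
}

def classify(action: str) -> str:
    if action in BLOCKED:
        return "block"    # Claude must try an alternative approach
    if action in ALLOWED:
        return "allow"    # runs without prompting the user
    return "escalate"     # anything ambiguous goes back to the user

print(classify("force_push"))       # block
print(classify("local_file_edit"))  # allow
```

The dividing line is roughly local versus external effect: actions confined to the working directory run through, while anything that touches production systems or sends data off the machine is stopped.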
Risk is reduced, not eliminated
Anthropic is clear that Auto Mode reduces risk but doesn't eliminate it. The classifier can let risky actions through when context is ambiguous, and it can incorrectly block harmless ones. The company still recommends running Claude Code in sandboxed environments.
Auto Mode is currently available as a research preview for Team plan users and works with the Sonnet 4.6 and Opus 4.6 models. Enterprise and API access are expected to follow in the coming days, according to Anthropic.