Anthropic's new Claude Sonnet 4.5 continues the trend in large language model development: better coding and the ability to tackle tasks for much longer stretches.

Anthropic says Sonnet 4.5 is now its most capable model, outperforming previous versions in software development, computer operation, and task automation, and even surpassing Claude Opus 4.1, which launched in August. Sonnet 4 has been out for only about four months, a sign that release cycles are speeding up while each update brings more incremental gains.

The rapid release schedule seems fueled by the rivalry with OpenAI, as both companies compete to offer the best coding model. Anthropic appears to be targeting GPT-5 with each update. Opus 4.1, for example, launched just days before GPT-5 hit the market.

Sonnet 4.5 writes better code and works longer without interruption

In the SWE-bench Verified benchmark, which tests models on real-world programming problems, Anthropic says Claude Sonnet 4.5 delivered the best results of any model so far. The company also reports that, in internal tests, Sonnet 4.5 was able to stay focused on complex tasks for more than 30 hours straight.

OpenAI is moving in a similar direction, betting that language models optimized for reasoning will be able to handle longer, more complex tasks.

In the OSWorld benchmark, which tests how well models can operate real computer systems, Sonnet 4.5 scored 61.4 percent, up from 42.2 percent with Sonnet 4 just four months ago. A demo video from Anthropic shows the Claude extension for Chrome in action, with Sonnet 4.5 filling out forms and handling browser tasks.

Along with improvements in programming and computer skills, Sonnet 4.5 also shows gains in math, logical reasoning, and subject-specific knowledge. According to Anthropic, tests with professionals in finance, law, medicine, and STEM found that Sonnet 4.5 performed significantly better than earlier Claude models. Anthropic recommends Sonnet 4.5 for any use case.

Claude Sonnet 4.5 is available now through the Claude API. Pricing stays the same as for Sonnet 4, at $3 per million input tokens and $15 per million output tokens, keeping Claude among the most expensive options on the market.
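To give a rough sense of what those rates mean in practice, here is a minimal sketch of a cost estimate. It assumes the $3 rate applies per million input tokens and the $15 rate per million output tokens, and it ignores any discounts or surcharges that may apply. The function name and the token counts are hypothetical, chosen only for illustration.

```python
# Assumed list prices in US dollars per one million tokens.
INPUT_USD_PER_MTOK = 3.00    # assumption: $3 applies to input tokens
OUTPUT_USD_PER_MTOK = 15.00  # assumption: $15 applies to output tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the assumed flat rates.

    Hypothetical helper for illustration; it does not account for
    discounts, caching, or any other pricing adjustments.
    """
    total_micro_usd = (
        input_tokens * INPUT_USD_PER_MTOK
        + output_tokens * OUTPUT_USD_PER_MTOK
    )
    return total_micro_usd / 1_000_000


if __name__ == "__main__":
    # Example: a long coding session with 200,000 input tokens
    # and 50,000 output tokens (made-up figures).
    cost = estimate_cost_usd(200_000, 50_000)
    print(f"${cost:.2f}")  # prints $1.35
```

Because output tokens cost five times as much as input tokens under these assumptions, the 50,000 output tokens in the example account for more of the bill ($0.75) than the 200,000 input tokens ($0.60).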

Along with the model update, the Claude Code development tool is getting new features, including checkpoints to save and reset task status, a redesigned terminal interface, and a native VS Code extension for smoother integration with development environments.

Claude Agent SDK for building your own AI agents

With the Claude Agent SDK, Anthropic is making the infrastructure it uses to build its own AI agents public for the first time. The SDK is designed to help developers manage long-running tasks, set up permission systems, and coordinate multiple sub-agents. New tools for memory management and context processing are also available through the Claude API, making it easier to control long-running agent processes.

Alongside the release, Anthropic is launching "Imagine with Claude", a limited-time experiment where Sonnet 4.5 generates software in real time. The demo is available to Max subscribers for five days.

