
Cursor takes on OpenAI and Anthropic with Composer 2, a code-only model built to match rivals at a fraction of the cost


Key Points

  • With Composer 2, Cursor has released its own AI model for software development at $0.50 per million input tokens and $2.50 per million output tokens, significantly cheaper than Claude Opus 4.6 ($5.00/$25.00) and GPT-5.4 ($2.50/$15.00).
  • The code-specialized model scores 61.3 on Cursor's internal CursorBench, a major improvement over its predecessor Composer 1.5 (44.2) and competitive with Claude Opus 4.6 (58.2) and GPT-5.4 Thinking (63.9).
  • Building its own model is a strategic necessity for Cursor: the company competes directly with Anthropic and OpenAI while simultaneously depending on their models, leaving it with limited pricing flexibility as both providers offer low-cost flat-rate plans.

Cursor releases Composer 2, the second generation of its own AI model for software development. The model aims to match the leading coding models from Anthropic and OpenAI at a fraction of the cost.

The model is now available in Cursor and in the early alpha of the new "Glass" interface. Pricing starts at 0.50 dollars per million input tokens and 2.50 dollars per million output tokens. A faster variant that Cursor says delivers identical intelligence costs 1.50 and 7.50 dollars per million tokens respectively, and ships as the default.

| Model | Price per 1 million tokens (input / output) | Note |
|---|---|---|
| Composer 2 | $0.50 / $2.50 | Standard version |
| Composer 2 Fast | $1.50 / $7.50 | Faster variant with the same intelligence, according to Cursor |
| Claude Opus 4.6 | $5.00 / $25.00 | API price according to Anthropic, valid for any context length |
| GPT-5.4 | $2.50 / $15.00 (short context); $5.00 / $22.50 (long context) | OpenAI price depends on context length |

On pure API pricing, Cursor Composer 2 comes in well below both Claude Opus 4.6 and GPT-5.4. Even the faster variant still undercuts both competitors by a wide margin on token costs.
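To make the price gap concrete, here is a minimal back-of-envelope cost comparison using the per-million-token prices quoted above. The example workload (300M input / 40M output tokens per month) is a hypothetical assumption for illustration, not a figure from Cursor.

```python
# Token cost comparison based on the per-million-token prices in the article.
# The workload size is an assumed example, not real usage data.

PRICES = {  # model: (input $/M tokens, output $/M tokens)
    "Composer 2":      (0.50, 2.50),
    "Composer 2 Fast": (1.50, 7.50),
    "Claude Opus 4.6": (5.00, 25.00),
    "GPT-5.4 (short)": (2.50, 15.00),
}

def monthly_cost(input_m: float, output_m: float) -> dict:
    """Dollar cost per model for input_m / output_m million tokens."""
    return {
        model: round(input_m * inp + output_m * out, 2)
        for model, (inp, out) in PRICES.items()
    }

if __name__ == "__main__":
    # Assumed workload: 300M input tokens, 40M output tokens per month.
    for model, cost in monthly_cost(300, 40).items():
        print(f"{model:>16}: ${cost:,.2f}")
```

At that assumed volume, Composer 2 comes out to $250 versus $2,500 for Claude Opus 4.6, the 10x gap implied by the price table.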

Co-founder Aman Sanger told Bloomberg the model was trained exclusively on code data. That narrow focus made it possible to build a smaller, more cost-effective model. "It won't help you do your taxes. It won't be able to write poems," Sanger said.


Reinforcement learning on long coding tasks drives quality gains

According to Cursor, the quality improvements over its predecessor come down to a stronger first pass of continued pretraining, which provides a better foundation for the reinforcement learning that follows. Training runs on so-called long-horizon coding tasks, programming challenges that require hundreds of individual actions to complete.

The numbers Cursor published show a major jump, especially compared to earlier Composer versions. On CursorBench, the company's internal benchmark for coding tasks, Composer 2 climbs from 44.2 (Composer 1.5) to 61.3. The model also posts gains on Terminal Bench 2.0, a benchmark for agent-based tasks in the terminal, and on SWE-bench Multilingual, which tests software engineering tasks across multiple programming languages.

| Model | CursorBench | Terminal Bench 2.0 | Terminal Bench 2.0 (optimized) | SWE-bench Multilingual |
|---|---|---|---|---|
| Composer 2 | 61.3 | 61.7 | – | 73.7 |
| Composer 1.5 | 44.2 | 47.9 | – | 65.9 |
| Composer 1 | 38.0 | 40.0 | – | 56.9 |
| Claude Opus 4.6 | 58.2 | 58.0 | 65.4 | 77.8 |
| GPT-5.4 Thinking | 63.9 | 75.1 | – | N/A |

Terminal Bench 2.0 scores aren't directly comparable across the board, since results also depend on the agent, harness, and settings. For Claude Opus 4.6, 58.0 is the Claude Code value; 65.4 is an additional optimized value published by Anthropic. For GPT-5.4 Thinking, only a single published Terminal Bench value is available.

Building its own model is about survival, not just performance

Cursor competes directly with Anthropic and OpenAI, both of which are shipping increasingly powerful AI models for software development. According to Bloomberg, Cursor now has more than one million daily users and around 50,000 enterprise customers. The company is also in talks about a new funding round at a valuation of roughly 50 billion dollars.


At the same time, Cursor faces a structural dilemma. The platform still supports models from OpenAI and Anthropic, which means it's competing with the very providers whose technology it has relied on. As long as Cursor buys third-party models, its pricing, performance, and margins all depend on companies that sell directly to the same customers.

Anthropic, in particular, is making aggressive moves in the coding market with Claude Code. Cursor reportedly estimates that a single Claude Code subscription at 200 dollars a month can rack up around 5,000 dollars in actual compute costs. That highlights the structural problem: when you build on someone else's model, you're paying full price for compute that the model provider can heavily subsidize in its own product.
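The arithmetic behind that claim is simple but worth spelling out. The sketch below uses only the two figures reported in the article; neither number is independently verified.

```python
# Subsidy gap described in the article: a $200/month flat-rate plan
# against an estimated $5,000/month in actual compute for a heavy user.
# Both figures are Cursor's reported estimates, per Bloomberg.

subscription = 200.0   # monthly Claude Code plan price (per the article)
compute_cost = 5000.0  # estimated compute cost for a heavy user

subsidy = compute_cost - subscription   # amount the provider absorbs
multiple = compute_cost / subscription  # compute as a multiple of price

print(f"Subsidy per heavy user: ${subsidy:,.0f}/month ({multiple:.0f}x the plan price)")
```

On those estimates, the provider absorbs $4,800 per heavy user per month, a level of subsidy a third-party reseller of the same model cannot match.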

That doesn't leave Cursor much room. According to the report, consumer subscriptions are already running at negative margins, with enterprise contracts carrying the business. And the longer-term risk may be even bigger: as AI coding agents get more capable, users might skip the IDE entirely and work with these systems directly through the model provider.


Source: Cursor | Bloomberg