xAI releases cheaper, fast language model Grok 4 Fast

Sep 20, 2025

xAI

xAI has introduced Grok 4 Fast, a lighter version of its flagship model. According to the company, it performs on par with Grok 4 in most tasks but uses about 40 percent less compute. That efficiency also translates into lower costs: xAI says the price per task can drop by as much as 98 percent.

In benchmarks like GPQA Diamond (85.7 percent) and AIME 2025 (92.0 percent), Grok 4 Fast scores close to models like Grok 4 and even GPT-5. The company highlights that the model cuts down on so-called "thinking tokens," using an average of 40 percent fewer tokens to reach similar results. The gap becomes most obvious on complex problems, where other models take more intermediate steps and require more computation.
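To see how the two headline figures relate, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that cost per task is simply tokens used times price per token; xAI has not confirmed that this is how the 98 percent figure was calculated, and the function name is hypothetical.

```python
def implied_price_ratio(cost_reduction: float, token_reduction: float) -> float:
    """Per-token price ratio implied by a drop in cost per task and in tokens.

    Assumption (not from xAI): cost per task = tokens used * price per token.
    """
    if not (0 <= cost_reduction < 1 and 0 <= token_reduction < 1):
        raise ValueError("reductions must be fractions in [0, 1)")
    cost_ratio = 1.0 - cost_reduction    # share of the original cost that remains
    token_ratio = 1.0 - token_reduction  # share of the original tokens still used
    return cost_ratio / token_ratio


# Figures quoted in the article: up to 98% lower cost, 40% fewer tokens.
ratio = implied_price_ratio(0.98, 0.40)
print(f"Implied per-token price: {ratio:.1%} of the baseline")
```

Under that assumption, fewer thinking tokens alone would only cut cost by 40 percent, so most of the claimed saving would have to come from a much lower per-token price.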

Earlier versions relied on separate models for simple answers and reasoning-heavy tasks. Grok 4 Fast combines both approaches into one architecture, with its behavior controlled through the system prompt. This fits the broader trend toward hybrid models.

The system has also been trained to use external tools on its own, including web browsing and code execution. On benchmarks like BrowseComp (44.9 percent) and X Bench Deepsearch (74 percent), it outperforms Grok 4. In the LMArena-Search benchmark, it even tops OpenAI's o3-websearch, which previously held the lead. In Text Arena, Grok 4 Fast currently ranks 8th, ahead of other models in a similar size range.

A single model for different tasks

Grok 4 Fast is available through grok.com, iOS and Android apps, and the xAI API. It comes in two versions: one optimized for reasoning-heavy work and another for quick answers. Both support a 2-million-token context window. Pricing ranges from $0.05 to $1.00 per million tokens, depending on token type. For now, Grok 4 Fast is also free to use via OpenRouter and Vercel.
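For a sense of scale, the quoted price range can be turned into a lower and upper bound for a given token count. This is a minimal sketch that uses only the $0.05 and $1.00 endpoints mentioned above; it does not model which token types are billed at which rate, and the constant and function names are made up for the example.

```python
# Endpoints of the price range quoted in the article, in USD per million tokens.
MIN_PRICE_PER_MILLION = 0.05
MAX_PRICE_PER_MILLION = 1.00


def cost_bounds(tokens: int) -> tuple[float, float]:
    """Return (lowest, highest) cost in USD if every token were billed
    at the cheapest or the most expensive quoted rate."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    millions = tokens / 1_000_000
    return millions * MIN_PRICE_PER_MILLION, millions * MAX_PRICE_PER_MILLION


# Example: filling the full 2-million-token context window once.
low, high = cost_bounds(2_000_000)
print(f"Between ${low:.2f} and ${high:.2f}")
```

A real bill would fall somewhere between the two bounds, depending on the mix of token types in the request.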
