Maximilian Schreiner

xAI releases cheaper, fast language model Grok 4 Fast

Max is the managing editor of THE DECODER, bringing his background in philosophy to explore questions of consciousness and whether machines truly think or just pretend to.

xAI has introduced Grok 4 Fast, a lighter version of its flagship model. According to the company, it performs on par with Grok 4 in most tasks but uses about 40 percent less compute. That efficiency also translates into lower costs: xAI says the price per task can drop by as much as 98 percent.

In benchmarks like GPQA Diamond (85.7 percent) and AIME 2025 (92.0 percent), Grok 4 Fast scores close to models like Grok 4 and even GPT-5. The company highlights that the model cuts down on so-called "thinking tokens," using an average of 40 percent fewer tokens to reach similar results. The gap becomes most obvious on complex problems, where other models take more intermediate steps and require more computation.
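
As a rough back-of-the-envelope check, the two figures xAI cites can be related to each other. The sketch below assumes that the cost of a task is simply tokens used times price per token. That is a simplification, not something xAI states, and it mixes an average (40 percent) with a best case (98 percent). The function name is illustrative.

```python
def implied_price_ratio(token_reduction: float, cost_reduction: float) -> float:
    """Per-token price ratio (new / old) implied by the two reductions.

    Assumption: cost = tokens * price_per_token, which is a simplification.
    """
    if not 0 <= token_reduction < 1:
        raise ValueError("token_reduction must be a fraction in [0, 1)")
    if not 0 <= cost_reduction <= 1:
        raise ValueError("cost_reduction must be a fraction in [0, 1]")
    # Remaining cost share divided by remaining token share.
    return (1 - cost_reduction) / (1 - token_reduction)


# Figures quoted in the article: 40 percent fewer tokens, up to 98 percent lower cost.
ratio = implied_price_ratio(token_reduction=0.40, cost_reduction=0.98)
print(f"implied per-token price ratio: {ratio:.3f}")  # prints 0.033
```

Under this assumption, the token savings alone cannot account for a 98 percent drop; most of the reduction would have to come from a per-token price roughly 30 times lower.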

Earlier versions relied on separate models for simple answers and reasoning-heavy tasks. Grok 4 Fast combines both approaches into one architecture, with its behavior controlled through the system prompt. This fits the broader trend toward hybrid models.

The system has also been trained to use external tools on its own, including web browsing and code execution. On benchmarks like BrowseComp (44.9 percent) and X Bench Deepsearch (74 percent), it outperforms Grok 4. In the LMArena-Search benchmark, it even tops OpenAI's o3-websearch, which previously held the lead. In Text Arena, Grok 4 Fast currently ranks 8th, ahead of other models in a similar size range.

A single model for different tasks

Grok 4 Fast is available through grok.com, iOS and Android apps, and the xAI API. It comes in two versions: one optimized for reasoning-heavy work and another for quick answers. Both support a 2-million-token context window. Pricing ranges from $0.05 to $1.00 per million tokens, depending on token type. For now, Grok 4 Fast is also free to use via OpenRouter and Vercel.
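
To give a sense of scale for the quoted price range, the following sketch computes loose lower and upper bounds for a request. It assumes every token is billed at either the cheapest or the most expensive quoted rate. The article does not say which token type maps to which price, so the constants and function name here are illustrative, and real bills would fall somewhere in between.

```python
PRICE_MIN_USD_PER_M = 0.05  # lowest price per million tokens quoted in the article
PRICE_MAX_USD_PER_M = 1.00  # highest price per million tokens quoted in the article


def cost_bounds_usd(total_tokens: int) -> tuple[float, float]:
    """Return (low, high) cost bounds in USD for a given token count.

    Assumption: all tokens are billed at the cheapest or the most
    expensive rate; the actual mix of token types is not modeled.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    millions = total_tokens / 1_000_000
    return millions * PRICE_MIN_USD_PER_M, millions * PRICE_MAX_USD_PER_M


# A request that fills the full 2-million-token context window.
low, high = cost_bounds_usd(2_000_000)
print(f"${low:.2f} to ${high:.2f}")  # prints $0.10 to $2.00
```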

Summary
  • xAI has launched Grok 4 Fast, a streamlined model that matches Grok 4’s performance in most tasks while using around 40 percent less compute, which can reduce task costs by up to 98 percent according to the company.
  • In benchmarks such as GPQA Diamond and AIME 2025, Grok 4 Fast achieved scores close to leading models like Grok 4 and GPT-5, while using fewer "thinking tokens," especially on complex problems that typically require more computation.
  • The model combines simple and reasoning-heavy task handling into one architecture, supports tool use like web browsing and code execution, and is available via grok.com, mobile apps, and the xAI API, with pricing from $0.05 to $1.00 per million tokens and free access through OpenRouter and Vercel.
Sources
xAI