
xAI has introduced Grok 4 Fast, a lighter version of its flagship model. According to the company, it performs on par with Grok 4 on most tasks but uses about 40 percent less compute. That efficiency also translates into lower costs: xAI says the price per task can drop by as much as 98 percent.


In benchmarks like GPQA Diamond (85.7 percent) and AIME 2025 (92.0 percent), Grok 4 Fast scores close to models like Grok 4 and even GPT-5. The company highlights that the model cuts down on so-called "thinking tokens," using an average of 40 percent fewer tokens to reach similar results. The gap becomes most obvious on complex problems, where other models take more intermediate steps and require more computation.
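
As a rough illustration of how the two headline figures relate, the sketch below assumes a deliberately simple cost model in which the cost of a task equals tokens used times a flat price per token. Under that assumption (which is ours, not xAI's), a 40 percent token reduction alone would cut cost by only 40 percent, so a best-case 98 percent reduction would also require a much lower per-token price. The function name `implied_price_ratio` is hypothetical.

```python
def implied_price_ratio(token_reduction: float, cost_reduction: float) -> float:
    """Return the new/old per-token price ratio implied by two reductions.

    Assumption (not from xAI): cost per task = tokens used * price per token.
    Both arguments are fractions, e.g. 0.40 for "40 percent fewer".
    """
    if not 0 <= token_reduction < 1 or not 0 <= cost_reduction < 1:
        raise ValueError("reductions must be fractions in [0, 1)")
    remaining_tokens = 1.0 - token_reduction  # share of tokens still used
    remaining_cost = 1.0 - cost_reduction     # share of cost still paid
    return remaining_cost / remaining_tokens


# Figures reported in the article: 40 percent fewer tokens, up to 98 percent lower cost.
ratio = implied_price_ratio(0.40, 0.98)
print(f"Implied per-token price ratio: {ratio:.4f}")
```

In this simplified model, the per-token price would have to fall to roughly one thirtieth of the reference price to reach the 98 percent figure, which xAI gives as an upper bound rather than a typical value.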

Earlier versions relied on separate models for simple answers and reasoning-heavy tasks. Grok 4 Fast combines both approaches into one architecture, with its behavior controlled through the system prompt. This fits the broader trend toward hybrid models.

The system has also been trained to use external tools on its own, including web browsing and code execution. On benchmarks like BrowseComp (44.9 percent) and X Bench Deepsearch (74 percent), it outperforms Grok 4. In the LMArena-Search benchmark, it even tops OpenAI's o3-websearch, which previously held the lead. In Text Arena, Grok 4 Fast currently ranks 8th, ahead of other models in a similar size range.


A single model for different tasks

Grok 4 Fast is available through grok.com, iOS and Android apps, and the xAI API. It comes in two versions: one optimized for reasoning-heavy work and another for quick answers. Both support a 2-million-token context window. Pricing ranges from $0.05 to $1.00 per million tokens, depending on token type. For now, Grok 4 Fast is also free to use via OpenRouter and Vercel.
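
To put the price range in perspective, the following minimal sketch computes lower and upper cost bounds for a given token count, using only the two endpoints quoted above. The article does not say which token type maps to which rate, so the sketch treats $0.05 and $1.00 purely as bounds; the function name `cost_bounds` is hypothetical.

```python
from __future__ import annotations

# Endpoints of the price range quoted in the article (USD per million tokens).
# Which token type gets which rate is not stated there, so these are bounds only.
MIN_USD_PER_MILLION = 0.05
MAX_USD_PER_MILLION = 1.00


def cost_bounds(total_tokens: int) -> tuple[float, float]:
    """Return (lowest, highest) possible cost in USD for total_tokens."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    millions = total_tokens / 1_000_000
    return millions * MIN_USD_PER_MILLION, millions * MAX_USD_PER_MILLION


# Example: a request that fills the full 2-million-token context window.
low, high = cost_bounds(2_000_000)
print(f"${low:.2f} to ${high:.2f}")
```

For a full 2-million-token context, the bounds work out to between $0.10 and $2.00; a real bill would fall somewhere in that range depending on the mix of token types.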

Summary
  • xAI has launched Grok 4 Fast, a streamlined model that matches Grok 4’s performance in most tasks while using around 40 percent less compute, which can reduce task costs by up to 98 percent according to the company.
  • In benchmarks such as GPQA Diamond and AIME 2025, Grok 4 Fast achieved scores close to leading models like Grok 4 and GPT-5, while using fewer "thinking tokens," especially on complex problems that typically require more computation.
  • The model combines simple and reasoning-heavy task handling into one architecture, supports tool use like web browsing and code execution, and is available via grok.com, mobile apps, and the xAI API, with pricing from $0.05 to $1.00 per million tokens and free access through OpenRouter and Vercel.
Max is the managing editor of THE DECODER, bringing his background in philosophy to explore questions of consciousness and whether machines truly think or just pretend to.