Reasoning sharply raises the cost of running AI models, according to a new analysis by Artificial Analysis. Google's Gemini 2.5 Flash with reasoning enabled costs 150 times more to run on the benchmark than Gemini 2.0 Flash. Two factors drive the gap: the newer model uses 17 times more tokens, and Google charges $3.50 per million output tokens with reasoning, compared to $0.40 for the earlier model. That makes Gemini 2.5 Flash the model with the highest token use for reasoning in the comparison. OpenAI's o4-mini costs more per token but used fewer tokens overall, making it cheaper in the benchmark.
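The 150x figure follows roughly from multiplying the two factors quoted above. The short Python sketch below redoes that arithmetic. It assumes the cost is dominated by output tokens and ignores input-token pricing, which is a simplification rather than Artificial Analysis's actual method. The function name `cost_ratio` is made up for illustration.

```python
# Figures as quoted in the article
TOKEN_MULTIPLIER = 17        # tokens used by 2.5 Flash (reasoning) relative to 2.0 Flash
PRICE_NEW = 3.50             # USD per million output tokens, 2.5 Flash with reasoning
PRICE_OLD = 0.40             # USD per million output tokens, 2.0 Flash


def cost_ratio(token_multiplier: float, new_price: float, old_price: float) -> float:
    """Relative cost = (how many more tokens) x (how much more per token).

    Simplifying assumption: only output-token pricing matters.
    """
    return token_multiplier * (new_price / old_price)


ratio = cost_ratio(TOKEN_MULTIPLIER, PRICE_NEW, PRICE_OLD)
print(f"Estimated cost ratio: {ratio:.2f}x")

# Cross-check against the rounded dollar totals shown in the chart
chart_ratio = 445 / 3
print(f"Ratio from chart totals: {chart_ratio:.2f}x")
```

Both estimates land just under 150, which is consistent with the rounded "150x" label in the chart.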

Bar chart titled “Cost to Run Artificial Analysis Intelligence Index.” It shows total U.S. dollar costs to complete all tests in the Artificial Analysis Intelligence Index using different AI models. Bars are split into three colors: Input (blue), Reasoning (purple), Output (green). On the left are the most expensive models: GPT-3 ($1,951), Claude 3 Opus ($1,485), Gemini 2.5 Pro ($844). In the middle: Gemini 2.5 Flash with reasoning ($445), o4-mini (high) ($323). On the right are the cheapest models: Gemini 2.0 Flash ($3), Llama 3 8B ($2). A purple arrow above highlights the cost gap between Gemini 2.0 Flash and Gemini 2.5 Flash with reasoning, labeled “150x.” Source: Artificial Analysis.
Google's Gemini 2.5 Flash costs 150 times more to run with reasoning enabled than Gemini 2.0 Flash, due to higher token use and pricing. | Image: Artificial Analysis
Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.