Short

Reasoning tasks sharply raise AI costs, according to a new analysis by Artificial Analysis. With reasoning enabled, Google's Gemini 2.5 Flash costs 150 times more to run than Gemini 2.0 Flash: it uses 17 times more tokens and charges $3.50 per million output tokens with reasoning, compared to $0.40 for the earlier model. This makes Gemini 2.5 Flash the model with the highest token use for reasoning. OpenAI's o4-mini costs more per token but used fewer tokens overall, making it cheaper in the benchmark.
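
The 150x figure follows roughly from the two numbers quoted above. The short sketch below multiplies the token ratio by the price ratio. It assumes the bill is dominated by output and reasoning tokens and ignores input costs, so it is only an approximation; the function name is illustrative.

```python
def cost_multiplier(token_ratio: float, new_price: float, old_price: float) -> float:
    """Approximate cost ratio: (tokens used ratio) x (price per token ratio)."""
    return token_ratio * (new_price / old_price)


# Figures quoted in the analysis.
TOKEN_RATIO = 17          # newer model used about 17x the tokens
PRICE_REASONING = 3.50    # USD per million output tokens, with reasoning
PRICE_BASELINE = 0.40     # USD per million output tokens, earlier model

multiplier = cost_multiplier(TOKEN_RATIO, PRICE_REASONING, PRICE_BASELINE)
print(f"approximate cost multiplier: {multiplier:.0f}x")
```

The result lands just under 150, which matches the rounded figure in the chart.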

Bar chart titled “Cost to Run Artificial Analysis Intelligence Index.” It shows total U.S. dollar costs to complete all tests in the Artificial Analysis Intelligence Index using different AI models. Bars are split into three colors: Input (blue), Reasoning (purple), Output (green). On the left are the most expensive models: GPT-3 ($1,951), Claude 3 Opus ($1,485), Gemini 2.5 Pro ($844). In the middle: Gemini 2.5 Flash with reasoning ($445), o4-mini (high) ($323). On the right are the cheapest models: Gemini 2.0 Flash ($3), Llama 3 8B ($2). A purple arrow above highlights the cost gap between Gemini 2.0 Flash and Gemini 2.5 Flash with reasoning, labeled “150x.” Source: Artificial Analysis.
Google's Gemini 2.5 Flash costs 150 times more to run with reasoning enabled than Gemini 2.0 Flash, due to higher token use and pricing. | Image: Artificial Analysis
Short

SoundCloud changed its terms of use in February 2024 to allow uploaded music to be used for AI training. AI copyright activist Ed Newton-Rex spotted the change and said that users were not informed. In a statement, SoundCloud said it does not train AI models with artist content, does not build its own AI tools, and blocks third-party scraping. It said AI is only used internally for things like recommendations, fraud detection, and content sorting. Artists keep control of their work, and all AI use follows existing license deals. However, the statement does not clearly rule out general AI training.

Short

Google is now using AI models to protect Chrome users from online scams. On desktop, the company has rolled out its local Gemini Nano language model to quickly spot fraudulent websites, including ones that have never been seen before. On Android, Chrome will now warn users about suspicious notifications sent by websites. Google says these changes are part of a broader effort to improve security, which also includes the "Enhanced Protection" feature in Safe Browsing. The company reports that AI-powered systems in Google Search block hundreds of millions of scam results every day, cutting the number of fake airline support pages by more than 80 percent.

Short

Google has introduced "implicit caching" in Gemini 2.5, aiming to cut developer costs by as much as 75 percent. The new feature automatically detects and stores recurring content, so repeated prompts are only processed once. According to Google, this can lead to significant savings compared to the old explicit caching method, where users had to set up their own cache. To maximize the benefits, Google recommends putting the stable part of a prompt, such as system instructions, at the start, and adding user-specific input, such as questions, afterwards. Implicit caching kicks in for Gemini 2.5 Flash at a minimum of 1,024 tokens, and for Gemini 2.5 Pro at a minimum of 2,048 tokens. More details and best practices are available in the Gemini API documentation.
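
The recommended prompt layout can be sketched in a few lines of plain Python. This is an illustration only and makes no API calls: the helper names are hypothetical, the token count is assumed to be supplied by the caller (for example from the API's own token counter), and the thresholds are simply the figures quoted above.

```python
# Minimum request sizes for implicit caching, as quoted in the article.
MIN_TOKENS = {"flash": 1024, "pro": 2048}


def build_prompt(system_instructions: str, user_input: str) -> str:
    """Put the stable part first and the user-specific part last."""
    return f"{system_instructions}\n\n{user_input}"


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading characters two prompts have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def meets_threshold(token_count: int, tier: str) -> bool:
    """True if a request is large enough for implicit caching to apply."""
    return token_count >= MIN_TOKENS[tier]


SYSTEM = "You are a support assistant. Answer briefly and cite the manual."
first = build_prompt(SYSTEM, "How do I reset my password?")
second = build_prompt(SYSTEM, "Where can I download my invoice?")

# Both prompts start with the same stable block, so it can be reused.
print(shared_prefix_len(first, second) >= len(SYSTEM))
print(meets_threshold(1500, "flash"), meets_threshold(1500, "pro"))
```

If the user question came first instead, the two prompts would diverge at the first character and there would be no common prefix to reuse.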
