Google has introduced "implicit caching" for Gemini 2.5, a feature it says can cut developer costs by as much as 75 percent. Implicit caching automatically detects recurring content at the start of requests and reuses it, so a prefix shared by several prompts does not have to be processed in full each time. According to Google, this can yield significant savings without the setup work of the older explicit caching method, where developers had to create their own cache. To maximize the benefit, Google recommends putting the stable part of a prompt, such as system instructions, at the start and adding user-specific input, such as questions, afterwards. Implicit caching applies to Gemini 2.5 Flash from 1,024 tokens and to Gemini 2.5 Pro from 2,048 tokens. More details and best practices are available in the Gemini API documentation.
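The prompt-ordering advice above can be illustrated with a minimal Python sketch. It makes no API calls. The helper names (`build_prompt`, `shared_prefix_len`, `estimate_tokens`, `may_hit_cache`) are hypothetical, the four-characters-per-token estimate is a rough assumption rather than Gemini's tokenizer, and comparing the shared prefix against the article's token minimums is a conservative simplification, not Google's documented matching logic.

```python
# Minimum sizes for implicit caching as reported in the article.
MIN_CACHE_TOKENS = {"flash": 1024, "pro": 2048}


def build_prompt(system_instructions: str, user_input: str) -> str:
    """Place the stable part first so repeated requests share a common prefix."""
    return f"{system_instructions}\n\n{user_input}"


def shared_prefix_len(a: str, b: str) -> int:
    """Count the leading characters that two prompts have in common."""
    n = 0
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            break
        n += 1
    return n


def estimate_tokens(text: str) -> int:
    """Rough guess of about 4 characters per token (assumption, not a real tokenizer)."""
    return len(text) // 4


def may_hit_cache(prompt_a: str, prompt_b: str, model_family: str) -> bool:
    """Heuristic: is the shared prefix at least as long as the reported minimum?"""
    prefix = prompt_a[:shared_prefix_len(prompt_a, prompt_b)]
    return estimate_tokens(prefix) >= MIN_CACHE_TOKENS[model_family]


if __name__ == "__main__":
    system = "You are a support assistant. " * 200  # long, stable instructions
    q1 = "How do I reset my password?"
    q2 = "What does the Pro plan cost?"

    # Stable part first: the two prompts share a long prefix.
    print(may_hit_cache(build_prompt(system, q1), build_prompt(system, q2), "flash"))

    # User input first: the prompts diverge at the first character.
    print(shared_prefix_len(q1 + "\n\n" + system, q2 + "\n\n" + system))
```

With the stable instructions first, the two requests differ only in their final characters. With the question first, they share no prefix at all, which is why the ordering matters for a prefix-based cache.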