Google has introduced "implicit caching" for Gemini 2.5, which it says can cut developer costs by as much as 75 percent on repeated content. The feature automatically detects recurring content at the start of requests and reuses it, so the repeated part of a prompt does not have to be processed in full each time. Unlike the earlier explicit caching method, where developers had to set up and manage their own cache, the savings are applied automatically. To maximize the benefit, Google recommends putting the stable part of a prompt, such as system instructions, at the start and adding user-specific input, such as questions, afterwards. Implicit caching kicks in at 1,024 tokens for Gemini 2.5 Flash and at 2,048 tokens for Gemini 2.5 Pro. More details and best practices are available in the Gemini API documentation.
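The prompt-ordering advice can be illustrated with a short Python sketch. It makes no API calls. The helper names (`build_prompt`, `estimate_tokens`, `may_be_cache_eligible`, `shared_prefix_len`) are hypothetical, and the four-characters-per-token estimate is a rough assumption rather than Gemini's actual tokenizer. Only the two thresholds come from the article.

```python
# Minimum prompt sizes for implicit caching, as reported in the article.
MIN_CACHE_TOKENS = {"flash": 1024, "pro": 2048}


def build_prompt(stable_prefix: str, user_input: str) -> str:
    """Place the unchanging part first so repeated requests share a prefix."""
    return f"{stable_prefix}\n\n{user_input}"


def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return len(text) // 4


def may_be_cache_eligible(prompt: str, model_family: str) -> bool:
    """Check the estimated prompt size against the reported threshold."""
    return estimate_tokens(prompt) >= MIN_CACHE_TOKENS[model_family]


def shared_prefix_len(a: str, b: str) -> int:
    """Count how many leading characters two prompts have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Hypothetical system instructions, repeated only to reach a realistic size.
system_instructions = "You are a support assistant. " * 200

first = build_prompt(system_instructions, "How do I reset my password?")
second = build_prompt(system_instructions, "Where can I find my invoices?")

# Both requests start with the same long block; only the tail differs.
print(shared_prefix_len(first, second) >= len(system_instructions))
print(may_be_cache_eligible(first, "flash"))
```

Putting the question first would reverse the picture: the two prompts would diverge within the first few characters, leaving little common prefix for a cache to reuse.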
