Google cuts Gemini 1.5 Flash prices and adds new PDF features

Aug 9, 2024


Key Points

The price war for AI models continues despite high development and operating costs: Google is cutting the price of its fast Gemini 1.5 Flash AI model by up to 78%.
According to Google, Gemini 1.5 Flash is particularly popular for applications that require high speed and low latency. The Gemini API and AI Studio now support PDF understanding through text and images using native multimodal capabilities.
In addition, Google has expanded the language understanding capabilities of Gemini 1.5 Pro and Flash to more than 100 languages and opened access to fine-tuning Gemini 1.5 Flash to all developers.

The AI model price war continues as Google cuts prices for its fast Gemini 1.5 Flash by up to 78 percent.

Google announced that input token costs will drop 78 percent to $0.075 per million tokens. Output token costs will drop 71 percent to $0.30 per million tokens for prompts under 128,000 tokens. Similar price reductions apply to longer prompts and caching.
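To make the numbers concrete, here is a minimal Python sketch of the arithmetic. The two prices and the 128,000-token threshold come from the announcement as reported above; the function names are hypothetical, and the sketch assumes that "prompt" refers to the input token count. The "implied previous price" is only a back-calculation from the rounded percentages, not a figure quoted by Google.

```python
# Prices reported in the article, in USD per million tokens,
# valid for prompts under 128,000 tokens.
INPUT_PRICE_PER_M = 0.075
OUTPUT_PRICE_PER_M = 0.30
PROMPT_LIMIT = 128_000  # assumption: the limit applies to input tokens


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request at the reduced prices."""
    if input_tokens >= PROMPT_LIMIT:
        # Longer prompts are billed at a different rate not given in the article.
        raise ValueError("prompt is not under 128,000 tokens")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def implied_previous_price(new_price: float, cut_percent: float) -> float:
    """Back out the earlier price from the new price and the stated cut."""
    return new_price / (1 - cut_percent / 100)


if __name__ == "__main__":
    # A 100,000-token prompt with a 2,000-token answer.
    print(f"request cost: ${request_cost(100_000, 2_000):.4f}")
    print(f"implied old input price: ${implied_previous_price(0.075, 78):.3f} per million")
    print(f"implied old output price: ${implied_previous_price(0.30, 71):.3f} per million")
```

At these rates, a 100,000-token prompt with a 2,000-token response costs less than one cent, which illustrates why the model is pitched at high-volume tasks such as summarization and categorization.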

According to Google, Gemini 1.5 Flash is the most popular model for use cases that require high speed and low latency, such as summarization, categorization, and multimodal understanding.

The Gemini API and AI Studio now support better PDF understanding based on text and image analysis. For PDFs that contain graphics, images, or other visual content, the model uses native multimodal processing capabilities.

Expanded language and fine-tuning support

Google has also expanded language support for Gemini 1.5 Pro and Flash models to over 100 languages. This allows developers around the world to work with the models in their preferred language. It should also stop the models from blocking responses because a prompt was written in an unsupported language.

In addition, Google is expanding access to Gemini 1.5 Flash fine-tuning. It's now available to all developers through the Gemini API and Google AI Studio. Fine-tuning allows developers to customize base models and improve performance for specific tasks by providing additional data. A fine-tuned model needs less context in each prompt, which lowers latency and cost, and it can also be more accurate.

Google's announcement follows OpenAI's recent price reductions of up to 50 percent for GPT-4o API access. It seems that despite the high cost of developing and running AI models, providers are already engaged in a fierce price war.

Source: Google