Google DeepMind has introduced a new language model called VaultGemma, designed with a focus on privacy. It is the largest open model to date trained from scratch with differential privacy, containing 1 billion parameters.
Large language models normally risk memorizing parts of their training data, including sensitive information such as names, addresses, or entire documents. Differential privacy counters this by adding carefully calibrated random noise during training, which mathematically limits how much any single training example can influence the model, so its outputs cannot be traced back to specific records. In theory, even if VaultGemma were trained on confidential documents, those documents could not be reconstructed from the model later.
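The standard way to achieve this in practice is differentially private stochastic gradient descent (DP-SGD): each example's gradient is clipped to a fixed norm and Gaussian noise is added before the model is updated. The sketch below illustrates that core step; the function name, hyperparameters, and toy data are illustrative assumptions, not VaultGemma's actual training setup.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                noise_multiplier=1.0, lr=0.1):
    """One illustrative DP-SGD update: clip per-example gradients, add noise."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Clipping bounds how much any single example can move the model.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    grad_sum = np.sum(clipped, axis=0)
    # Gaussian noise calibrated to the clipping bound masks each example's
    # contribution; this is what yields the differential-privacy guarantee.
    noise = np.random.normal(0.0, noise_multiplier * clip_norm,
                             size=grad_sum.shape)
    noisy_mean = (grad_sum + noise) / len(per_example_grads)
    return params - lr * noisy_mean

# Toy usage: three per-example gradients for a two-parameter model.
params = np.zeros(2)
grads = [np.array([0.5, -0.2]), np.array([3.0, 1.0]), np.array([-0.1, 0.4])]
params = dp_sgd_step(params, grads)
print(params)
```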
According to Google, early tests show no detectable reproduction of training data. The tradeoff is performance: VaultGemma's capabilities are roughly comparable to those of non-private LLMs released about five years ago.
The model weights are openly available on Hugging Face and Kaggle.