Google DeepMind has unveiled Gemma 3, a new generation of open AI models designed to deliver high performance with a relatively small footprint, making them suitable for running on a single GPU or TPU.

The Gemma 3 family includes four models ranging from 1 to 27 billion parameters. Despite their compact size, these models outperform much larger LLMs such as Llama-405B and DeepSeek-V3 in initial tests, according to Google DeepMind.

The models offer pretrained support for more than 140 languages, with over 35 usable out of the box. They process text, images (except for the 1B version), and short videos using a 128,000-token context window. Google says their function calling and structured output capabilities make them well-suited for agentic tasks.
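
To illustrate what prompt-based function calling can look like in practice, here is a minimal sketch (not from Google's documentation): an available tool is described in the prompt, and the model is expected to answer with a JSON call that the application parses. The tool name, schema, and model reply shown here are hypothetical.

```python
import json

# Hypothetical tool definition the model is told about in the prompt.
tools = [{
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "parameters": {"city": "string"},
}]

prompt = (
    "You can call these tools by replying only with JSON of the form "
    '{"tool": <name>, "arguments": {...}}.\n'
    "Tools: " + json.dumps(tools) + "\n\n"
    "User: What's the weather in Zurich right now?"
)

# A reply the model might produce; the application parses it and runs the tool.
reply = '{"tool": "get_weather", "arguments": {"city": "Zurich"}}'
call = json.loads(reply)
print(call["tool"], call["arguments"])
```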

Google's Gemma 3 family includes four AI models ranging from 1 billion to 27 billion parameters, each optimized for different use cases. | Image: Google

All models underwent distillation training followed by specialized post-training using various reinforcement learning approaches. These techniques specifically target improvements in mathematics, chat functionality, instruction following, and multilingual communication.
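
As a rough illustration of the distillation idea, and not Google's actual training recipe, the following sketch shows a standard knowledge-distillation loss in which a student model is trained to match a teacher's softened output distribution; the temperature and tensor shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both distributions, then measure how far the student is from the teacher.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # batchmean KL divergence, scaled by T^2 as is conventional for distillation.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature**2

# Example with random logits over a toy vocabulary of 8 tokens.
student = torch.randn(4, 8)
teacher = torch.randn(4, 8)
print(distillation_loss(student, teacher).item())
```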

Making LLMs more efficient

For the first time, Google is officially offering quantized versions that reduce memory and computing requirements while maintaining accuracy. The company also says Gemma 3 reproduces less of its training text verbatim than previous versions and avoids reproducing personal data.
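
As a hedged sketch of what running a quantized model can look like, the snippet below loads a Gemma checkpoint in 4-bit precision via the generic bitsandbytes path in Hugging Face transformers; this is not necessarily how Google's official quantized releases are packaged, and the model ID is an assumption.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "google/gemma-3-1b-it"  # assumed Hugging Face model ID

# Generic on-the-fly 4-bit quantization to cut memory use.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=quant_config, device_map="auto"
)
```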

Human evaluators in the chatbot arena gave Gemma 3-27B-IT an Elo score of 1338, ranking it among the top 10 AI models. The smaller 4B model performs surprisingly well, matching the capabilities of the larger Gemma 2-27B-IT. The 27B version shows similar performance to Gemini 1.5-Pro across many benchmark tests.

Human evaluators ranked AI models in the Chatbot Arena, with Grok-3 and GPT-4.5 achieving the highest Elo scores. Gemma-3-27B-IT placed in the top 10 as of March 8, 2025. | Image: via Google

Alongside the multimodal Gemma models, Google introduced ShieldGemma 2, a specialized 4-billion-parameter safety checker designed to identify dangerous content, explicit material, and depictions of violence in images.

Example of Gemma's vision capability: when AI helps analyze the bill, the Zürcher Geschnetzeltes no longer sits so heavily on the stomach. | Image: via Google

Gemma 3 models are available through Hugging Face, Kaggle, and Google AI Studio. They support common frameworks including PyTorch, JAX, and Keras. Academics can access $10,000 in cloud credits through the Gemma 3 Academic Program. The models run on NVIDIA GPUs, Google Cloud TPUs, and AMD GPUs, with Gemma.cpp available for CPU use.
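
For readers who want to try the models, a minimal sketch using Hugging Face transformers might look like the following; the model ID and required library version are assumptions, so check the model card for the exact setup.

```python
# A minimal sketch, assuming the text-only 1B instruction-tuned checkpoint is
# published under this Hugging Face ID and that a recent transformers release
# supports the Gemma 3 architecture.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-3-1b-it"  # assumed model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a chat prompt and generate a short completion.
messages = [{"role": "user", "content": "Summarize Gemma 3 in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```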

Summary
  • Google DeepMind has released a new series of AI models called Gemma 3, available in four sizes ranging from 1 billion to 27 billion parameters, which achieve strong performance despite being much smaller than competing models.
  • Gemma 3 models offer multilingual support for over 140 languages and can understand and process text, images, and short video clips, as well as handle function calling.
  • The largest model in the series, Gemma 3-27B-IT, ranks among the top ten AI models based on human ratings and delivers results comparable to the larger Gemini 1.5-Pro model across many benchmarks.
Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.