AI in practice

Mini-LLM Zephyr-7B keeps pace with 70-billion-parameter models

Matthias Bastian
Illustration of Zephyrus, the Greek god of the west wind, as a glitching digital figure formed from pixels, binary code, and ancient Greek patterns.

DALL-E 3 prompted by THE DECODER

Hugging Face has developed Zephyr-7B, a highly optimized small language model based on Mistral 7B, the open-source model from European start-up Mistral AI.

The model was first refined with distilled supervised fine-tuning (dSFT), which uses the output of a larger "teacher" model to train a smaller "student" model. It was then aligned with distilled direct preference optimization (dDPO), which uses AI feedback from a set of teacher models as preference data instead of human annotations, significantly reducing the training time and resources required.

In benchmarks, Zephyr-7B comes out just ahead of Mistral 7B and even approaches Llama 2 with 70 billion parameters. You can test the model in a chat demo.
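The dDPO stage is essentially the direct preference optimization objective applied to AI-ranked response pairs rather than human-labeled ones. As a minimal sketch of that idea, the PyTorch snippet below implements the standard DPO loss; the function name, the beta value, and the random example inputs are illustrative assumptions, not Hugging Face's actual training code.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Direct preference optimization loss for a batch of preference pairs.

    Each tensor holds per-sequence log-probabilities (summed token
    log-probs) of the preferred ("chosen") and dispreferred ("rejected")
    responses, under the policy being trained and under a frozen
    reference model (in the Zephyr recipe, the dSFT checkpoint).
    """
    # Implicit rewards: how far the policy has moved away from the
    # reference model on each response, scaled by beta.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Widen the margin between preferred and rejected responses;
    # -logsigmoid penalizes pairs where the rejected response wins.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with random log-probabilities for a batch of 4 pairs.
batch = torch.randn(4), torch.randn(4), torch.randn(4), torch.randn(4)
print(dpo_loss(*batch))
```

Because the "chosen" and "rejected" labels come from AI feedback rather than human raters, the preference data is cheap to produce, which is where most of the savings in training time and resources come from.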
