AI in practice

Mini-LLM Zephyr-7B keeps pace with 70-billion-parameter models

Matthias Bastian
Illustration of Zephyrus, the Greek god of the west wind, as a glitching digital figure formed from pixels, binary code, and ancient Greek patterns.

DALL-E 3 prompted by THE DECODER

Hugging Face has developed Zephyr-7B, a highly optimized small language model based on Mistral 7B, the open-source model from European start-up Mistral AI.

The model was first refined with distilled supervised fine-tuning (dSFT), which uses the output of a larger "teacher" model to train a smaller "student" model. It was then aligned with distilled direct preference optimization (dDPO), which uses AI feedback from a set of teacher models as preference data instead of human annotations, significantly reducing the training time and resources required.

In benchmarks, Zephyr-7B comes out just ahead of Mistral 7B and even approaches Llama 2 with 70 billion parameters. You can test the model in a chat demo.
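The dDPO stage is essentially the direct preference optimization objective applied to AI-ranked response pairs rather than human-labeled ones. As a minimal sketch of that idea, the PyTorch snippet below implements the standard DPO loss; the function name, the beta value, and the random example inputs are illustrative assumptions, not Hugging Face's actual training code.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Direct preference optimization loss for a batch of preference pairs.

    Each tensor holds per-sequence log-probabilities (summed token
    log-probs) of the preferred ("chosen") and dispreferred ("rejected")
    responses, under the policy being trained and under a frozen
    reference model (in the Zephyr recipe, the dSFT checkpoint).
    """
    # Implicit rewards: how far the policy has moved away from the
    # reference model on each response, scaled by beta.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Widen the margin between preferred and rejected responses;
    # -logsigmoid penalizes pairs where the rejected response wins.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with random log-probabilities for a batch of 4 pairs.
batch = torch.randn(4), torch.randn(4), torch.randn(4), torch.randn(4)
print(dpo_loss(*batch))
```

Because the "chosen" and "rejected" labels come from AI feedback rather than human raters, the preference data is cheap to produce, which is where most of the savings in training time and resources come from.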
