
French AI startup Mistral AI has unveiled Mixtral 8x22B, a new open-source language model that the company claims offers the best performance and efficiency of any open-source model to date.

The model is a sparse mixture-of-experts (SMoE) model that actively uses only 39 billion of its 141 billion parameters. As a result, the development team claims it offers an exceptionally good price/performance ratio for its size. Its predecessor, Mixtral 8x7B, has been well received by the open-source community.
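The 141-versus-39-billion gap comes from the mixture-of-experts design: each layer holds several expert feed-forward blocks, and a router sends every token to only a small subset of them, so most of the weights sit idle for any given token. The snippet below is a purely illustrative PyTorch sketch of top-2 routing over eight experts (the setup Mistral has described for the Mixtral family); the dimensions are arbitrary and this is not Mixtral's actual implementation.

```python
# Illustrative top-2 sparse mixture-of-experts routing (not Mistral's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, dim=512, num_experts=8, top_k=2):
        super().__init__()
        # Each expert is an independent feed-forward block.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))
             for _ in range(num_experts)]
        )
        self.gate = nn.Linear(dim, num_experts, bias=False)  # the router
        self.top_k = top_k

    def forward(self, x):                          # x: (tokens, dim)
        scores = self.gate(x)                      # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)       # mix only the selected experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e              # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = SparseMoELayer()
print(layer(torch.randn(16, 512)).shape)           # torch.Size([16, 512])
```

Because only two of the eight experts run per token, the compute and the "active" parameter count are a fraction of the total parameter count, which is the trade-off behind Mistral's price/performance claim.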

According to Mistral, Mixtral 8x22B's strengths include multilingualism, with support for English, French, Italian, German, and Spanish, as well as strong math and programming capabilities. It also offers native function calling for using external tools. At 64,000 tokens, the context window is smaller than that of current leading commercial models such as GPT-4 (128K) or Claude 3 (200K).
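As a rough illustration of what native function calling looks like in practice, the request below defines a hypothetical get_weather tool and asks the model a question it can only answer by calling it. The field names follow the OpenAI-style tools schema used by Mistral's chat API, and the model identifier is an assumption; check Mistral's API reference for the exact current names.

```python
# Sketch of a function-calling request against Mistral's chat completions API.
# Model name and schema details are assumptions; verify against the official docs.
import os
import requests

payload = {
    "model": "open-mixtral-8x22b",
    "messages": [{"role": "user", "content": "What's the weather in Paris right now?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool defined by the caller
            "description": "Return the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
}

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json=payload,
    timeout=60,
)
# When the model decides to use the tool, the assistant message carries a
# tool_calls entry with the function name and JSON arguments instead of text.
print(resp.json()["choices"][0]["message"])
```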

Open source without restrictions

The Mistral team is releasing Mixtral 8x22B under the Apache 2.0 license, one of the most permissive open-source licenses available, which allows unrestricted use of the model.


According to Mistral, the model's sparse use of active parameters makes it faster than traditional densely trained 70-billion-parameter models and more capable than other open-source models.

Image: Mistral AI

Compared to other open models, Mixtral 8x22B achieves the best results on popular comprehension, logic, and knowledge benchmarks such as MMLU, HellaSwag, WinoGrande, ARC Challenge, TriviaQA, and NaturalQS.

It also clearly outperforms LLaMA 2 70B in the supported languages (French, German, Spanish, and Italian) on the HellaSwag, ARC Challenge, and MMLU benchmarks.

Image: Mistral AI

The new model can now be tested on Mistral's "la Plateforme". The open-source version is available on Hugging Face and, according to Mistral, is a good starting point for fine-tuning applications. The model requires 258 gigabytes of VRAM.
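For anyone who wants to try the open weights locally, a minimal sketch with the Hugging Face transformers library looks like the following; the repository id mistralai/Mixtral-8x22B-v0.1 is assumed, and the hardware needs to provide roughly the quoted 258 GB of GPU memory for half-precision inference.

```python
# Minimal sketch: loading the open Mixtral 8x22B weights with transformers.
# Repo id is an assumption; multi-GPU hardware with ~258 GB of VRAM is needed in bf16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x22B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the footprint near the quoted figure
    device_map="auto",           # shard the 141B parameters across available GPUs
)

inputs = tokenizer("Mixtral 8x22B is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Quantized variants or inference servers can shrink the memory footprint, while fine-tuning the full model needs even more headroom than plain inference.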

Summary
  • French startup Mistral AI has introduced Mixtral 8x22B, a new open-source language model. It actively uses only 39 billion of the 141 billion parameters and thus achieves a good cost-benefit ratio.
  • The strengths of the multilingual model include particularly strong math and programming capabilities, as well as native function calling. However, at 64,000 tokens, the context window is smaller than that of leading commercial models.
  • In common benchmarks for comprehension, logic, and knowledge, as well as in the supported foreign languages, Mixtral 8x22B achieves top scores compared to other open-source models. It is now available on Mistral's platform and as an open-source version on Hugging Face.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.