French AI company Mistral AI has released Large 2, a new language model that aims to deliver similar performance to Meta's just-released Llama 3.1 while being more efficient.
Mistral Large 2, the second generation of Mistral AI's flagship model, boasts significant improvements in code generation, mathematics, logic, multi-language support, and function calling over its predecessor.
With a 128,000-token context window, Large 2 supports dozens of languages, including French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean. It also handles over 80 programming languages such as Python, Java, C, C++, JavaScript, and Bash.
Mistral says Large 2 sets new standards in "performance/cost of serving ratio." In the widely cited Massive Multitask Language Understanding (MMLU) benchmark, the pre-trained version achieves 84.0% accuracy, setting a record "on the performance/cost Pareto front of open models".
For coding tasks, Large 2 significantly outperforms its predecessor and rivals leading models such as GPT-4o, Claude 3.5 Sonnet, and Llama 3.1 405B.
Notably, it achieves this with less than a third of the parameters of Llama 3.1 405B (123 billion vs. 405 billion).
One focus of development has been to improve reasoning and minimize the model's tendency to "hallucinate" plausible-sounding but factually incorrect or irrelevant information. Large 2 has been optimized to be more cautious and critical in its responses, admitting when it cannot find a solution or lacks sufficient information to provide a confident answer.
This emphasis on accuracy is reflected in its improved performance on math tasks, the company says, though it doesn't set records in this area.
The model also features improved function calling and information retrieval capabilities, as it has been trained to reliably execute both parallel and sequential function calls. This should allow Large 2 to serve as the foundation for complex business applications, Mistral says.
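For illustration, here is a minimal sketch of how a function call against Mistral's chat completions API might look, assuming the endpoint accepts an OpenAI-style "tools" array and that "mistral-large-2407" is the API name for Large 2; the get_order_status function is a hypothetical business tool, not part of Mistral's API.

```python
import os
import requests

# Minimal sketch: assumes Mistral's chat completions endpoint accepts an
# OpenAI-style "tools" array; get_order_status is a hypothetical function.
API_URL = "https://api.mistral.ai/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}",
    "Content-Type": "application/json",
}

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_order_status",  # hypothetical business function
            "description": "Look up the shipping status of a customer order.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string", "description": "Internal order ID"},
                },
                "required": ["order_id"],
            },
        },
    }
]

payload = {
    "model": "mistral-large-2407",  # assumed API name for Mistral Large 2
    "messages": [{"role": "user", "content": "Where is order A-1042?"}],
    "tools": tools,
    "tool_choice": "auto",
}

response = requests.post(API_URL, headers=headers, json=payload, timeout=60)
response.raise_for_status()
message = response.json()["choices"][0]["message"]

# If the model decided to call the tool, the arguments arrive as JSON text.
for call in message.get("tool_calls", []):
    print(call["function"]["name"], call["function"]["arguments"])
```

In a parallel call, the model can return several entries in tool_calls at once; in a sequential flow, the application executes each tool, appends the result as a tool message, and asks the model to continue.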
Mistral Large 2 is now available through the Mistral platform, Azure AI Studio, Amazon Bedrock, IBM watsonx.ai, and Google Vertex AI.
The weights for the instruction-tuned model are available for download (around 228 GB) and hosted on Hugging Face under the Mistral Research License, which permits use and modification for research and non-commercial purposes.
Commercial use with self-hosted deployments requires a Mistral Commercial License. With 123 billion parameters, the model is sized for inference on a single node while still handling long-context applications.
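For readers who want to try the open weights locally, a minimal sketch using Hugging Face transformers might look like the following; the repository id mistralai/Mistral-Large-Instruct-2407 and the available hardware are assumptions, and in bf16 the 123 billion parameters alone need roughly 250 GB of memory, so multi-GPU sharding or quantization is required in practice.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch only: assumes the instruct weights live at this Hugging Face repo
# and that a single node has enough GPU memory to shard a 123B model.
model_id = "mistralai/Mistral-Large-Instruct-2407"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~250 GB of weights in bf16
    device_map="auto",           # shards layers across all visible GPUs
)

messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```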
Mistral Large 2 heats up LLM competition as market turns into a red ocean
The rapid release of Mistral Large 2, just one day after Llama 3.1 405B, indicates the LLM market is becoming increasingly competitive. More models are vying for the same customers and applications with similar performance. For months, the focus has been on efficiency and pricing, with costs dropping sharply while development expenses remain high.
The key question is whether a model provider can break out of this intense competition by significantly upgrading reasoning capabilities, expanding existing business areas, and opening new ones. If not, the young market may soon face a serious test in living up to the high valuations set by investors.