Maximilian Schreiner

MiniMax-M1 comes close to Gemini 2.5 Pro efficiency when handling large context windows

Max is the managing editor of THE DECODER, bringing his background in philosophy to explore questions of consciousness and whether machines truly think or just pretend to.
The Chinese AI startup MiniMax has released MiniMax-M1, a new open-source language model designed to outperform DeepSeek's R1.

MiniMax-M1 is a reasoning-focused model with a massive context window of up to one million tokens and a "thinking" budget of up to 80,000 tokens. The model uses an especially efficient reinforcement learning approach, making it much leaner than other open-source options.

It's available for free under the Apache-2.0 license. In benchmark tests, MiniMax-M1 outperforms other open models like DeepSeek-R1-0528 and Qwen3-235B-A22B in several categories. On the OpenAI MRCR test, which measures complex, multi-step reasoning across long texts, M1's performance comes close to the leading closed model, Gemini 2.5 Pro.

While proprietary models like OpenAI o3 and Gemini 2.5 Pro still hold an edge in some areas, MiniMax-M1 has narrowed the gap significantly. The model is available in two versions on Hugging Face.

China's blooming AI startup scene

MiniMax, a Shanghai-based AI startup, has quickly become a major player in China's growing AI scene. Founded at the end of 2021 and backed by investors like Alibaba, the company focuses on developing advanced language and multimodal models.

Earlier this year, MiniMax released several open-source language models, including MiniMax-Text-01, which can handle up to four million tokens of context, double the capacity of the leading models at the time. While a larger context window is impressive, researchers caution that more tokens don't always mean more accurate responses.

MiniMax is also developing multimodal AI systems, including MiniMax-VL-01, which can process both text and images. In September 2024, the company launched abab-video-1 ("Video-01"), a text-to-video model that creates short HD videos with virtual camera movement.

