Date: 2025-01-15
AI research
Jonathan Kemper

MiniMax introduces AI models with record context length for agents with 'long term memory'

Jonathan works as a freelance tech journalist for THE DECODER, focusing on AI tools and how GenAI can be used in everyday work.
MiniMax, a Chinese AI startup, has released its MiniMax-01 family of open-source models. The company says its MiniMax-Text-01 can handle contexts up to 4 million tokens - double the capacity of its closest competitor.

The new lineup includes two models: MiniMax-Text-01 for text processing and MiniMax-VL-01 for handling both text and visual data. This expanded context window could give AI agents a form of "long-term memory," allowing them to collect, connect, and store information from multiple sources for later use.

"Lightning Attention" increases efficiency

To process such lengthy contexts efficiently, MiniMax uses a hybrid architecture. It combines layers based on the "Lightning Attention" mechanism (introduced in 2023 and updated in 2024) with traditional Transformer attention blocks in a 7:1 ratio. The team says this setup significantly reduces the compute needed for long inputs while retaining the benefits of the Transformer architecture.
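As a rough illustration of the 7:1 ratio, the sketch below lays out a layer stack and compares approximate per-head costs. It assumes that the traditional block follows each group of seven Lightning Attention layers, and that Lightning Attention behaves like linear attention, whose cost grows linearly with sequence length. The layer count, sequence length, head dimension, and function names are hypothetical and not taken from MiniMax's code.

```python
# Illustrative sketch only: depth, sizes, and layer placement are assumptions.

def hybrid_layer_schedule(num_layers: int, lightning_per_softmax: int = 7) -> list[str]:
    """Return layer types with one softmax layer after every N lightning layers."""
    period = lightning_per_softmax + 1
    return [
        "softmax" if (i + 1) % period == 0 else "lightning"
        for i in range(num_layers)
    ]


def attention_cost(seq_len: int, head_dim: int, kind: str) -> int:
    """Rough multiply-add count for one attention head, ignoring constants."""
    if kind == "softmax":
        # Every token attends to every other token: quadratic in sequence length.
        return seq_len * seq_len * head_dim
    # Linear attention keeps a (head_dim x head_dim) running state: linear in length.
    return seq_len * head_dim * head_dim


schedule = hybrid_layer_schedule(num_layers=16)
print(schedule.count("lightning"), schedule.count("softmax"))  # 14 2

n, d = 1_000_000, 128  # hypothetical sequence length and head dimension
ratio = attention_cost(n, d, "softmax") // attention_cost(n, d, "lightning")
print(f"softmax / lightning cost ratio at {n} tokens: about {ratio}x")
```

Under these assumptions the cost gap widens in proportion to the sequence length, which is why replacing most softmax layers matters most for very long inputs.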

The model also uses a "Mixture of Experts" (MoE) structure - essentially a set of specialized sub-networks optimized for different tasks. The system picks and combines the most suitable experts based on what it's working with. MiniMax-Text-01 has 32 of these experts and about 456 billion parameters in total, of which 45.9 billion are activated for each token.
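The "pick and combine" step can be sketched as a top-k router. This is a minimal toy example, not MiniMax's implementation: the choice of k = 2, the random gate scores standing in for a learned router, and the scalar "experts" are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k=2):
    """Pick the k highest-scoring experts and renormalise their weights to sum to 1."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_scores, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))


# 32 toy "experts": expert i simply scales its input by (i + 1).
experts = [lambda x, s=i + 1: s * x for i in range(32)]

random.seed(0)
gate_scores = [random.gauss(0.0, 1.0) for _ in experts]  # stand-in for a learned router
chosen = route_top_k(gate_scores)
print(chosen)                                   # two (expert_index, weight) pairs
print(moe_forward(1.0, experts, gate_scores))   # blended output of those two experts
```

Only the selected experts are evaluated for a given input, so the compute per token depends on k rather than on the total number of experts.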

A needle found in the 4 million token haystack

MiniMax has released benchmark results showing that its model performs similarly to top commercial options like GPT-4 and Claude 3.5 Sonnet in standard evaluations.

Bar chart: performance comparison of 7 AI language models across 7 benchmark tests, with accuracy in percent shown on the y-axis.
Seven leading language models perform differently across various benchmark tests. MiniMax-Text-01 consistently achieves top results, including on MMLU (88.5%). | Image: MiniMax

The company says MiniMax-Text-01 particularly shines with long contexts - claiming 100% accuracy in the "Needle-In-A-Haystack" test with 4 million tokens.

However, it's worth noting that Google's year-old Gemini 1.5 Pro, with its 2-million-token window, achieved the same perfect score. Researchers have found this benchmark isn't particularly meaningful, and studies suggest that extremely large context windows might not offer real advantages over smaller ones when used with RAG systems.

Heatmap: uniformly green visualization of a retrieval test with 4M data points, showing a 100% success rate across all test levels.
The Needle-In-A-Haystack retrieval test with 4 million tokens shows a perfect retrieval rate at every tested level. | Image: MiniMax
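For readers unfamiliar with the Needle-In-A-Haystack setup, here is a minimal sketch of how such a test can be structured: a short fact is hidden at varying depths in filler text of varying length, and the model is asked to retrieve it. Everything here is hypothetical, including the needle, the filler, and the grid. `toy_model` is a plain pattern search standing in for a real language model call, and the haystacks are far smaller than 4 million tokens.

```python
import re

NEEDLE = "The secret code is 4711."      # hypothetical fact to hide
QUESTION = "What is the secret code?"
EXPECTED = "4711"


def build_haystack(num_words, depth, needle=NEEDLE):
    """Place `needle` at a relative depth (0.0 = start, 1.0 = end) inside filler text."""
    filler = ["lorem"] * num_words
    position = int(depth * num_words)
    return " ".join(filler[:position] + [needle] + filler[position:])


def toy_model(context, question):
    """Stand-in for a real LLM call: a simple regex lookup over the context."""
    match = re.search(r"secret code is (\d+)", context)
    return match.group(1) if match else ""


def run_grid(model, lengths, depths):
    """Return {(length, depth): True/False} for every cell of the heatmap grid."""
    results = {}
    for n in lengths:
        for depth in depths:
            answer = model(build_haystack(n, depth), QUESTION)
            results[(n, depth)] = EXPECTED in answer
    return results


results = run_grid(toy_model, lengths=[1_000, 10_000], depths=[0.0, 0.5, 1.0])
accuracy = sum(results.values()) / len(results)
print(f"accuracy: {accuracy:.0%}")  # accuracy: 100%
```

That a simple pattern search scores 100% in this toy version hints at why a perfect result on this kind of test says little about deeper long-context reasoning.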

Models available as open source

Anyone can download the MiniMax-01 models from GitHub and Hugging Face. Users can test them through MiniMax's Hailuo AI chatbot or integrate them via a relatively affordable API.

The company, backed by Alibaba and founded in late 2021, previously made headlines with its Video-01 generator last fall. While MiniMax sees DeepSeek (which recently released its own open-source language model) as a competitor, both companies' models will likely face restrictions from Chinese government censorship.


