A new open-source language model has achieved performance comparable to leading commercial systems while maintaining complete transparency.
The Allen Institute for Artificial Intelligence (Ai2) announced that its OLMo 2 32B model outperforms both GPT-3.5-Turbo and GPT-4o mini while making its code, training data, and technical details publicly available.
The model stands out for its efficiency, consuming only a third of the computing resources needed by similar models like Qwen2.5-32B. This makes it particularly accessible for researchers and developers working with limited resources.
Building a transparent AI system
The development team used a three-phase training approach. The model first learned basic language patterns from 3.9 trillion tokens, then studied high-quality documents and academic content, and finally mastered instruction-following using the Tulu 3.1 framework, which combines supervised and reinforcement learning techniques.
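As a rough illustration of how those phases fit together, the recipe can be sketched as a simple outline. Note that the stage names and fields below merely restate the description above; this is not Ai2's actual configuration format.

```python
# Illustrative outline of OLMo 2 32B's three-stage training recipe.
# NOT Ai2's real config format; field names are invented for clarity.
training_recipe = [
    {
        "stage": "pretraining",
        "data": "web-scale text (~3.9 trillion tokens)",
        "objective": "next-token prediction to learn basic language patterns",
    },
    {
        "stage": "mid-training",
        "data": "high-quality documents and academic content",
        "objective": "continued next-token prediction on curated data",
    },
    {
        "stage": "post-training (Tulu 3.1)",
        "data": "instruction-following and preference data",
        "objective": "supervised fine-tuning plus reinforcement learning",
    },
]

for phase in training_recipe:
    print(f"{phase['stage']}: {phase['data']} -> {phase['objective']}")
```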
To manage the process, the team created OLMo-core, a new software platform that efficiently coordinates many machines while preserving training progress. The training itself ran on Augusta AI, a supercomputer network of 160 machines equipped with H100 GPUs, reaching processing speeds of over 1,800 tokens per second per GPU.
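Those throughput figures can be put in perspective with a back-of-the-envelope calculation. The sketch below assumes 8 H100s per machine, which the article does not state, so the totals are only indicative.

```python
# Back-of-the-envelope throughput estimate for the figures quoted above.
# ASSUMPTION: 8 H100 GPUs per machine (not stated in the article).
machines = 160
gpus_per_machine = 8            # assumed, not confirmed
tokens_per_sec_per_gpu = 1_800  # figure quoted above

total_gpus = machines * gpus_per_machine
cluster_tokens_per_sec = total_gpus * tokens_per_sec_per_gpu
pretraining_tokens = 3.9e12     # 3.9 trillion tokens from the first phase

seconds = pretraining_tokens / cluster_tokens_per_sec
print(f"{total_gpus} GPUs, ~{cluster_tokens_per_sec:,} tokens/s cluster-wide")
print(f"~{seconds / 86_400:.0f} days for the 3.9T-token pretraining phase")
```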

While many AI projects, such as Meta's Llama, claim open-source status, OLMo 2 meets all three essential criteria: public model code, weights, and training data. The team has released everything, including the Dolmino training dataset, enabling complete reproducibility and analysis.
"With just a bit more progress everyone can pretrain, midtrain, post-train, whatever they need to get a GPT 4 class model in their class. This is a major shift in how open-source AI can grow into real applications," says Nathan Lambert of Ai2.
This builds on their earlier work with Dolma in 2023, which helped establish a foundation for open-source AI training. The team has also uploaded various checkpoints, i.e., snapshots of the language model saved at different points during training. A paper released in December alongside the 7B and 13B versions of OLMo 2 provides more technical background.
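For readers who want to inspect those intermediate checkpoints, Hugging Face exposes different versions of a model as separate revisions of its repository. A minimal sketch, assuming the repository id allenai/OLMo-2-0325-32B and that each checkpoint is published as its own branch (the repo id and revision handling are assumptions, not verified here):

```python
# List available checkpoint revisions and load one of them.
# ASSUMPTION: repo id and branch-per-checkpoint layout are illustrative.
from huggingface_hub import list_repo_refs
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "allenai/OLMo-2-0325-32B"  # assumed repository id for the base model

# Print the branches of the repository, where intermediate checkpoints would live.
refs = list_repo_refs(repo_id)
for branch in refs.branches:
    print(branch.name)

# Load a specific training-time snapshot by passing its branch name as `revision`.
revision = "main"  # replace with one of the branch names printed above
tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)
```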
The gap between open and closed-source AI systems has narrowed to about 18 months, according to Lambert's analysis. While OLMo 2 32B matches Google's Gemma 3 27B in base-model performance, Gemma 3 shows stronger results after fine-tuning, suggesting room for improvement in open-source post-training methods.
The team plans to enhance the model's logical reasoning and expand its ability to handle longer texts. Users can test OLMo 2 32B through Ai2's Chatbot Playground.
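For local experimentation outside the Chatbot Playground, the instruction-tuned weights can also be loaded with the Transformers library. A minimal sketch, assuming the repository id allenai/OLMo-2-0325-32B-Instruct and that its tokenizer ships a chat template; a 32B-parameter model requires substantial GPU memory.

```python
# Minimal local inference sketch for the instruction-tuned OLMo 2 32B.
# ASSUMPTION: the repo id below is illustrative; check Ai2's Hugging Face page.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "allenai/OLMo-2-0325-32B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # halves memory use versus float32
    device_map="auto",           # requires `accelerate`; spreads layers across GPUs
)

messages = [{"role": "user", "content": "Summarize what makes OLMo 2 fully open."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```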
Ai2 also released the larger Tülu-3-405B model in January, which likewise surpasses GPT-3.5 and GPT-4o mini, but Lambert explains that it isn't fully open source since the lab wasn't involved in its pretraining.