FlexOlmo, developed by the Allen Institute for AI, shows that organizations can collaboratively train language models on local datasets without sharing sensitive data.

In regulated industries, organizations often hold data that would be valuable for training AI models but can't be shared outside their walls. FlexOlmo works around this with a mixture-of-experts setup in which each expert is trained independently on one owner's closed dataset. Instead of exchanging raw data, organizations train their own expert locally and share only the resulting model weights with the group.

The main issue with independently trained experts is coordination. FlexOlmo tackles this by using a frozen public model as a fixed reference. The public expert remains unchanged during training, while new experts are trained on local data. This way, all experts align with the same reference model and can be combined later without additional retraining.
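
To make the setup concrete, here is a minimal, hypothetical PyTorch sketch of the idea: a mixture-of-experts layer whose first expert is the frozen public model, plus a routing mask that lets owners switch experts off later. The class name FlexMoELayer, the simple linear router, and the dense expert combination are illustrative assumptions, not Ai2's implementation (the real model routes sparsely, activating only part of its parameters per token).

```python
import torch
import torch.nn as nn

class FlexMoELayer(nn.Module):
    """Illustrative sketch, not Ai2's code: expert 0 is a frozen public
    model; the remaining experts are each trained independently against
    that same frozen reference and merged afterwards."""

    def __init__(self, public_expert: nn.Module, local_experts: list, d_model: int):
        super().__init__()
        self.experts = nn.ModuleList([public_expert] + list(local_experts))
        # The public expert stays frozen: it is the shared anchor that keeps
        # independently trained experts compatible with one another.
        for p in self.experts[0].parameters():
            p.requires_grad = False
        # A simple learned router over all experts; FlexOlmo's actual routing
        # works differently, this just stands in for the idea.
        self.router = nn.Linear(d_model, len(self.experts))
        # Boolean mask so individual data sources can be opted out later.
        self.register_buffer("active", torch.ones(len(self.experts), dtype=torch.bool))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        logits = self.router(x)
        logits = logits.masked_fill(~self.active, float("-inf"))
        weights = torch.softmax(logits, dim=-1)                   # (b, s, E)
        outputs = torch.stack([e(x) for e in self.experts], -1)   # (b, s, d, E)
        return (outputs * weights.unsqueeze(-2)).sum(-1)          # (b, s, d)
```

Because only expert weights ever leave an organization, each entry in local_experts here could come from a different data owner.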

Flexibility for sensitive data

FlexOlmo is well-suited for cases where data access needs to be tightly controlled. Data sources can be activated or deactivated depending on the application. For example, toxic content might be included for research but excluded from general use.

The researchers demonstrated this by removing the news expert in a test run. As expected, performance on news-related tasks dropped, but results in other areas remained stable.

Bar chart: performance (%) on NewsG, MC9, Code, and Math2 for the full eight-expert model versus the same model without the news expert.
When the news expert is removed from FlexOlmo, performance on news tasks drops, but results in other areas stay nearly the same. | Image: Shi et al.

Even if licenses change or usage rights expire, data sources can be deactivated later without retraining the entire model. The final model has 37 billion parameters, with 20 billion active.
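
Under the same assumptions as the sketch above, deactivating a data source is just flipping its entry in the routing mask; nothing is retrained:

```python
import torch
import torch.nn as nn

# Hypothetical usage of the FlexMoELayer sketch above: one public expert plus
# three local experts (tiny linear layers stand in for transformer blocks).
d = 16
layer = FlexMoELayer(nn.Linear(d, d), [nn.Linear(d, d) for _ in range(3)], d_model=d)

x = torch.randn(2, 8, d)
out_full = layer(x)      # all experts contribute

layer.active[1] = False  # e.g. the news expert's license expired
out_reduced = layer(x)   # routing ignores expert 1; no weights change
```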

Performance gains in real-world tests

The team evaluated FlexOlmo using a mix of public data and seven specialized datasets: News, Creative Writing, Code, Academic Papers, Educational Text, Math, and Reddit content.

When tested on 31 tasks, FlexOlmo showed an average improvement of 41 percent over a model trained only on public data. In general benchmarks, FlexOlmo actually outperformed a hypothetical model that had access to all data with the same computational effort. Only a model trained on the entire dataset with double the resources did slightly better.

Bar chart: FlexOlmo outperforms the public-data-only model on four benchmarks, landing just below the 2×FLOPs upper bound.
FlexOlmo's architecture costs only minor performance on more general benchmarks. | Image: Ai2

Because data owners only share trained model weights, the risk of data leakage is minimal. In testing, attacks to recover training data succeeded just 0.7 percent of the time. For organizations with especially sensitive data, FlexOlmo supports differentially private training, which offers formal privacy guarantees. Each participant can enable this option independently. The Allen Institute has also released OLMoTrace, a tool for tracing language model outputs back to their training sources.
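
The article doesn't detail the mechanism, but the standard recipe behind differentially private training is DP-SGD (Abadi et al., 2016): clip each training example's gradient so no single record can dominate an update, then add calibrated Gaussian noise. Here is a minimal sketch of one such step, assuming a generic PyTorch model and loss function; this is not FlexOlmo's code.

```python
import torch

def dp_sgd_step(model, loss_fn, xs, ys, lr=0.1, clip_norm=1.0, noise_mult=1.0):
    """One illustrative DP-SGD update: per-example gradient clipping plus
    Gaussian noise (a sketch, not FlexOlmo's implementation)."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(xs, ys):
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        # Bound any single example's influence on the update.
        norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in params))
        scale = min(1.0, clip_norm / (norm.item() + 1e-6))
        for s, p in zip(summed, params):
            s.add_(p.grad, alpha=scale)
    with torch.no_grad():
        for s, p in zip(summed, params):
            noise = torch.randn_like(s) * noise_mult * clip_norm  # calibrated noise
            p.add_((s + noise) / len(xs), alpha=-lr)              # noisy mean step
```

The clipping norm and noise multiplier trade model quality against a formal (epsilon, delta) privacy budget, which is what lets each participant turn the option on independently.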

Summary
  • FlexOlmo enables collaborative training of language models by allowing data owners to keep their raw data private, using locally trained expert modules whose model weights are shared instead.
  • The system offers precise control over which data sources contribute to the model, so content can be enabled or disabled for different scenarios without needing to retrain the whole system.
  • In evaluations across 31 partly specialized tasks, FlexOlmo outperformed a model trained only on public data by an average of 41 percent, while the likelihood of training data being extracted from model weights was measured at just 0.7 percent.
Jonathan writes for THE DECODER about how AI tools can improve both work and creative projects.