
The Open Source Initiative (OSI) has released its first formal definition of what constitutes open-source AI.


The announcement came during the All Things Open 2024 conference, following "multiple years of research and collaboration, an international roadshow of workshops, and a year-long co-design process."

The definition sets clear requirements that many current AI models, including Meta's Llama, don't meet. Most notably, it requires AI model makers to provide enough detail about their training data that a "skilled person can recreate a substantially equivalent system using the same or similar data." This level of transparency goes well beyond what most AI companies currently offer, according to Mozilla AI strategy lead Ayah Bdeir.

At its core, the definition outlines essential freedoms that any open-source AI system must provide. Users need to be able to run the system for any purpose, examine how it works, make modifications, and share it with others. To enable this, companies must release complete information about training data, source code, and model parameters in a format that allows for modifications.


The new definition applies to both complete AI systems and individual components like models and weights, aiming to bring the traditional benefits of open source - autonomy, transparency, and collaborative improvement - to the AI field.

Meta's Llama models aren't open enough

The definition directly challenges Meta's claims about its Llama models. While Meta promotes itself as a champion of open AI development, its approach doesn't meet the OSI's criteria, a gap the OSI has repeatedly criticized in the past.

Meta releases its model weights but keeps training data private and places restrictions on commercial use - practices that conflict with fundamental open-source principles. The same is true for Google and its Gemma models.

Meta argues that the high costs and complexity of developing large language models require what it calls a "spectrum of openness." However, skeptics believe Meta may be attempting to exploit loopholes in regulations like the EU AI Act, which offers more lenient treatment for open-source models.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • The Open Source Initiative (OSI) has released its Open Source AI Definition (OSAID), which requires AI model makers to disclose detailed information about their training data, along with source code and model parameters, for a model to be considered open source.
  • To meet the new open-source criteria, AI models must provide sufficient information to enable a qualified person to build a substantially equivalent system.
  • Neither Meta's Llama nor Google's Gemma models qualify as open source under the new definition: the weights are released, but the training data is not, and the licenses restrict how the models can be used.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.