
OpenAI has launched a new initiative called OpenAI Data Partnerships. The goal is to build AI models that deeply understand all subjects, industries, cultures, and languages.


Large AI models acquire skills and knowledge of the world from the data they are trained on. To create an AGI that is safe and useful for all of humanity, AI models need a rich training dataset, OpenAI writes.

By incorporating diverse content, AI models could better understand specific domains, which is crucial for their practical applications.

Data diversity is crucial

OpenAI is already working with several partners, including the Icelandic government and the non-profit Free Law Project, which want data from their country or sector to be represented. The Free Law Project's goal is to improve access to legal knowledge.


OpenAI is particularly interested in large datasets that reflect human society and are not already easily accessible to the public. The data can be text, images, audio, or video. Of particular interest is data that expresses human intent, regardless of language, subject, or format.

There are currently two ways to work with OpenAI:

1. Open-source archive: The goal is to create a publicly available, open-source dataset for training language models. OpenAI will investigate how this dataset can be used to safely train other open-source models.

2. Private datasets: For organizations that want to keep their data private but still want AI models to better understand their domain, OpenAI prepares private datasets for training proprietary AI models, including base models and fine-tuned custom models. The company says it handles the data with the level of sensitivity and access controls desired by the partner.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • OpenAI has launched a new initiative called OpenAI Data Partnerships. The goal is to work with different organizations to create public and private datasets for training AI models.
  • The goal is AI that understands all subjects, industries, cultures, and languages. OpenAI has already partnered with the Icelandic government and the non-profit Free Law Project to improve AI's ability to speak Icelandic and democratize access to legal knowledge.
  • OpenAI offers two partnership options: contributing to an open-source dataset that is publicly available for training AI models, or having OpenAI prepare private datasets for training proprietary AI models, including fine-tuned custom models.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.