With Bloom, the BigScience project launches a new GPT-3 competitor that is much more than just another big language model.
Large language models are among the most successful AI technologies of recent years: U.S. companies such as OpenAI, Google, Nvidia, and Meta use them in their own products or sell access to their text capabilities.
There are also numerous large-scale language models in China from various companies. In March, for example, researchers at Alibaba Group unveiled a model with 1.93 trillion parameters. The BaGuaLu framework used for training theoretically allows AI models with up to 174 trillion parameters.
Companies in Israel and Europe also offer language models. Israeli AI startup AI21 Labs recently received $64 million to develop more AI models like Jurassic-1 Jumbo. German company Aleph Alpha launched Luminous and recently announced a collaboration with UK chipmaker Graphcore for further projects.
These giant language models often serve as the basis for customers’ own AI applications, which adapt the large model through fine-tuning with relatively little additional training. The underlying technologies are also used in multimodal systems such as DALL-E 2, Imagen and Parti.
EleutherAI, Hugging Face and Meta release open-source models
But models like OpenAI’s GPT-3 or Google’s LaMDA are well-kept secrets: their code isn’t freely available. Independent researchers have therefore been working for several years on open-source alternatives to open up usage of and research access to large-scale language models.
Pioneers include the research collective EleutherAI, which released the 20 billion-parameter GPT-NeoX-20B earlier this year, and AI startup Hugging Face, which enables the development, training and deployment of open-source AI models.
Arguably fueled by these successes, Meta released the 175 billion-parameter OPT-175B model in May – but only to researchers, and only on request. It was the largest open language model to date, albeit with limited access.
BigScience Bloom is open science and open source
Now there is a true open-source alternative to GPT-3: BigScience Bloom, which is freely available for research and enterprise purposes. Bloom has 176 billion parameters and was trained over 117 days at the supercomputing center of the French National Center for Scientific Research (CNRS).
The development involved more than 1,000 volunteer researchers, organized in the BigScience project, which was coordinated by Hugging Face and co-funded by the French government.
Bloom can be downloaded for free from Hugging Face and is said to be on par with GPT-3 in accuracy – and also in toxicity. A key difference from GPT-3 is a stronger focus on languages other than the otherwise dominant English.
Bloom can process 46 languages, including French, Vietnamese, Mandarin, Indonesian, Catalan, 13 Indic languages (such as Hindi) and 20 African languages. BigScience collected numerous new datasets for this purpose and is publishing full details on the datasets and on Bloom’s development and training.
The release falls under the Responsible AI License developed by BigScience, which prohibits the use of Bloom in areas such as law enforcement, healthcare, or deception. However, unlike OpenAI, for example, BigScience has no way to effectively prevent misuse, because the model is available for direct download rather than only through an API.
Bloom is now expected to serve as the foundation for numerous applications and, more importantly, for research projects that create alternative AI applications independent of the big tech companies.