
Update June 19, 2023:


The OpenLLaMA team has released a version of its OpenLLaMA model with 13 billion parameters. It is available on GitHub. The team notes that OpenLLaMA's tokenizer merges consecutive whitespace, which makes the current models unsuitable for code. A model suitable for code will follow.
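
To illustrate why whitespace handling matters for code, here is a minimal sketch of the failure mode (not the project's actual tokenizer, just an illustration): collapsing runs of spaces destroys the indentation that languages like Python depend on.

```python
import re

# Sketch of the failure mode, not OpenLLaMA's actual tokenizer: if runs of
# spaces are merged into one, Python's block structure cannot be recovered.
source = "def add(a, b):\n    return a + b"
collapsed = re.sub(r" {2,}", " ", source)

print(repr(collapsed))
# 'def add(a, b):\n return a + b' -- the 4-space indent is gone, so a model
# trained on text processed this way cannot reliably emit valid code.
```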

Original article from May 5, 2023:

OpenLLaMA is an open-source reproduction of Meta's LLaMA language model and can be used commercially.


Since the unveiling of Meta's LLaMA family of large language models and the subsequent leak, the development of open-source chatbots has exploded. Models such as Alpaca, Vicuna, and OpenAssistant use Meta's models as the basis for their various forms of instruction tuning.

However, LLaMA models are licensed for research use only, which prevents commercial use of those models.

OpenLLaMA reproduces Meta's language models

Alternatives based on other freely available models have not matched the quality of Meta's models, because LLaMA follows DeepMind's Chinchilla scaling laws and was trained on particularly large amounts of data.
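
As a rough illustration of what "Chinchilla-optimal" means, the commonly cited rule of thumb from the paper is about 20 training tokens per model parameter; the sketch below applies it to a 7-billion-parameter model. This is a back-of-the-envelope approximation, not Meta's exact training recipe.

```python
# Back-of-the-envelope sketch of the Chinchilla heuristic: roughly 20
# training tokens per model parameter is the commonly cited compute-optimal
# rule of thumb (an approximation for illustration only).
def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    return n_params * tokens_per_param

print(f"{chinchilla_optimal_tokens(7e9):.1e}")  # 1.4e+11, i.e. ~140 billion tokens
# LLaMA-7B was trained on 1 trillion tokens, several times past this point,
# which helps explain why the smaller LLaMA models perform so well.
```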

Researchers at Berkeley AI Research aim to replicate Meta's LLaMA models with the OpenLLaMA project, using Together's RedPajama dataset for training. Together itself announced its intention to reproduce the LLaMA models in April, releasing the 1.2-trillion-token dataset as a first step.

The Berkeley team is now releasing an early version of the 7-billion-parameter OpenLLaMA model, trained so far on 300 billion of the planned 1.2 trillion tokens. Its performance is already said to approach that of LLaMA, and the team is confident that the fully trained OpenLLaMA will be competitive with Meta's original.


OpenLLaMA also comes in a 3-billion-parameter version

In addition to the 7-billion-parameter model, the OpenLLaMA team is also training a 3-billion-parameter version to bring powerful language models to use cases with limited resources.

The team has no plans for larger models at this time. Together's LLaMA replica RedPajama is likewise limited to a 7-billion-parameter variant for the time being. That model is also still in training and should have passed the 500-billion-token mark by now.

Approaches like Alpaca's instruction tuning or OpenAssistant could be applied to the soon-to-be-available, fully trained OpenLLaMA or RedPajama models. That would make such chatbots available for commercial purposes as well, offering businesses a true open-source alternative to services like OpenAI's ChatGPT for the first time.

The first OpenLLaMA model is available on Hugging Face, with more information and code on GitHub.
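
For readers who want to experiment, here is a minimal sketch of loading a checkpoint with Hugging Face's transformers library. The repo id is an assumption, so check the project's Hugging Face page for the exact name of the current release.

```python
# Minimal sketch of loading an OpenLLaMA checkpoint with transformers.
# The repo id below is an assumption -- check the project's Hugging Face
# page for the current release. device_map="auto" requires the accelerate
# package to be installed.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

model_id = "openlm-research/open_llama_7b"  # assumed repo id

tokenizer = LlamaTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Q: What is the largest animal?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```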

Summary
  • OpenLLaMA is an open-source reproduction of Meta's LLaMA language models that allows commercial use.
  • Berkeley AI Research is releasing an early version of the 7-billion-parameter OpenLLaMA model, which approaches the performance of Meta's LLaMA models.
  • The OpenLLaMA team is also developing a 3-billion-parameter version for use cases with limited resources.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.