OpenLLaMA 13B released
Update June 19, 2023:
The OpenLLaMA team has released its OpenLLaMA model with 13 billion parameters. It is available on GitHub. The team notes that the tokenizer used for OpenLLaMA is not suitable for code. A model suitable for code will follow.
Original article from May 5, 2023:
OpenLLaMA is an open-source reproduction of Meta's LLaMA language model and can be used commercially.
Since the unveiling of Meta's LLaMA family of large language models and the subsequent leak, the development of open-source chatbots has exploded. Models such as Alpaca, Vicuna, and OpenAssistant use Meta's models as the basis for their various forms of instruction tuning.
However, LLaMA models are licensed for research use only, which prevents commercial use of those models.
OpenLLaMA reproduces Meta's language models
Alternatives based on other freely available models do not match the quality of Meta's models, as LLaMA follows DeepMind's Chinchilla scaling laws and has been trained on particularly large amounts of data.
Researchers at Berkeley AI Research aim to replicate Meta's LLaMA models in the OpenLLaMA project. The team is using Together's RedPajama dataset for the project. The open-source platform also announced its intention to reproduce the LLaMA models in April, releasing the 1.2-trillion-token dataset as a first step.
The Berkeley team is now releasing an early version of the 7-billion-parameter OpenLLaMA model, which has so far been trained on 300 billion of 1.2 trillion tokens. Performance is already said to be approaching the level of LLaMA, and the team is confident that the fully trained OpenLLaMA will be competitive with Meta's original.
As a part of our effort to replicate LLaMA in an open-source manner, we are pleased to announce the release of preview of the 7B OpenLLaMA model that has been trained with 200 billion tokens on the RedPajama dataset. https://t.co/jsMn9ZlaN0
— Hao Liu (@haoliuhl) May 2, 2023
OpenLLaMA also comes in a 3-billion-parameter version
In addition to the 7-billion-parameter model, the OpenLLaMA team is also training a 3-billion-parameter version to enable the use of powerful language models in use cases with limited resources.
The team has no plans for larger models at this time. Together's LLaMA replica RedPajama is likewise limited to the 7-billion-parameter variant for the time being. That model is also still in training and should have passed the 500-billion-token mark by now.
The Alpaca formula or OpenAssistant could move to the soon-to-be-available, fully trained OpenLLaMA or RedPajama models, making them usable for commercial purposes as well. That would offer businesses a true open-source alternative to services like OpenAI's ChatGPT for the first time.