Meta has released more details about the Llama 2 architecture, training efforts, approach to fine-tuning, and more "to enable the community to build on our work and contribute to the responsible development of LLMs," according to the company.
With its new large language model Llama 2, Meta positions itself as an open-source alternative to OpenAI. Microsoft is on board as a partner.
Llama 2 is now available free of charge for research and commercial use; the license covers products with up to 700 million monthly active users. The model comes in three sizes, with 7, 13, and 70 billion parameters, and was trained on 40 percent more data than Llama v1, according to Meta.
The context length, the maximum number of tokens the model can process at once (in effect, its short-term memory), is 4096 tokens, double that of its predecessor and on par with ChatGPT using GPT-3.5.
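To make the context limit concrete, here is a minimal sketch of how an application might trim a long conversation so that the prompt plus the planned reply fit into 4096 tokens. The function name `fit_to_context`, the parameter `max_new_tokens`, and the use of plain integers as stand-in token IDs are assumptions for illustration; they are not part of Meta's release, and a real application would count tokens with the model's own tokenizer.

```python
CONTEXT_LENGTH = 4096  # Llama 2 context length in tokens, as stated in the article


def fit_to_context(prompt_tokens, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Return the most recent prompt tokens so prompt + reply fit in the window.

    Hypothetical helper: prompt_tokens is any list of token IDs, and
    max_new_tokens is the number of tokens reserved for the model's answer.
    """
    if not 0 <= max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be in [0, context_length)")
    budget = context_length - max_new_tokens  # tokens left for the prompt
    # Keep the tail of the prompt: in a chat, the latest turns usually matter most.
    return prompt_tokens[-budget:]


if __name__ == "__main__":
    conversation = list(range(5000))  # placeholder token IDs, not real tokenizer output
    trimmed = fit_to_context(conversation, max_new_tokens=512)
    print(len(trimmed))
```

Dropping the oldest tokens is only one possible strategy; summarizing earlier turns or retrieving relevant passages are common alternatives when older context still matters.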
Compared to Llama v1 and other open-source models, Llama 2 shows better performance in all benchmarks. In the important Massive Multitask Language Understanding (MMLU) benchmark in particular, Llama 2 clearly outperforms its predecessor and the open-source competition.
Regarding closed-source models such as GPT-4 and PaLM-2, Meta itself speaks of "a large gap in performance". However, Llama 2 should reach the level of ChatGPT's GPT-3.5 in most cases.
For coding tasks, GPT-4 with Code Interpreter or specialized models like StarCoder should be ahead, according to the benchmarks.
"These models [Llama 2] have demonstrated their competitiveness with existing open-source chat models, as well as competency that is equivalent to some proprietary models on evaluation sets we examined, although they still lag behind other models like GPT-4."
From the paper
According to Meta, Llama 2 was trained using publicly available online data sources. The fine-tuned chat model, Llama-2-chat, uses publicly available training datasets and more than a million human annotations. OpenAI used the same method, reinforcement learning from human feedback (RLHF), to optimize ChatGPT.
Meta makes the models available for free download on the Llama website after you complete a registration form. Each download comes with the model code, weights, user manual, responsible use guide, acceptable use guidelines, model card, and license.
A free demo version of the chat model with 7 and 13 billion parameters is available on this website.
Meta partners with Microsoft
Somewhat surprisingly, Meta presents the Llama model together with Microsoft, the largest investor in OpenAI. Apparently, Microsoft wants to position itself in both the closed-source and open-source space and make the models available to enterprises through its Azure infrastructure. Meta also offers Llama through Amazon Web Services, Hugging Face, and other providers.
The two companies have a shared history of creating open AI ecosystems and supporting PyTorch, an AI framework co-developed by Meta, on Microsoft Azure, according to the model announcement.
The collaboration also aims to enable immersive experiences for the future of work and gaming in the metaverse. Microsoft first announced Office software for Meta's VR headsets last fall.
In addition, Meta emphasizes the importance of responsible use of AI and provides resources such as red-teaming exercises, a transparency scheme, a responsible use guide, and an acceptable use policy to ensure fair and responsible use of Llama 2.
Meta is also hedging its open-source bet with a series of endorsements from experts who welcome the release of the model, despite the risks. "Responsible and open innovation gives us all a stake in the AI development process, bringing visibility, scrutiny and trust to these technologies. Opening today’s Llama models will let everyone benefit from this technology," the statement reads.
Meta's AI chief Yann LeCun, one of the most renowned researchers in the field of artificial intelligence, celebrates the release of Llama 2 on Twitter, saying, "This is going to change the landscape of the LLM market."
The release confirms some rumors of recent weeks, including that Llama v2 would be available for commercial use and that it is expected to slow the growth of OpenAI. Meta could put itself in a strategically interesting position by leveraging the open-source movement for its AI ecosystem.