
Meta has released more details about the Llama 2 architecture, training efforts, approach to fine-tuning, and more "to enable the community to build on our work and contribute to the responsible development of LLMs," according to the company.


With its new large language model Llama 2, Meta positions itself as an open-source alternative to OpenAI. Microsoft is on board as a partner.

Llama 2 is now available free of charge for research and commercial use in products or services with up to 700 million monthly active users. The model comes in three sizes, with 7, 13, and 70 billion parameters, and was trained on 40 percent more data than Llama v1, according to Meta.


The context length, the maximum amount of text the model can process at once (its short-term memory, so to speak), is 4,096 tokens. That is double the context length of its predecessor and on par with ChatGPT using GPT-3.5.
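To make the 4,096-token limit concrete, here is a minimal sketch of how an application might trim a prompt so that the prompt plus the generated answer fits into the context window. The function name `fit_to_context`, the `max_new_tokens` budget of 256, and the strategy of keeping the most recent tokens are assumptions for illustration, not something Meta prescribes.

```python
CONTEXT_LENGTH = 4096  # Llama 2 context length in tokens, as stated in the article


def fit_to_context(token_ids, max_new_tokens=256, context_length=CONTEXT_LENGTH):
    """Keep only the most recent tokens so that prompt + generated tokens fit.

    Hypothetical helper: real applications may prefer summarizing or
    dropping older chat turns instead of cutting tokens off the front.
    """
    budget = context_length - max_new_tokens  # tokens left for the prompt
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than the context length")
    return token_ids[-budget:]


prompt = list(range(5000))       # stand-in for 5,000 token IDs
trimmed = fit_to_context(prompt)
print(len(trimmed))              # 3840 prompt tokens remain (4096 - 256)
```

Anything that does not fit into the window is invisible to the model, which is why a larger context length directly allows longer documents and conversations.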

Compared to Llama v1 and other open-source models, Llama 2 shows better performance in all benchmarks. In the important Massive Multitask Language Understanding (MMLU) benchmark in particular, Llama 2 clearly outperforms its predecessor and the open-source competition.

Llama 2 in the benchmark with open source models. | Image: Meta

Compared to closed-source models such as GPT-4 and PaLM-2, Meta itself speaks of "a large gap in performance". However, Llama 2 should reach the level of ChatGPT's GPT-3.5 in most cases.

GPT-4 and Google's PaLM are still ahead of Llama 2. | Image: Meta

For coding tasks, GPT-4 with Code Interpreter or specialized models like StarCoder should be ahead, according to the benchmarks.

"These models [Llama 2] have demonstrated their competitiveness with existing open-source chat models, as well as competency that is equivalent to some proprietary models on evaluation sets we examined, although they still lag behind other models like GPT-4."

From the paper

According to Meta, Llama 2 was trained using publicly available online data sources. The fine-tuned chat model, Llama-2-chat, uses publicly available training datasets and more than a million human annotations. OpenAI used the same method, Reinforcement Learning from Human Feedback (RLHF), to optimize ChatGPT.

Meta's RLHF process: the chat model was refined using human feedback. Using this method, OpenAI has made ChatGPT a successful product. | Image: Meta
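The article does not go into how the human annotations are turned into a training signal. As a rough illustration, RLHF pipelines commonly train a reward model on pairs of answers where annotators preferred one over the other, using a pairwise ranking loss. The sketch below shows that generic loss; it is not necessarily Meta's exact formulation, and the function name `preference_loss` is made up for this example.

```python
import math


def preference_loss(reward_chosen, reward_rejected):
    """Generic pairwise ranking loss: -log(sigmoid(r_chosen - r_rejected)).

    The loss is small when the reward model scores the human-preferred
    answer higher than the rejected one, and large when it does not.
    """
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(diff)) equals log(1 + exp(-diff)); branch for numerical stability
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


# Reward model agrees with the annotator: low loss
print(preference_loss(2.0, 0.0))
# Reward model disagrees with the annotator: high loss
print(preference_loss(0.0, 2.0))
```

The trained reward model then scores new answers, and the chat model is optimized to produce answers that score highly.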

Meta makes the models available for free download on the Llama website after you complete a registration form. Each download comes with the model code, weights, user manual, responsible use guide, acceptable use guidelines, model card, and license.

A free demo version of the chat model with 7 and 13 billion parameters is available on this website.


Meta partners with Microsoft

Somewhat surprisingly, Meta presents the Llama model together with Microsoft, the largest investor in OpenAI. Apparently, Microsoft wants to position itself in both the closed-source and open-source space and make the models available to enterprises through its Azure infrastructure. Meta also offers Llama through Amazon Web Services, Hugging Face, and other providers.

According to the model announcement, the two companies have a shared history of creating open AI ecosystems and of supporting PyTorch, an AI framework co-developed by Meta, on Microsoft Azure.

The collaboration also aims to enable immersive experiences for the future of work and gaming in the metaverse. Microsoft first announced Office software for Meta's VR headsets last fall.

In addition, Meta emphasizes the importance of responsible use of AI and provides resources such as red-teaming exercises, a transparency scheme, a responsible use guide, and an acceptable use policy to ensure fair and responsible use of Llama 2.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

Meta is also hedging its open-source bet with a series of endorsements from experts who welcome the release of the model, despite the risks. "Responsible and open innovation gives us all a stake in the AI development process, bringing visibility, scrutiny and trust to these technologies. Opening today’s Llama models will let everyone benefit from this technology," the statement reads.

Meta's AI chief Yann LeCun, one of the most renowned researchers in the field of artificial intelligence, celebrates the release of Llama 2 on Twitter, saying, "This is going to change the landscape of the LLM market."

The release confirms some rumors of recent weeks, including that Llama v2 can be used commercially and that it is expected to slow the growth of OpenAI. Meta could put itself in a strategically interesting position by leveraging the open-source movement for its AI ecosystem.

  • Meta and Microsoft jointly introduce Llama 2, a powerful next-generation open-source AI model to drive innovation and safety in AI. The model can be used commercially.
  • Llama 2 will be available through multiple providers, including the Azure AI Model Catalog, Amazon Web Services, and Hugging Face. It can also be easily downloaded.
  • Meta emphasizes the responsible use of AI and provides resources and guidelines for the safe and fair use of Llama 2.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.