Update
  • Added information from the London event

Update from April 10, 2024:

Meta chief lobbyist Nick Clegg confirmed at a Meta AI event in London that Llama 3 will be released soon, without giving an exact date. Meta plans to release several different models with different capabilities throughout the year, Clegg said.

Meta intends to integrate Llama 3 models into many of its services, including WhatsApp and Ray-Ban smart glasses. Eventually, AI agents should be able to perform specific tasks, such as booking a trip, in addition to answering questions, Clegg said. The company aims to make Meta AI, which uses the Llama models, the most useful AI assistant available.

"We will be talking to these AI assistants all the time," said Yann LeCun, Meta's chief AI researcher. "Our entire digital diet will be mediated by AI systems."

LeCun reiterated that significant progress in the reasoning abilities of large language models requires a scientific breakthrough. Such a breakthrough would allow a model to select the best answer from among possible responses and to develop a mental model of the consequences of its actions.

Original article from April 9, 2024:

Meta to release two smaller versions of its Llama 3 open-source model next week

According to a Meta employee, the company will release two smaller versions of its upcoming large language model Llama 3 next week, The Information reported.

The smaller models are intended to build anticipation for the larger version of Llama 3, which is scheduled for release this summer, about one year after Llama 2 launched in the summer of 2023.

Since then, competition in the open-source market has intensified significantly. More and more model developers are trying to attract attention by making their models, or a selection of them, available as open source.

In addition to the Llama models, the French model startup Mistral has made a name for itself in the open-source scene with the Mistral 7B and the Mixtral MoE model. Google DeepMind recently entered the open-source market with its Gemma models.

Big tech companies investing in open-source AI hope to make their system a standard for thousands or millions of AI apps, similar to what Google has done with Android.

Llama 3 becomes multimodal

The two smaller models will specialize in text generation. The full Llama 3, planned for the summer, will be multimodal and will also be able to generate images or answer questions about images.

Meta hopes that Llama 3 will catch up to OpenAI's GPT-4. With about 140 billion parameters, the largest version of Llama 3 could be twice as large as Llama 2.

However, the number of parameters only gives a limited indication of the quality of the model. With 314 billion parameters, Elon Musk's Grok-1 is currently the largest open-source mixture-of-experts model.

Yet its performance is only on par with OpenAI's GPT-3.5 or Mistral's much smaller Mixtral 8x7B model, which has about 47 billion parameters in total. Mistral's 7B model has also beaten larger Llama models.

Content changes are also likely: In recent months, the Meta team has worked to make Llama 3 more open to answering controversial questions. Meta leaders felt that the answers in Llama 2 were too cautious. Llama 3 could be more responsive to the user and provide more context for difficult questions.

Meta has recently invested heavily in AI and is one of the top customers for Nvidia's graphics chips. Meta CEO Mark Zuckerberg plans to have about 600,000 graphics cards in use for AI training by the end of the year. Meta is also developing its own AI chip, Artemis.

Summary
  • Meta will soon begin releasing its Llama 3 language model. Several releases are planned throughout the year. One focus will be on AI assistants that can also perform actions.
  • Unlike previous versions that specialized in text generation, the full Llama 3 will be multimodal. With about 140 billion parameters, its largest version could be twice the size of its predecessor. It is also expected to deal more openly with controversial topics.
  • The open-source scene for language models has grown considerably since the release of Llama 2 about a year ago. In addition to Meta, Mistral, and Google DeepMind, many smaller model builders have entered the market.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.