
Meta’s next language model, Llama 4, is reportedly nearing release after repeated delays and significant changes to its development process.

According to The Information, the launch is now planned for the end of the month—assuming there are no further setbacks. Initial versions of the model reportedly struggled with benchmarks in logical reasoning, mathematics, and natural-sounding dialogue. These shortcomings led Meta to revise both the technical architecture and project management approach for Llama 4.

Building an enterprise API to expand commercial use

To support broader enterprise adoption, Meta is developing its own application programming interface (API) for Llama. The internal project, known as "Llama X," is led by Chief Strategy Officer David Wehner. Alongside the API, Meta plans to build a dedicated team across engineering, sales, and marketing to support its business clients directly.

Until now, developers have used Llama either by running the open-source model themselves or accessing it through third-party providers like AWS. Meta’s own API would mark a shift toward establishing a standalone commercial offering built on its language models.

Meta plans to invest up to $65 billion in AI infrastructure this year. The Information previously reported that a separate $200 billion data center project is also under consideration.

Possible shift in open-source release strategy

Meta may also revise its approach to open-source distribution. According to The Information, the company is considering integrating Llama 4 into Meta AI first and releasing it as open source at a later stage. This would depart from its previous practice of making new Llama models available as open source at launch.

The shift could help increase usage of Meta's own AI tools but might also weaken its standing in the open-source community, where it has positioned itself as a leading contributor. With companies like Deepseek and OpenAI expressing renewed interest in open models, Meta may face greater pressure to maintain its role.

Adopting a new architecture influenced by Deepseek

Technically, Llama 4 introduces a shift toward a "Mixture of Experts" (MoE) architecture for at least one version of the model. This design activates only the components relevant to a given task, offering potential efficiency gains over traditional dense models.
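The routing idea behind MoE can be illustrated with a minimal sketch. This is not Meta's implementation; the gate, experts, and sizes below are toy assumptions chosen only to show how a gating network sends each input to a small subset of experts while the rest stay inactive.

```python
import math

# Toy "experts": each stands in for a feed-forward block in a real MoE layer.
# These functions are illustrative placeholders, not Llama 4's architecture.
EXPERTS = [
    lambda x: [2.0 * v for v in x],   # expert 0: scale
    lambda x: [v + 1.0 for v in x],   # expert 1: shift
    lambda x: [-v for v in x],        # expert 2: negate
    lambda x: [v * v for v in x],     # expert 3: square
]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, gate_weights, top_k=2):
    """Route input x to the top_k highest-scoring experts and combine
    their outputs, weighted by the renormalized gate probabilities."""
    # Gate: one score per expert (here a simple dot product with x).
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)
    # Only the top_k experts run; the others are skipped entirely,
    # which is where the efficiency gain over a dense model comes from.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    out = [0.0] * len(x)
    for i in chosen:
        weight = probs[i] / norm
        expert_out = EXPERTS[i](x)
        out = [o + weight * e for o, e in zip(out, expert_out)]
    return out, chosen

# Example: with this toy gate, only 2 of the 4 experts run for this input.
gate = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, active = moe_forward([1.0, 2.0], gate, top_k=2)
```

In a transformer, the same routing happens per token inside each MoE layer, so total parameter count can grow far beyond what any single forward pass actually computes.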

The decision to adopt MoE was subject to more than a year of internal debate. According to The Information, a key turning point came when the Chinese startup Deepseek demonstrated strong performance with relatively limited resources using this approach.

In response, Meta has reportedly set up several "war rooms" to study Deepseek's techniques, with teams analyzing its low-cost training methods and data collection strategies. The goal is to improve Llama's efficiency and competitiveness as new entrants continue to emerge.

Summary
  • Meta's Llama 4 launch has been delayed multiple times because of underwhelming performance in areas such as logical reasoning, math, and voice interactions, as well as competitive pressure from Deepseek; the release is now slated for the end of the month.
  • Meta is developing its own API for enterprise clients through the "Llama X" project, along with a dedicated technical and sales team, aiming to diversify revenue streams and reduce reliance on cloud services.
  • Despite internal debates, Meta has decided to implement the "Mixture of Experts" architecture in at least one model within Llama 4, a move that was accelerated by competitive pressure from Chinese startup Deepseek.
Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.