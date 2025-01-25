AI in practice
Deepseek puts pressure on Meta with open source AI models at a fraction of the cost

Midjourney prompted by THE DECODER
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.

In recent weeks, Chinese AI startup Deepseek has shown that cutting-edge AI development doesn't require massive budgets, putting pressure on established AI labs. Meta CEO Mark Zuckerberg is doubling down on AI investments.

Deepseek's latest model shows just how efficient AI development can be. Their Deepseek-V3 language model performs on par with the world's leading AI systems, but cost just $5.6 million to train - a tiny fraction of what larger companies typically spend.

Deepseek-V3 needed only 2.78 million GPU hours of training time, while Meta's smaller Llama-3 model (with 405 billion parameters) required about eleven times that amount. The company followed up with Deepseek-R1, a reasoning model that matches OpenAI's o1. Meta has yet to release a reasoning model of its own.
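The figures above can be put in relation with a little arithmetic. The following Python sketch uses only the numbers quoted in this article ($5.6 million, 2.78 million GPU hours, "about eleven times"). The derived values are back-of-the-envelope estimates, not figures reported by Deepseek or Meta, and pricing Llama-3's training at Deepseek's implied hourly rate is purely a hypothetical assumption for comparison.

```python
# Figures quoted in the article.
V3_TRAINING_COST_USD = 5.6e6   # reported training cost of Deepseek-V3
V3_GPU_HOURS = 2.78e6          # reported GPU hours for Deepseek-V3
LLAMA_GPU_HOUR_MULTIPLIER = 11 # "about eleven times that amount"


def implied_cost_per_gpu_hour(total_cost_usd: float, gpu_hours: float) -> float:
    """Average cost per GPU hour implied by a total cost and an hour count."""
    return total_cost_usd / gpu_hours


# Implied average price of one GPU hour in Deepseek's calculation.
rate = implied_cost_per_gpu_hour(V3_TRAINING_COST_USD, V3_GPU_HOURS)

# Rough GPU-hour estimate for Llama-3 405B, derived only from the multiplier.
llama_gpu_hours = V3_GPU_HOURS * LLAMA_GPU_HOUR_MULTIPLIER

# Hypothetical assumption: Llama-3's hours priced at the same hourly rate.
llama_cost_at_same_rate = rate * llama_gpu_hours

print(f"Implied rate: ${rate:.2f} per GPU hour")
print(f"Llama-3 405B estimate: {llama_gpu_hours / 1e6:.1f} million GPU hours")
print(f"Cost at the same rate: ${llama_cost_at_same_rate / 1e6:.1f} million")
```

Under these assumptions, the implied rate is roughly two dollars per GPU hour, and the Llama-3 estimate comes to about 30 million GPU hours.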

Meta responds with major expansion plans

Zuckerberg took to Facebook recently to outline his company's response. In 2025, Meta aims to develop an AI assistant that can serve more than a billion people, upgrade Llama 4 to compete with the best models available, and create an "AI engineer" to help with its research and development. "This will be a defining year for AI," Zuckerberg wrote.

To support these goals, Meta is building a massive data center that will use more than two gigawatts of power. The company plans to bring online about one gigawatt of computing power and over 1.3 million GPUs in 2025 alone, backed by investments of $60-65 billion and significant team expansion.

Screenshot of a Facebook post by Meta CEO Mark Zuckerberg. The size of the planned Meta data center overlaid on a map of Manhattan shows its massive proportions.
The sheer size of Meta's planned 2GW+ data center is illustrated by a comparison with Manhattan. | Image: via Facebook

Meta's chief AI scientist Yann LeCun sees Deepseek's success as a win for open source rather than a sign of Chinese dominance. He points out that Deepseek built on openly available research and profited from it, but also contributed new ideas that others can build on. "This is the power of open research and open source," LeCun says. He had praised the V3 model as "excellent" when it launched in late 2024.

Screenshot: Yann LeCun's Threads post about DeepSeek's AI developments and their impact on the market
Meta's chief AI scientist, Yann LeCun, comments on the developments at DeepSeek on Threads. His reaction underlines the growing importance of Chinese AI companies in the global competition. | Image: via Threads

"It started with Deepseek-V3"

According to an anonymous post on Teamblind, a forum for verified Big Tech employees, Meta's AI department is feeling the pressure. The post claims Deepseek-V3 has already outperformed Meta's unreleased Llama-4 in benchmarks. This has raised concerns about the department's high operating costs, given that a relatively unknown Chinese company can achieve better results on such a tight budget. The post points out that a single department head's salary exceeds Deepseek's entire training budget. Deepseek's R1 reasoning model is causing even more headaches for the team.

The post suggests Meta's engineers are working frantically to analyze and adopt Deepseek's technology. It criticizes how Meta's AI division, originally meant to be small and technically focused, has grown bloated as employees rushed to join the AI trend.

Screenshot of a social media post about Meta's internal reaction to DeepSeek's AI progress and cost pressure.
An anonymous post describes an allegedly tense situation in Meta's AI department following the success of DeepSeek. | Image: via Teamblind

Zuckerberg's and LeCun's public statements appeared at almost the same time, which suggests that Meta decided internally to respond indirectly to these rumors and the social media conversations they have sparked.

  • According to an anonymous post on Teamblind, Deepseek-V3 has outperformed Meta's unreleased Llama-4 model in benchmarks, and a single department head at Meta reportedly earns more than the total training cost of Deepseek-V3.
  • Meta CEO Mark Zuckerberg has announced plans for a new data center that will use more than two gigawatts of power. In 2025, the company aims to bring more than 1.3 million GPUs online, backed by investments of $60 to $65 billion.
  • Meta's chief AI scientist Yann LeCun believes Deepseek's success demonstrates the strength of open source, as the company has benefited from open research and developed new ideas based on it.
Sources
LeCun via Threads | Zuckerberg via Facebook
