
Leading AI companies are changing course. Instead of developing ever-larger language models, they are focusing on test-time compute, which applies more processing power during inference rather than during initial training.


Three sources close to the situation tell Reuters that major AI labs are running into walls. Training these massive LLMs costs tens of millions of dollars, and the complex systems often break down. It can take months just to know if a model works as intended.

The slowdown seems to be hitting everyone. The Information recently reported that OpenAI's next big model, Orion, is barely improving on GPT-4o. Google is reportedly struggling with similar issues on Gemini 2.0, while Anthropic is rumored to have paused work on its Opus 3.5 model. (Update: Anthropic CEO Dario Amodei says, "the aim here is to shift the curve and then at some point there's going to be an Opus 3.5.")

"The 2010s were the age of scaling, now we're back in the age of wonder and discovery once again. Everyone is looking for the next thing," says OpenAI co-founder Ilya Sutskever, who now runs his own AI lab, Safe Superintelligence (SSI). Sutskever stresses that what's important now is to "scale the right thing."


This is quite a turn for Sutskever, who once pushed the "bigger is better" approach that defined OpenAI's GPT models. At SSI's recent funding round, he said he wanted to try a different approach to scaling than OpenAI.

"Everyone just says scaling hypothesis. Everyone neglects to ask, what are we scaling?" Sutskever said.

Having left OpenAI in May, he is likely aware of OpenAI's latest model, o1, which follows the new scaling paradigm - unless plans have changed since his departure.

AI labs try new approaches

AI labs are now looking at test-time compute, giving models more time to work through problems. The goal is to create AI systems that don't just calculate probabilities but think through problems step by step. Instead of quick answers, these models generate several solutions, evaluate them, and pick the best one.
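The generate-evaluate-select loop described above is often called best-of-n sampling. The sketch below is a toy illustration of that pattern, not any lab's actual implementation: the candidate generator and the scoring function are hypothetical stand-ins for a language model and a verifier or reward model.

```python
def generate_candidates(prompt: str, n: int = 5) -> list[str]:
    """Toy stand-in for sampling n candidate solutions from a model."""
    return [f"{prompt} -> candidate {i}" for i in range(n)]

def score(candidate: str) -> int:
    """Toy stand-in for a verifier or reward model.

    Here we simply score by the trailing candidate index; a real system
    would evaluate correctness or quality of the reasoning.
    """
    return int(candidate.rsplit(" ", 1)[-1])

def best_of_n(prompt: str, n: int = 5) -> str:
    """Spend extra inference-time compute: sample n solutions, keep the best."""
    candidates = generate_candidates(prompt, n)
    return max(candidates, key=score)

print(best_of_n("solve 2+2", n=5))  # prints "solve 2+2 -> candidate 4"
```

The trade-off is the same one the labs are weighing: each extra candidate costs another inference pass, so quality improves at the price of more compute per query rather than more compute at training time.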

OpenAI CEO Sam Altman said in early November that his company would focus on its new o1 model and its successors. Reuters reports that other major labs such as Anthropic, xAI, Meta, and Google DeepMind are trying similar methods.


Conventional language model development may continue even with smaller gains, because companies may end up using both approaches to balance cost and capability. For example, OpenAI's o1 does math better, while GPT-4 writes text more efficiently.

This shift might shake up Nvidia's control of AI hardware. While Nvidia dominates in graphics cards for training large language models, the move to test-time compute creates room for other chipmakers. Companies like Groq are making specialized chips for these tasks, though Nvidia's products still work well here too.

Summary
  • AI companies are moving away from building bigger language models, focusing instead on "test-time compute" as traditional pre-training scaling methods reach their limits.
  • The test-time compute approach gives AI models extra processing time to generate multiple solutions, evaluate them systematically, and pick the best one.
  • This shift could affect Nvidia's dominant position in the graphics card market due to increased competition in inference chips, though Nvidia's products remain capable of test-time compute tasks.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.