Meta's chief AI scientist Yann LeCun is pushing back against recent AGI hype, arguing that human-like artificial intelligence isn't coming anytime soon. According to LeCun, the path to true AI doesn't run through large language models alone, but requires a combination of sensory learning, emotions, and the ability to model and reason about the world.
Speaking at a Johns Hopkins Bloomberg Center panel, LeCun explained that while artificial general intelligence (AGI) may not be centuries or even decades away, we're still at least several years from achieving it.
His comments come as some in the AI community suggest OpenAI's latest o3 model might qualify as a baby AGI, pointing to its impressive performance on math and programming tests and its potential to scale even further.
The conversation is particularly tricky because experts can't agree on what AGI actually means. Some define it as human-like, flexible intelligence, while companies like OpenAI consider it the point where AI can handle most human jobs.
Like many these days, LeCun believes that large language models (LLMs) are hitting a wall because AI labs are running out of natural text data to train them on - a position he's held for years.
LeCun notes that even strong advocates of the text-to-AGI approach are changing their minds, pointing to OpenAI's former chief scientist Ilya Sutskever, who once championed scaling up text models as the path to general AI but has apparently shifted his position. "Ilya was famously a believer in that concept and now apparently not anymore," LeCun says.
The path to smarter AI
LeCun argues that human-like AI can't be achieved through text training alone; systems also need sensory input. He points to the current lack of capable household robots as evidence, suggesting the bottleneck isn't the robotics hardware but the AI's limited intelligence, a problem he aims to address with his research project V-JEPA.
To put things in perspective, LeCun explains that a four-year-old child takes in roughly as much data through vision during its 16,000 waking hours as the largest language models are trained on in text. To close this gap, Meta is gathering video data that shows how objects and environments interact, as training material for future AI systems.
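A rough back-of-envelope calculation makes the comparison concrete. The bandwidth and token figures below are illustrative assumptions chosen for order of magnitude (LeCun has used similar ballpark numbers in talks); they are not reported in this article:

```python
# Back-of-envelope comparison: a young child's visual input vs. LLM text data.
# All constants are illustrative assumptions, not figures from the article.

WAKING_HOURS = 16_000            # ~4 years of waking life, per LeCun's example
OPTIC_NERVE_BYTES_PER_SEC = 1e6  # assumed ~1 MB/s per eye (order of magnitude)
EYES = 2

visual_bytes = WAKING_HOURS * 3600 * OPTIC_NERVE_BYTES_PER_SEC * EYES
print(f"Child's visual input: ~{visual_bytes:.1e} bytes")   # ~1.2e14 bytes

# The largest LLM training sets are on the order of 10^13 tokens;
# at a few bytes per token, that is roughly 10^13-10^14 bytes of text.
LLM_TOKENS = 2e13                # assumed training-set size (order of magnitude)
BYTES_PER_TOKEN = 4              # rough average for English text
text_bytes = LLM_TOKENS * BYTES_PER_TOKEN
print(f"LLM text input:       ~{text_bytes:.1e} bytes")     # ~8.0e13 bytes
```

Under these assumptions, the two volumes land within an order of magnitude of each other, with the difference that the child reaches that total in just four years, grounded in the physical world.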
LeCun also believes future AI systems will need emotions to set goals and understand consequences. According to LeCun, this emotional component isn't optional, but an "inseparable component of their design." It's a key part of Meta's AI vision for the "next few years," along with the ability to model the world, reason, and plan ahead, LeCun explains.
As for AI-generated misinformation, LeCun takes a less concerned stance. He argues that AI isn't enabling hate speech and disinformation but rather offers the best tools to counter them - as long as those fighting misinformation have access to more sophisticated AI systems than those creating it.
LeCun finds OpenAI's earlier warnings about GPT-2 particularly telling. When the company initially withheld the full model in 2019 over concerns about dangerous fake news, he considered the reaction "ridiculous" given the model's modest capabilities. He points out that similar and far more powerful systems have been freely available for years without causing the predicted problems.