Large AI language models are among the greatest recent achievements of AI research. Meta's AI chief Yann LeCun describes their limitations - and they are not technical.
When the first users gained access to OpenAI's powerful text AI GPT-3, a wave of app building took off - along with a certain mystery: what knowledge, what abilities might be concealed in those 96 layers and 175 billion parameters? Could there be more hidden in the depths of the model than simple sentence completion? A deeper understanding of the world, perhaps the key to machine common sense and thus to human-like AI?
GPT-3 was the initial spark for the spread of large language models
The introduction of GPT-3 spurred the development of more advanced language models. Since then, numerous other models have appeared, some of them even larger and more powerful, and they offer advantages in a growing number of application scenarios, many of which go beyond direct text generation.
For example, the language understanding of large language models underpins the graphics revolution currently underway with DALL-E, Stable Diffusion and the like, and it aids the development of robots that can be used in everyday life.
From today's perspective, language models have not yet made a clear contribution on the path to human-like AI. They do produce intelligible text, so credibly that former Google developer Blake Lemoine claimed a Google chatbot was conscious. But they do not understand what they write.
Doomed to shallowness
In a joint essay with AI researcher Jake Browning, Meta's AI chief Yann LeCun describes why he believes large AI language models cannot lead the way to human-like AI.
The two scientists argue that language captures only a small portion of human knowledge. Much of that knowledge, like the knowledge of animals, exists in neither verbal nor symbolic form, they say. Large language models therefore cannot come close to human intelligence, "even if trained from now until the heat death of the universe."
The limitation, therefore, is not artificial intelligence but "the limited nature of language," the researchers write. Today's AI language systems are impressive, but "doomed to a shallow understanding that will never approximate the full-bodied thinking we see in humans."
Through training on language data, an AI acquires a small portion of human knowledge through a tiny bottleneck, according to the researchers. Language models therefore resemble a mirror that gives the illusion of depth and can reflect almost anything, but in reality is only a few centimeters thick. "If we try to explore its depths, we bump our heads," the researchers write.
AI is not the problem - it's language
Any form of language, they say, is a highly compressed and "highly specific, and deeply limited, kind of knowledge representation." Human understanding of language, moreover, often depends on a deeper grasp of the context in which a sentence or paragraph is embedded.
Understanding is shaped, for example, by a shared perception of situations or by knowledge of social roles. Research on children's text comprehension has shown that background knowledge about the topic of a text plays a crucial role in comprehension, the researchers say.
"Abandoning the view that all knowledge is linguistic permits us to realize how much of our knowledge is nonlinguistic," the researchers write, citing as an example an IKEA instruction manual that shows only illustrations and does without textual instructions.
In the quest for machine common sense, researchers will therefore have to think about systems that focus on the world itself - not on the words used to describe it.
In early March, LeCun proposed an AI architecture made up of several modules modeled on the human brain. At its heart is a world model module, designed to learn abstract representations of the world, ignore unimportant details, and make predictions about the world - just as humans constantly do.
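To make the idea of a world model module more concrete, here is a minimal toy sketch in Python (PyTorch): an encoder compresses observations into abstract representations, and a predictor forecasts the representation of the next observation, so prediction happens in representation space rather than on raw inputs. All module names, layer sizes, and the training step are illustrative assumptions for this sketch, not LeCun's actual architecture.

```python
# Illustrative sketch of a "world model"-style module (assumed design):
# an encoder maps observations to abstract representations, and a predictor
# forecasts the next representation given the current one and an action.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps a raw observation to a compact abstract representation."""
    def __init__(self, obs_dim: int, repr_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, repr_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

class Predictor(nn.Module):
    """Predicts the next abstract representation from state and action."""
    def __init__(self, repr_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(repr_dim + action_dim, 128), nn.ReLU(),
            nn.Linear(128, repr_dim),
        )

    def forward(self, state_repr: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state_repr, action], dim=-1))

# Toy training step: the prediction target is the *representation* of the next
# observation, so details the encoder discards never need to be reconstructed.
obs_dim, action_dim, repr_dim = 32, 4, 16
encoder, predictor = Encoder(obs_dim, repr_dim), Predictor(repr_dim, action_dim)
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(predictor.parameters()), lr=1e-3
)

obs_t = torch.randn(8, obs_dim)        # current observations (random stand-ins)
action_t = torch.randn(8, action_dim)  # actions taken
obs_next = torch.randn(8, obs_dim)     # observations that followed

pred_next = predictor(encoder(obs_t), action_t)
with torch.no_grad():                  # target is an abstract representation
    target_next = encoder(obs_next)
loss = nn.functional.mse_loss(pred_next, target_next)

optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"prediction loss: {loss.item():.4f}")
```

The design choice the sketch tries to convey is the one the article describes: by predicting in an abstract representation space instead of predicting raw inputs, such a module can ignore unimportant details of the world while still anticipating what happens next.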