A research team from Meta and several French clinics has studied how children develop neural language representations, and found that AI models go through learning stages similar to those of the human brain.
Children need only a few million words to acquire language, but the brain mechanisms behind this process are still not well understood. A new study by Meta AI and the Rothschild Hospital in Paris sheds light on how language representations form in the brain, revealing striking parallels to large AI language models.
The researchers examined brain activity in 46 French-speaking participants aged 2 to 46, all of whom had electrodes implanted for epilepsy treatment. While the participants listened to an audiobook of "The Little Prince," neural activity was recorded from over 7,400 electrodes. The goal was to track how language processing develops in the brain.
The results showed that even toddlers between two and five years old displayed clear responses to speech sounds such as "b" or "k." These responses occurred in specific auditory centers of the brain and followed a distinct time pattern. Processing of whole words, including their meaning and grammar, was only observed in older children and in more advanced brain regions.
As children grow, these language processing patterns spread across larger areas of the brain. Responses to words start earlier, last longer, and become more pronounced—a sign that language processing becomes more complex with age.
AI models learn language similarly to the human brain
To better understand how these representations develop, the team compared the neural data to the activations of two language models: wav2vec 2.0, an AI model that learns speech features from audio, and the large language model Llama 3.1. Both models were examined before and after training.
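Such brain–model comparisons are typically done with an "encoding model": a linear map is fitted from the model's activations to the recorded neural signals, and the correlation on held-out data serves as a "brain score." The study's actual pipeline is more involved; the sketch below is a minimal, hypothetical illustration using simulated data and closed-form ridge regression, with all shapes and names chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-ins: model activations (time x features) and neural
# recordings (time x electrodes), aligned to the same audio stimulus.
# In a real analysis, activations would come from wav2vec 2.0 or an LLM.
n_samples, n_features, n_electrodes = 1000, 64, 8
activations = rng.standard_normal((n_samples, n_features))
true_map = rng.standard_normal((n_features, n_electrodes))
neural = activations @ true_map + 0.5 * rng.standard_normal((n_samples, n_electrodes))

# Split along time into train and held-out test segments.
X_tr, X_te = activations[:800], activations[800:]
Y_tr, Y_te = neural[:800], neural[800:]

# Closed-form ridge regression: one linear map from activations
# to all electrodes at once.
alpha = 1.0
W = np.linalg.solve(X_tr.T @ X_tr + alpha * np.eye(n_features), X_tr.T @ Y_tr)
pred = X_te @ W

# "Brain score": per-electrode Pearson correlation between predicted
# and actual neural activity on the held-out segment.
scores = [np.corrcoef(pred[:, i], Y_te[:, i])[0, 1] for i in range(n_electrodes)]
print(f"mean brain score: {np.mean(scores):.3f}")
```

Running the same scoring once with an untrained model's activations and once with a trained model's activations is what lets researchers say that training makes a model's representations more brain-like.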
After training, the models' responses more closely resembled those in the human brain. Wav2vec, which learned from raw audio, developed a stepwise processing pattern—starting with simple sounds, then moving to more complex meanings. Llama 3.1, on the other hand, processed whole words from the start, similar to the brains of older children and adults.
The team found that Llama 3.1-style representations only appear in the brains of older children and adults—not in toddlers aged 2 to 5, who instead resemble the early, untrained state of an AI model. Only after more exposure to language do LLM-like activations appear in the brain.
According to the study's authors—including Meta's Jean-Rémi King—the development of language processing in the brain and the maturation of language models through training show structural similarities. Both biological and artificial systems seem to build a comparable hierarchy of language representations, though LLMs require far more data.
Biology is still more efficient, but AI helps us understand it
Despite these parallels, there are clear differences. Children acquire language with just a few million words, while LLMs need billions. Many cognitive abilities—like understanding syntactic dependencies or semantic nuances—remain out of reach for AI.
Still, the findings suggest that AI models can help researchers better study how language develops in the human brain. They offer a new way to trace language processing across age groups and compare the inner workings of biological and artificial systems.
One important limitation: children under age two could not be included for medical reasons, even though these earliest years are especially important for language development.