Large language models are powerful imitators, but not innovators

According to a psychology study, large AI models such as OpenAI's ChatGPT are strong imitators of human content, but they are not innovative.

In their study, published in the journal Perspectives on Psychological Science, researchers Eunice Yiu, Eliza Kosoy, and Alison Gopnik emphasize that LLMs efficiently mimic existing human content by extracting patterns from large amounts of data and generating new responses.

LLMs are powerful imitators, they say, comparable to historical technologies such as language, writing, printing, libraries and the Internet that have fostered cultural transfer and evolution.

However, LLMs are not geared toward truth-seeking, as human perceptual and action systems are. Instead, they extract and pass on existing knowledge, often without a deep understanding of the causal or relational aspects of the information being processed.

Limitations of AI innovation

The study also explores the concept of innovation, suggesting that LLMs lack the capacity for innovation found in young children. The authors use tool use and tool innovation as a point of comparison.

They found that while AI models could replicate familiar tool uses, they had difficulty in tasks that required inventing new tool uses or discovering new causal relationships.

In tests of tool use and causal reasoning, AI models had more difficulty than children in coming up with innovative solutions to problems. The models were able to identify superficial commonalities between objects, but were less capable of selecting a new, functionally suitable tool to solve a problem.

The study involved the use of a virtual "Blicket detector," a machine that responds to certain objects with light and music, while remaining quiet for other objects. The researchers found that young children were able to figure out the causal structure of the machine with just a few observations.

In contrast, LLMs such as OpenAI's ChatGPT, Google's PaLM, and LaMDA failed to derive the correct causal overhypotheses from the data. Despite their extensive training, the models were unable to produce the relevant causal structures that children worked out. According to the research team, this demonstrates their limitations in drawing inferences about novel events or relationships.
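The inference problem posed by such a detector can be made concrete with a small sketch. It is an illustration only, not the authors' experimental setup: the object names, the observations, and the two candidate rules ("disjunctive", where any single blicket activates the machine, and "conjunctive", where at least two blickets are needed) are assumptions chosen for the example. The code enumerates every combination of rule and blicket set and keeps those that agree with a handful of observations.

```python
from itertools import combinations

# Hypothetical object names; the article does not list the real stimuli.
OBJECTS = ("A", "B", "C")


def detector_fires(rule, blickets, placed):
    """Predict whether the machine activates under an assumed rule."""
    n_blickets_placed = len(blickets & placed)
    if rule == "disjunctive":   # assumed rule: one blicket is enough
        return n_blickets_placed >= 1
    if rule == "conjunctive":   # assumed rule: at least two blickets needed
        return n_blickets_placed >= 2
    raise ValueError(f"unknown rule: {rule}")


def consistent_hypotheses(observations):
    """Return every (rule, blicket set) pair that matches all observations."""
    surviving = []
    for rule in ("disjunctive", "conjunctive"):
        for size in range(len(OBJECTS) + 1):
            for subset in combinations(OBJECTS, size):
                blickets = frozenset(subset)
                if all(
                    detector_fires(rule, blickets, frozenset(placed)) == fired
                    for placed, fired in observations
                ):
                    surviving.append((rule, blickets))
    return surviving


# Made-up observations: (objects placed on the machine, did it activate?)
observations = [
    ({"A"}, False),
    ({"B"}, False),
    ({"A", "B"}, True),
    ({"C"}, False),
]

surviving = consistent_hypotheses(observations)
for rule, blickets in surviving:
    print(rule, sorted(blickets))
```

With these four made-up observations, only hypotheses using the conjunctive rule survive. The choice of rule plays the role of the overhypothesis here: it is an abstract claim about how the machine works in general, separate from the question of which particular objects are blickets.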

Young children, in contrast, learned novel causal overhypotheses from only a handful of observations, including the outcome of their own experimental interventions, and applied the learned structure to novel situations. In contrast, large language models and vision-and-language models, as well as both deep reinforcement learning algorithms and behavior cloning, struggled to produce the relevant causal structures, even after massive amounts of training compared with children.

From the paper

Overhypotheses or overarching hypotheses refer to the application of broad and general rules to new situations. Discovering new uses for tools requires a creative search for new causal structures or applications for objects.

Scale alone is not enough

The authors point out that LLMs bring both unique characteristics and challenges to research. For example, they are refined through reinforcement learning from human feedback, the nature and effects of which are not yet fully understood.

Similarly, the researchers argue that while scaling language models improves performance on various tasks, it does not equate to human-like learning.

The study concludes that large language models are valuable as cultural technologies that can mimic millions of human authors. However, the authors caution that while these AI models can help process known information efficiently, they are not themselves innovative.

They suggest that AI may need more than just large amounts of language and image data to match the performance of a human child, and that ever-larger models are not the answer.

"A child does not interact with the world better by increasing their brain capacity. Is building the tallest tower the ultimate way to reach the moon? Putting scale aside, what are the mechanisms that allow humans to be effective and creative learners? What in a child’s 'training data' and learning capacities is critically effective and different from that of LLMs? Can we design new AI systems that use active, self-motivated exploration of the real external world as children do? And what we might expect the capacities of such systems to be?" the team writes.
