
Large language models can make information more accessible - but AI researcher Andrew J. Peterson warns of a negative long-term effect: a “knowledge collapse”.


Foundation models like LLMs change how we access and produce information. However, a new theoretical study by Andrew J. Peterson from the University of Poitiers warns that overreliance on AI-generated content could lead to a phenomenon he terms "knowledge collapse" - a progressive narrowing of the information available to humans and a concomitant narrowing of perceived value in seeking out diverse knowledge.

"Many real world questions do not have well-defined, verifiably true and false answers," he writes. For example, if a user asks, "What causes inflation?" and an LLM answers, "monetary policy," the problem is not one of hallucination but, according to Peterson, one of failure to reflect the full distribution of possible answers to the question, or at least to provide an overview of the major schools of economic thought.

Peterson argues that while LLMs are trained on vast amounts of data, they naturally tend to generate outputs clustered around the most common perspectives in the training data. Widespread recursive use of AI systems to access information could therefore lead to the neglect of rare, specialized, and unorthodox ideas in favor of an increasingly narrow set of popular viewpoints. This is not just a loss of knowledge - the effect also limits the "epistemic horizon," which Peterson defines as the amount of knowledge that a community of people considers practically possible and worth knowing.
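To make the narrowing concrete, here is a minimal sketch (a hypothetical illustration, not code from Peterson's paper): if an AI source mostly returns viewpoints near the centre of the distribution it was trained on, the rare "tail" viewpoints effectively disappear from what users see.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical illustration: treat "viewpoints" as draws from a standard normal.
full = rng.standard_normal(100_000)        # traditional search: the full distribution
central = full[np.abs(full) < 1.0]         # AI-style output: only the most common region

# How much of the rare, "tail" knowledge (beyond 2 sigma) survives in each corpus?
print(f"tail share, full distribution:  {np.mean(np.abs(full) > 2.0):.2%}")     # ~4.6%
print(f"tail share, centre-only output: {np.mean(np.abs(central) > 2.0):.2%}")  # 0.00%
```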


"This curtailment of the tails of human knowledge would have significant effects on a range of concerns, including fairness, inclusion of diversity, lost-gains in innovation, and the preservation of the heritage of human culture," Peterson writes.

Model shows that cost is the driver of knowledge collapse

To investigate the dynamics of knowledge collapse, Peterson develops a model in which a community of learners or innovators chooses between traditional methods of acquiring knowledge and a cheaper, AI-assisted process. The results suggest knowledge collapse can be mitigated if individuals perceive enough value in seeking out diverse information sources. However, if AI-generated content becomes cheap enough relative to traditional methods, or if AI systems become recursively dependent on other AI-generated data, public knowledge may degenerate significantly over time.

For example, in Peterson's simulations, public beliefs end up 2.3 times further from the truth when AI-assisted access comes with a 20 percent discount than when no AI option is available at all. The effect compounds over generations, as each generation fixes its "epistemic horizon" based on the knowledge preserved by the one before.
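A rough simulation sketch of this dynamic is shown below. It is not Peterson's model or code: the mapping from discount to AI usage, the truncation width, and the agent and generation counts are illustrative assumptions. It only reproduces the qualitative effect, a cheaper centre-only source pulling the community's estimate of the distribution away from the truth and compounding across generations, not the 2.3x figure.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(discount: float, generations: int = 10, agents: int = 200,
             truncation: float = 0.75) -> float:
    """Toy model of knowledge collapse (illustrative assumptions, not the paper's code).

    True knowledge is a standard normal. Traditional search returns any sample;
    the cheaper AI-assisted source only returns samples near the centre of the
    community's current belief. Returns how far the community's estimate of the
    spread ends up from the true spread of 1.0.
    """
    center, spread = 0.0, 1.0                      # the community's current belief
    for _ in range(generations):
        # Illustrative assumption: the share of agents choosing the cheaper
        # AI source grows with the discount (zero discount -> no one uses it).
        p_ai = min(1.0, 3.0 * discount)
        samples = []
        for _ in range(agents):
            x = rng.standard_normal()              # a draw from the true distribution
            if rng.random() < p_ai:
                # AI-style source: reject anything outside the "popular" centre.
                while abs(x - center) > truncation * spread:
                    x = rng.standard_normal()
            samples.append(x)
        # The next generation's "epistemic horizon" is fixed by what was preserved.
        center, spread = float(np.mean(samples)), float(np.std(samples))
    return abs(spread - 1.0)

print("error with no AI option:     ", round(simulate(discount=0.0), 3))
print("error with a 20% AI discount:", round(simulate(discount=0.2), 3))
```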

To counteract knowledge collapse, Peterson recommends putting safeguards in place to prevent total reliance on AI-generated information and ensuring humans continue to invest in preserving specialized knowledge that may be neglected by AI summaries and reports. He also stresses the importance of avoiding recursive AI systems that rely on other AI-generated content as input data.

While much attention has focused on the tendency of LLMs to present false information as fact, Peterson argues the bigger issue may be a lack of representativeness - the failure of AI to reflect the full distribution of possible perspectives on complex issues that lack a single verifiable answer. He says particular care should be taken in educational contexts to teach students to evaluate not just the accuracy, but also the diversity of viewpoints in AI-generated content.

Summary
  • Andrew J. Peterson, an AI researcher from the University of Poitiers, warns that overreliance on AI-generated content from large language models (LLMs) could lead to a phenomenon he terms "knowledge collapse" - a progressive narrowing of available information and perceived value in seeking out diverse knowledge.
  • Peterson argues that while LLMs are trained on vast amounts of data, they tend to generate outputs clustered around the most common perspectives. Widespread use of AI systems to access information could lead to the neglect of rare, specialized, and unorthodox ideas in favor of an increasingly narrow set of popular viewpoints.
  • Peterson's model shows that if AI-generated content becomes cheap enough relative to traditional methods, or if AI systems become recursively dependent on other AI-generated data, public knowledge may degenerate significantly over time. To counteract this, he recommends safeguards to prevent total reliance on AI-generated information and ensuring humans continue to invest in preserving specialized knowledge.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.