Summary

Are hallucinations, false statements generated by large language models, a bug or a feature?


Andrej Karpathy, AI developer at OpenAI and former head of AI at Tesla, does not see hallucinations as a bug in large language models. On the contrary, he considers them the models' great strength.

Karpathy describes LLMs as "dream machines" that generate content based on their training data. The instructions given to the LLMs trigger a "dream" that is controlled by the model's understanding of its training data.

Usually, the content generated is useful and relevant. But when the dream takes a wrong or misleading path, it is called a hallucination. "It looks like a bug, but it's just the LLM doing what it always does," Karpathy writes.


Compared to traditional search engines, Karpathy sees LLMs at the opposite end of the creativity spectrum. While search engines have a "creativity problem" and can only return existing content, LLMs can generate new content from their training data. However, this creativity comes with the risk of generating hallucinations.

"An LLM is 100% dreaming and has the hallucination problem. A search engine is 0% dreaming and has the creativity problem," Karpathy writes.

Whether hallucinations are a problem depends on the application

However, Karpathy says that while hallucinations are a feature of large language models, they become a problem in certain applications, such as personal assistants. Karpathy says he is working on a kind of "JARVIS," a personal AI assistant, at OpenAI.

LLM assistants are more complex systems than the underlying LLM and require additional methods to reduce hallucinations. One approach Karpathy mentions is retrieval-augmented generation (RAG), which anchors the generated content more firmly in real data.
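The idea behind RAG can be sketched in a few lines: relevant documents are retrieved first, then prepended to the prompt so the model answers from real data rather than purely from its "dream." This is a minimal illustration with a toy keyword-overlap retriever; production systems use embedding-based search and an actual LLM call, and all names here are hypothetical.

```python
def retrieve(query, documents, k=2):
    """Toy retriever: rank documents by word overlap with the query, return top k."""
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]


def build_prompt(query, documents):
    """Anchor the generation in retrieved context instead of the model's memory alone."""
    context = "\n".join(retrieve(query, documents))
    return (
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        f"Answer using only the context above."
    )


docs = [
    "Paris is the capital of France.",
    "The Eiffel Tower is in Paris.",
    "Mount Everest is the highest mountain.",
]
prompt = build_prompt("What is the capital of France?", docs)
```

The resulting prompt contains only the most relevant documents, which constrains the model's output and reduces the room for hallucinated answers.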

Other methods include checking for inconsistencies between multiple sampled answers, reflection, chains of verification, decoding uncertainty from activations, and the use of tools.
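The first of these methods, often called self-consistency, can be sketched simply: sample several independent answers to the same question and measure how much they agree. Low agreement is a signal that the model may be hallucinating. The function below is an illustrative sketch, not any specific implementation; the sampled answers would come from repeated LLM calls.

```python
from collections import Counter


def most_consistent_answer(samples):
    """Return the majority answer among independently sampled responses,
    along with the agreement rate. A low rate flags a likely hallucination."""
    counts = Counter(samples)
    answer, hits = counts.most_common(1)[0]
    return answer, hits / len(samples)


# Four hypothetical samples from the same prompt; three agree.
answer, agreement = most_consistent_answer(["1889", "1889", "1887", "1889"])
```

An application could accept the majority answer only when the agreement rate clears a threshold, and otherwise fall back to retrieval or a verification step.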


According to Karpathy, these research areas are currently being explored to improve the accuracy and reliability of LLM assistants.

  • OpenAI AI developer Andrej Karpathy sees hallucinations or false statements generated by large language models not as a flaw, but as a strength.
  • He describes LLMs as "dream machines" and sees them at the other end of a creativity spectrum compared to traditional search engines. Hallucinations can be a side effect of creativity.
  • However, Karpathy acknowledges that hallucinations can be problematic when LLMs are used as personal assistants. He sees opportunities to improve the accuracy and reliability of LLM assistants.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.