
Researchers at Stanford University have developed a method called "Quiet-STaR" that enables AI systems to learn to think between the lines. This could pave the way for more versatile and efficient AI that can better solve complex tasks.

When humans write or speak, we often pause to think. We consider how best to phrase an argument, or what the other person is thinking.

This "thinking" is hidden between the lines of almost all texts - for example, in the intermediate steps of mathematical proofs that are not explicitly mentioned. So far, AI has struggled to capture such unspoken thought processes. But that could change.

Internal reasoning helps LLMs generate better answers

Quiet-STaR (Quiet Self-Taught Reasoner) teaches an LLM to think quietly before it speaks. At each point in a text, the AI generates possible reasons why the text continues one way rather than another.


Through trial and error, it learns which considerations lead to the most likely continuations - it thinks before it "speaks," that is, before it continues generating the text.
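The core loop can be sketched in a few lines of toy Python. This is a hypothetical simplification for illustration only - the candidate thoughts, probabilities, and the multiplicative update rule below are invented stand-ins, not the authors' implementation: sample a candidate "thought," reward it by how much it raises the probability of the true continuation, and reinforce the helpful ones.

```python
import random

random.seed(0)

# Invented toy setup: three candidate "thoughts" at some point in a text,
# and how likely the true continuation becomes given each of them.
THOUGHTS = ["carry the 1", "add the digits", "ignore the digits"]
P_CONTINUATION = {"carry the 1": 0.9, "add the digits": 0.6, "ignore the digits": 0.1}
P_BASELINE = 0.3  # probability of the true continuation with no thought at all

# Preference weights over thoughts, adjusted by a REINFORCE-style rule.
weights = {t: 1.0 for t in THOUGHTS}

def sample_thought(weights):
    """Sample a thought proportionally to its current weight."""
    total = sum(weights.values())
    r = random.uniform(0, total)
    acc = 0.0
    for t, w in weights.items():
        acc += w
        if r <= acc:
            return t
    return t  # numerical-edge fallback

for _ in range(200):
    thought = sample_thought(weights)
    # Reward: did this thought make the true continuation more likely
    # than generating with no thought at all?
    reward = P_CONTINUATION[thought] - P_BASELINE
    weights[thought] = max(1e-3, weights[thought] * (1.0 + 0.1 * reward))

# After training, the most helpful thought should dominate the weights.
print(max(weights, key=weights.get))
```

The point of the sketch: nothing labels any thought as "correct" in advance; the only training signal is whether thinking it made the actual text more predictable.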

The technology is based on the "Self-Taught Reasoner" (STaR), which teaches AI systems to derive reasons from a few examples and to learn from correct answers. However, while STaR only works for certain question-answer tasks, Quiet-STaR is designed to teach language models to infer implicit reasoning from any text.

Video: Zelikman et al.

This sounds simple, but it poses significant challenges: The AI has to learn how to generate "thoughts" and how to use them effectively. It is also computationally intensive to calculate and evaluate many continuations for each text passage.

The researchers are tackling this problem with sophisticated sampling algorithms and techniques such as "teacher forcing," in which the system is gradually introduced to the correct continuations.
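Teacher forcing itself is a standard sequence-training technique and easy to illustrate. In the hedged sketch below, a toy bigram counter stands in for the paper's transformer (an illustrative assumption, not their setup): at each training step the model is conditioned on the true previous tokens rather than on its own possibly wrong predictions.

```python
from collections import defaultdict

corpus = "the cat sat on the mat".split()

# Toy model: bigram counts standing in for learned next-token probabilities.
counts = defaultdict(lambda: defaultdict(int))

def train_step_teacher_forced(tokens):
    # Teacher forcing: the input at each position is the ground-truth
    # previous token, so one bad prediction cannot derail the rest.
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1

train_step_teacher_forced(corpus)

def predict(prev):
    # Greedy next-token prediction from the learned counts.
    nexts = counts[prev]
    return max(nexts, key=nexts.get) if nexts else None

print(predict("cat"))  # → sat
```

The same principle lets Quiet-STaR's training stay anchored to the actual text: the generated thoughts are evaluated against the real continuation, not against the model's own free-running output.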


The results are impressive: without special training on specific tasks, the model's accuracy on common reasoning benchmarks improved substantially (GSM8K from 5.9 percent to 10.9 percent, CommonsenseQA from 36.3 percent to 47.2 percent) - gains of up to roughly eleven percentage points.

These improvements grew with the length of the generated rationales and were particularly pronounced on difficult passages of text: the longer the AI "thought," the better the results.

It is possible that by recognizing the logic between the lines in different textual data, the AI becomes more adaptable and better able to apply its knowledge to new problems. It learns to understand contexts instead of just memorizing them.

However, the technology still has limitations. It has only been tested on a relatively small 7B LLM. And the system has yet to learn how to dynamically decide when it is worth thinking about a passage of text - otherwise, the extra thinking steps waste too much computing power. The researchers see this as a "natural extension" and believe that even greater improvements will be possible with larger models.


Quiet-STaR points the way to more intelligent and versatile AI systems. Instead of being trained only on narrowly defined tasks, they could learn to understand the logic behind texts and conversations on their own. They could understand arguments better, formulate theories, and use language more creatively and efficiently.

Does Quiet-STaR have anything to do with OpenAI's Q*?

There are interesting parallels between the Stanford researchers' Quiet-STaR method and the speculation surrounding OpenAI's mysterious Q* system, which was hailed as a major breakthrough last fall.

Both methods aim to improve the reasoning and problem-solving capabilities of AI beyond what current language models such as GPT-4 can achieve.

While Quiet-STaR teaches language models to generate and learn from possible justifications for continuing at any point in a text, Q* aims to combine language models with planning algorithms. Both are approaches to teaching AI to "reason" or "think" step by step to arrive at better solutions.

Another common theme is the importance of test-time compute: The more time the AI has to think, the better the results, both in Quiet-STaR and presumably in Q*. This is reminiscent of chess programs like AlphaZero, which increase their performance if they are allowed to compute for longer.

And of course, the name: Quiet-STaR could be abbreviated to "Q*".

Summary
  • Researchers at Stanford University are developing Quiet-STaR, a method for teaching AI systems to think between the lines to better solve complex tasks.
  • At each point in a text, Quiet-STaR generates possible reasons why the text continues in that way, and learns through trial and error which considerations lead to likely continuations.
  • The method improved the model's accuracy on common reasoning benchmarks by up to roughly eleven percentage points, although it has only been tested on a small 7B model.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.