Researchers at Johns Hopkins University have developed an AI model called RATIONALYST that improves the reasoning capabilities of large language models through implicit rationales.
The team used a pre-trained language model to generate implicit rationales from unlabeled text. They gave the model a handful of example prompts demonstrating what such implicit reasoning looks like, and the model then produced similar justifications for new passages.
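The mechanics of this few-shot setup can be sketched as follows; the model name, prompt template, and helper function are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative sketch of few-shot rationale extraction. Model name and
# prompt wording are assumptions for demonstration, not the paper's setup.
from transformers import pipeline

generator = pipeline("text-generation", model="meta-llama/Meta-Llama-3-8B")

FEW_SHOT_PROMPT = """Text: Harry used magic outside of Hogwarts to inflate Aunt Marge. \
He is punished to attend a disciplinary hearing at the Ministry of Magic.
Implicit rationale: When someone breaks the rule, he will be punished!

Text: {passage}
Implicit rationale:"""

def extract_rationale(passage: str) -> str:
    """Prompt the model with a worked example and return its rationale."""
    prompt = FEW_SHOT_PROMPT.format(passage=passage)
    output = generator(prompt, max_new_tokens=40, do_sample=False)
    # The pipeline echoes the prompt, so strip it to keep only the new text.
    return output[0]["generated_text"][len(prompt):].strip()
```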
To enhance the quality of generated justifications, the researchers filtered them by checking if they facilitated prediction of subsequent text. Only justifications meeting this criterion were retained.
For instance, given the text "Harry used magic outside of Hogwarts to inflate Aunt Marge... He is punished to attend a disciplinary hearing at the Ministry of Magic...", the model generated the implicit justification "When someone breaks the rule, he will be punished!". This justification was deemed useful as it established a causal link between Harry's rule-breaking and his punishment.
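In spirit, this filter can be implemented as a log-likelihood comparison: a rationale is kept only if conditioning on it makes the following text more probable. The sketch below uses a small stand-in model, and the exact scoring rule is an assumption rather than the paper's implementation.

```python
# Sketch of the filtering idea: keep a rationale only if it raises the
# likelihood of the text that follows it. Model choice is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # small stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")

def continuation_logprob(context: str, continuation: str) -> float:
    """Total log-probability of `continuation` given `context` (approximate:
    tokenization at the boundary may merge tokens slightly)."""
    ids = tokenizer(context + continuation, return_tensors="pt").input_ids
    ctx_len = tokenizer(context, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    token_logprobs = logprobs[torch.arange(targets.shape[0]), targets]
    # Sum only over the continuation tokens, not the context.
    return token_logprobs[ctx_len - 1:].sum().item()

def keep_rationale(prefix: str, rationale: str, next_text: str) -> bool:
    """Retain the rationale only if it helps predict the subsequent text."""
    with_rationale = continuation_logprob(prefix + " " + rationale + " ", next_text)
    without_rationale = continuation_logprob(prefix + " ", next_text)
    return with_rationale > without_rationale
```

Applied to the Harry Potter example, the rationale would pass this check because it makes the sentence about the punishment more predictable from the sentence about the rule-breaking.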
Using this method, the researchers extracted about 79,000 implicit justifications from various data sources to train RATIONALYST.
During inference, RATIONALYST monitors another model's step-by-step solution to a problem. For each step, it generates an implicit rationale and uses it to score and select the most plausible next step.
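A minimal sketch of this selection loop, reusing the helpers from the sketches above; the scoring rule and function signature are assumptions, not the paper's exact procedure.

```python
# Sketch of rationale-guided next-step selection. Reuses extract_rationale()
# and continuation_logprob() from the earlier sketches; the scoring rule is
# an assumption for illustration.
def select_next_step(trajectory: str, candidate_steps: list[str]) -> str:
    """Score each candidate step under the current rationale, pick the best."""
    rationale = extract_rationale(trajectory)
    best_score, best_step = float("-inf"), candidate_steps[0]
    for step in candidate_steps:
        score = continuation_logprob(trajectory + " " + rationale + " ", step)
        if score > best_score:
            best_score, best_step = score, step
    return best_step
```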
The researchers tested RATIONALYST on a range of reasoning tasks, including mathematical, logical, and scientific reasoning. Across seven representative benchmarks, the model improved reasoning accuracy by an average of 3.9 percent.
RATIONALYST outperforms other verifier models
Notably, RATIONALYST outperformed larger verifier models like GPT-4 in these tests. However, newer models and those specializing in reasoning, such as GPT-4o or o1, were not included in the comparison.
The team believes their data-centric approach allows RATIONALYST to generalize process supervision across different reasoning tasks without human annotation.
The researchers see RATIONALYST as a promising way to enhance the interpretability and performance of large language models in reasoning tasks. By generating human-understandable reasoning, the system could prove particularly useful in complex domains like mathematics or programming.
Future research may focus on scaling RATIONALYST with stronger models and larger datasets.
The code is available on GitHub.