Researchers at Johns Hopkins University have developed an AI model called RATIONALYST that improves the reasoning capabilities of large language models through implicit rationales.

The team used a pre-trained language model to generate implicit rationales from unlabeled text data. They provided the model with example prompts to demonstrate what implicit reasoning could look like. The model then produced similar justifications for new texts.
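In code, this extraction step might look roughly like the sketch below, which few-shot prompts an off-the-shelf model through the Hugging Face transformers library. The model name, prompt format, and generate_rationale helper are illustrative assumptions, not the authors' exact setup.

```python
# Minimal sketch of implicit-rationale extraction, assuming a Hugging
# Face-style API. Model choice and prompt wording are hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Meta-Llama-3-8B"  # assumption: any capable base LLM
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Few-shot demonstrations show the model what an implicit rationale looks like.
FEW_SHOT_PROMPT = """Text: Harry used magic outside of Hogwarts to inflate Aunt Marge. He is punished to attend a disciplinary hearing at the Ministry of Magic.
Implicit rationale: When someone breaks the rule, he will be punished!

Text: {text}
Implicit rationale:"""

def generate_rationale(text: str) -> str:
    prompt = FEW_SHOT_PROMPT.format(text=text)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=40, do_sample=False)
    # Decode only the newly generated tokens, i.e. the rationale itself.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```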

To improve the quality of the generated justifications, the researchers filtered them by checking whether each one made the subsequent text easier to predict. Only justifications meeting this criterion were retained.
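One plausible way to implement this filter is to compare the model's log-probability of the following text with and without the rationale in the context, keeping the rationale only when it raises that probability. The sketch below reuses the hypothetical model and tokenizer from above; the paper's exact scoring procedure may differ.

```python
import torch

def continuation_logprob(context: str, continuation: str) -> float:
    """Sum of the model's log-probabilities for `continuation` given `context`."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    cont_ids = tokenizer(continuation, return_tensors="pt",
                         add_special_tokens=False).input_ids
    full_ids = torch.cat([ctx_ids, cont_ids], dim=1)
    with torch.no_grad():
        log_probs = torch.log_softmax(model(full_ids).logits, dim=-1)
    total = 0.0
    # The logits at position i-1 give the distribution over token i,
    # so we score only the continuation tokens.
    for i in range(ctx_ids.shape[1], full_ids.shape[1]):
        total += log_probs[0, i - 1, full_ids[0, i]].item()
    return total

def keep_rationale(prefix: str, rationale: str, next_text: str) -> bool:
    # Retain the rationale only if it makes the following text more likely.
    with_rationale = continuation_logprob(prefix + " " + rationale + " ", next_text)
    without_rationale = continuation_logprob(prefix + " ", next_text)
    return with_rationale > without_rationale
```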

Diagram: LLM training data, inference question, and answers with/without RATIONALYST for improved reasoning.
Image: Jiang et al.

For instance, given the text "Harry used magic outside of Hogwarts to inflate Aunt Marge... He is punished to attend a disciplinary hearing at the Ministry of Magic...", the model generated the implicit justification "When someone breaks the rule, he will be punished!". This justification was deemed useful as it established a causal link between Harry's rule-breaking and his punishment.

Using this method, the researchers extracted about 79,000 implicit justifications from various data sources to train RATIONALYST.

During inference, RATIONALYST monitors the step-by-step problem solutions of other models. It generates an implicit rationale for each step and uses it to select the most likely next step.
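A stripped-down version of that inference loop might look like the following, reusing the hypothetical generate_rationale and continuation_logprob helpers from the sketches above. The real system's prompting and scoring details may well differ.

```python
def select_next_step(question: str, steps_so_far: list[str],
                     candidates: list[str]) -> str:
    """Pick the candidate next step best supported by an implicit rationale."""
    trajectory = question + "\n" + "\n".join(steps_so_far)
    # RATIONALYST generates an implicit rationale for the partial solution...
    rationale = generate_rationale(trajectory)
    # ...and each candidate step proposed by the reasoning model is scored
    # by how likely it is once the rationale is added to the context.
    scored = [(continuation_logprob(trajectory + "\n" + rationale + "\n", step),
               step)
              for step in candidates]
    best_score, best_step = max(scored)
    return best_step
```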

Diagram: Three panels showing reasoning trajectories for a math problem, illustrating RATIONALYST's improvement of LLM inference through implicit rationale generation.
Figure: Jiang et al.

The researchers tested RATIONALYST on various reasoning tasks, including mathematical, logical and scientific reasoning. The model improved reasoning accuracy by an average of 3.9 percent on seven representative benchmarks.

RATIONALYST outperforms other verifier models

Notably, RATIONALYST outperformed larger verifier models like GPT-4 in these tests. However, newer models and those specializing in reasoning, such as GPT-4o or o1, were not included in the comparison.

The team believes their data-centric approach allows RATIONALYST to generalize process supervision across different reasoning tasks without human annotation.

The researchers see RATIONALYST as a promising way to enhance the interpretability and performance of large language models in reasoning tasks. By generating human-understandable reasoning, the system could prove particularly useful in complex domains like mathematics or programming.

Future research may focus on scaling RATIONALYST with stronger models and larger datasets.

The code is available on GitHub.

Summary
  • Researchers at Johns Hopkins University have developed RATIONALYST, an AI model that improves the reasoning capabilities of large language models using implicit rationales generated from unlabeled text data.
  • The system generates and filters justifications for texts to supervise the reasoning process. RATIONALYST was trained on about 79,000 extracted implicit rationales and checks step-by-step problem solutions.
  • In tests on various reasoning tasks, RATIONALYST improved accuracy by an average of 3.9 percent, outperforming larger verifier models such as GPT-4. The researchers see this as a promising approach to improving the interpretability and performance of language models in reasoning.