
A new AI system called "The AI Scientist" can perform scientific research completely autonomously, from brainstorming and experimenting to writing complete papers.


Researchers from the University of British Columbia, University of Oxford, and AI startup Sakana AI have developed an AI system capable of conducting scientific research independently. Named "The AI Scientist," the system can generate new research ideas, write code, perform experiments, visualize results, and even compose complete scientific papers.

The process works as follows: First, based on a research direction and a simple code framework, The AI Scientist generates new ideas. These are checked for novelty by searching scientific databases for similar work. Promising ideas are then implemented in code, and experiments are run automatically.

Image: Lu, Lu, Lange et al.

Next, the system creates visualizations of the results and writes a complete paper in the style of a typical conference submission, describing and interpreting the results and placing them in the research context. Finally, The AI Scientist conducts a simulated peer review process to assess the paper's quality.
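The stages described above can be sketched as a simple loop. This is only an illustrative outline, not the researchers' actual code: every name here (`Idea`, `Paper`, `run_pipeline`, and the stage callables passed in) is hypothetical, and each stage is left as a pluggable function because the article does not describe how the stages are implemented.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Idea:
    """A candidate research idea (hypothetical structure)."""
    title: str
    is_novel: bool = False


@dataclass
class Paper:
    """Everything produced for one idea that made it through the pipeline."""
    idea: Idea
    results: dict
    figures: List[str]
    text: str
    review_score: Optional[float] = None


def run_pipeline(
    direction: str,
    generate_ideas: Callable[[str], List[Idea]],
    check_novelty: Callable[[Idea], bool],
    run_experiment: Callable[[Idea], Optional[dict]],
    make_figures: Callable[[dict], List[str]],
    write_paper: Callable[[Idea, dict, List[str]], str],
    review: Callable[[str], float],
) -> List[Paper]:
    """Run the stages in the order the article describes them."""
    papers: List[Paper] = []
    for idea in generate_ideas(direction):
        # Novelty check: skip ideas that resemble existing work.
        idea.is_novel = check_novelty(idea)
        if not idea.is_novel:
            continue
        # Implementation and experiments; None signals a failed run.
        results = run_experiment(idea)
        if results is None:
            continue
        # Visualization, write-up, and simulated peer review.
        figures = make_figures(results)
        text = write_paper(idea, results, figures)
        papers.append(Paper(idea, results, figures, text, review(text)))
    return papers
```

In this sketch, an idea that fails the novelty check or whose experiment fails is simply dropped, so only ideas that pass every stage end up as reviewed papers.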


The team demonstrated the approach by setting The AI Scientist to work on three areas of machine learning: diffusion models, transformer-based language models, and grokking. For each area, the system produced a complete paper at a cost of less than $15 per paper.

To evaluate the generated papers, the researchers developed an automated reviewer based on a large language model. In tests, the AI reviewer achieved performance comparable to humans in assessing submissions. According to the AI reviewer, some of the papers written by The AI Scientist exceed the acceptance threshold of a top machine learning conference.
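To make the two claims in this paragraph concrete, here is a minimal sketch of how reviewer scores could be turned into accept/reject decisions and compared with human decisions. It is an assumption-laden illustration, not the researchers' evaluation code: the function names, the score scale, and the threshold are hypothetical, and the article does not say which metric was used for the comparison with humans.

```python
from typing import List


def accept_decisions(scores: List[float], threshold: float) -> List[bool]:
    """Mark a paper as accepted when its score strictly exceeds the threshold."""
    return [score > threshold for score in scores]


def agreement_rate(ai: List[bool], human: List[bool]) -> float:
    """Fraction of papers on which the AI and human decisions match."""
    if len(ai) != len(human):
        raise ValueError("decision lists must have the same length")
    if not ai:
        raise ValueError("need at least one decision")
    matches = sum(a == h for a, h in zip(ai, human))
    return matches / len(ai)
```

With made-up scores `[3.0, 6.5, 6.0]` and a hypothetical threshold of `6.0`, only the second paper exceeds the threshold, because a score equal to the threshold does not count as exceeding it.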

The AI Scientist currently provides suggestions rather than real science

However, The AI Scientist has several significant limitations in its current form. The automated reviewer cannot ask authors questions and cannot interpret figures. Idea generation often produces very similar proposals across different runs and models. Ideas often fail to be implemented or are implemented incorrectly. Because the number of experiments is limited, the results often lack the depth and accuracy typical in the ML community. The AI Scientist also struggles with visual aspects, sometimes producing illegible charts or suboptimal page layouts.

Other weaknesses relate to citation practices and the correct interpretation of results. The AI Scientist has difficulty finding and citing the most relevant sources. When evaluating results, it occasionally makes critical errors, for example when comparing numerical values or when a metric has been changed. In rare cases, entire results are even hallucinated. The authors therefore advise against taking the scientific content of the generated papers at face value. Instead, the papers should be viewed as suggestions for promising ideas that experts can pursue further.

A framework for the future?

The AI Scientist has many weaknesses but still offers an interesting vision of a future in which machines conduct research independently and achieve scientific breakthroughs. Some current problems could be solved with multimodal models, while others require significantly more powerful models that are better at drawing logical conclusions. These could then be integrated into the developers' framework.


The researchers see their system as the beginning of a new era of scientific discovery in the field of machine learning. AI systems could soon take over the entire research process of AI development themselves.

AI researcher Andrew J Peterson recently took a critical look at such plans, warning of a "knowledge collapse" due to language models.

More information is available on the project page.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • Researchers have developed an AI system called "The AI Scientist" that can perform scientific research on its own, from brainstorming and experimenting to writing full papers. The process includes generating new ideas, translating them into code, running experiments, visualizing the results, and writing a full paper.
  • The team demonstrated the approach in three areas of machine learning. For each area, The AI Scientist produced a full paper at a cost of less than $15 per paper.
  • However, The AI Scientist still has significant limitations in areas such as idea generation, implementation, citation practices, and interpretation of results. The papers should therefore be treated as suggestions for promising ideas rather than as finished science.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.