
Physicist Steve Hsu says he recently published a paper built around an idea generated by GPT-5. But he also warns that working with AI is like collaborating with a "brilliant but unreliable genius"—one whose mistakes even experts can miss.


His theoretical contribution on quantum field-theoretic foil independence appeared in Physics Letters B. The paper examines the linearity of quantum evolution, a core issue in understanding quantum mechanics and any possible extensions of the theory.

To integrate AI into his research, Hsu uses his own "Generate-Verify" protocol. One model proposes an idea, and a second model checks it. The goal is to cut down on common LLM errors, ranging from basic calculation slips to deep conceptual flaws.

For this project, Hsu worked with GPT-5, Gemini 2.5-Pro, and Qwen-Max, which he considers particularly strong in physics and math. For the final verification rounds, he also used DeepSeek V3.1 and Grok-4. According to Hsu, routing outputs through multiple models can noticeably improve result quality.
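The article describes the protocol only at a high level: one model generates, others verify, and outputs are routed through several verifiers. A minimal sketch of that loop might look like the following. Note that `call_model`, its return format, and the rejection logic are all hypothetical placeholders; the article does not specify Hsu's actual implementation or the APIs involved.

```python
def call_model(name, prompt):
    """Hypothetical stand-in for an LLM API call.

    A real implementation would call the provider's API for the named
    model; this stub returns canned text so the loop is runnable.
    """
    if "verify" in prompt.lower():
        return "VALID"  # a real verifier would return a critique or verdict
    return f"[{name}] candidate derivation for: {prompt}"

def generate_verify(problem, generator, verifiers):
    """Generate one candidate, then require every verifier to approve it."""
    candidate = call_model(generator, problem)
    for verifier in verifiers:
        verdict = call_model(
            verifier, f"Verify this result step by step:\n{candidate}"
        )
        if "VALID" not in verdict:
            return None  # rejected; a real loop would regenerate and retry
    return candidate

result = generate_verify(
    "Is quantum evolution exactly linear?",
    generator="gpt-5",
    verifiers=["gemini-2.5-pro", "qwen-max", "deepseek-v3.1", "grok-4"],
)
```

The key design choice the article attributes to Hsu is the separation of roles: the generating model never grades its own output, and a candidate survives only if every independent verifier accepts it.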


Human expertise is still the safety net

In an accompanying paper on AI-assisted physics, Hsu argues that human oversight remains essential. Even advanced students, he says, can easily produce flawed results when using AI in frontier research. He explicitly compares working with large language models to collaborating with a "brilliant but unreliable genius."

"At present, human expert participation in the research process is still a necessity. Non-expert use of AI in frontier research (even by individuals, such as PhD students, with considerable background) is likely to lead to large volumes of subtly incorrect output," Hsu writes.

Hsu sees clear potential in his method and in generative AI broadly. He suggests using more complex verification steps, such as asking specific questions about the validity of previous outputs and requiring citations to technical papers to boost reliability.

He expects hybrid human-AI workflows to become standard in math, physics, and other formal sciences. As models gain precision, contextual understanding, and better symbolic control, Hsu believes they will act as "autonomous research agents" capable of generating hypotheses, checking derivations, and drafting manuscripts that pass peer review.

"Properly orchestrated, this synergy promises an era of accelerated discovery in which human insight and machine reasoning jointly advance our understanding of the fundamental laws of nature," he writes.


The race toward automated science

Companies like OpenAI are pushing toward greater automation of scientific research. OpenAI says it plans to develop autonomous research agents by early 2028. Beyond scientific progress, the company sees major economic potential. If such systems drive breakthroughs in areas like medicine, the benefits could be substantial for research institutions, businesses, and entire economies.

There are early signs that AI can speed up scientific work. Mathematician Terence Tao has said that AI tools have saved him hours on various tasks, from checking assumptions to generating program ideas. He currently views language models as mediocre but useful assistants rather than independent researchers.

OpenAI researcher Sébastien Bubeck describes a stronger example: GPT-5 tackled a complex mathematical problem for him, designed the solution, ran a simulation to verify it, and produced a full proof. According to Bubeck, the task would have taken him about a month; GPT-5 finished it in an afternoon, which he called the "most impressive LLM output" he has seen.

Summary
  • Physicist Steve Hsu published a paper based on an idea that he says came from GPT-5.
  • Hsu points out that, despite advances, human expertise is still essential since even experienced researchers can make mistakes that AI might not catch.
  • He believes collaboration between humans and AI has strong potential to improve the quality of scientific work.
Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.