Terence Tao says GPT-5.2 Pro cracked an Erdős problem, but warns the win says more about speed than difficulty
Key Points
- Mathematician Terence Tao reports that OpenAI's GPT-5.2 Pro has solved an Erdős problem "more or less autonomously," marking a notable achievement for AI in mathematical research.
- But Tao cautions against overinterpreting the result: Many Erdős problems have never been systematically studied, and he estimates that only one to two percent of the currently open Erdős problems are simple enough for today's AI tools to solve with minimal human assistance.
- For Tao, the most interesting aspect isn't the solution itself, but AI's ability to rapidly create and revise different versions of mathematical texts.
Mathematician Terence Tao just documented a milestone in applying AI to math problems. But he's also warning people not to read too much into it.
Mathematician Paul Erdős spent his lifetime formulating hundreds of open problems. These so-called Erdős problems range widely in difficulty: some rank among the hardest unsolved questions in math, while others are more like side notes that nobody ever seriously tackled. The website erdosproblems.com keeps track of them all.
GPT-5.2 Pro cracks an Erdős problem on its own
According to Tao, AI tools have hit a milestone: Erdős problem #728 was solved "more or less autonomously" by ChatGPT after some initial feedback. The solution addresses the question as originally posed and wasn't simply pulled from an existing result in the literature.
That's a significant step compared to previous claims. OpenAI researchers had said before that a GPT model "found" the solution to an Erdős problem. Technically true, but the AI had just dug up an existing solution through a literature search. It hadn't actually developed a new proof.
On January 4, GPT-5.2 Pro, OpenAI's most capable model, produced a proof for a tightened version of the problem. Another AI tool called Aristotle then translated this proof into Lean, a formal language that can automatically verify math proofs for correctness. The AI-generated proof had some minor errors, but Aristotle fixed them automatically.
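To give a rough sense of what a machine-checkable proof looks like, here is a toy Lean 4 example. It has nothing to do with problem #728 or with Aristotle's actual output; the theorem name is made up for illustration, and the statement (addition of natural numbers is commutative) is deliberately trivial.

```lean
-- Toy illustration only, unrelated to Erdős problem #728.
-- The name `add_comm_toy` is hypothetical; `Nat.add_comm` is a lemma
-- from Lean's core library.
theorem add_comm_toy (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

If the proof term did not actually establish the stated claim, Lean would reject the file. That is the sense in which a formalized proof is verified automatically rather than checked by a human reader.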
The real win is speed, not the solution itself
For Tao, the most interesting part is the speed at which you can now draft and revise mathematical text. Several community members used different AI tools to translate the formal proof into plain language, fill in gaps, and connect it to existing research.
According to Tao, the result is "still somewhat clunky and 'AI' in feel" but it's readable enough to follow the proof's core ideas. After a few rounds of cleanup, Tao says the final version falls "within ballpark of an acceptable standard for a research paper."
"This is sharp contrast to existing practice where the effort required to produce even one readable manuscript is quite time-consuming," Tao writes. He's previously expressed optimism about AI helping "industrialize" math, which could speed up scientific progress overall.
Context matters when evaluating AI breakthroughs
At the same time, Tao is careful to temper expectations. In a GitHub wiki tracking AI contributions to Erdős problems, he lays out several caveats for interpreting these wins.
Erdős problems vary in difficulty by "several orders of magnitude," and many of the easier ones have never been seriously studied. So if a problem sat around for 50 years before an AI cracked it, that doesn't mean it "resisted all human efforts" for half a century. More likely, nobody ever really tried.
There's also the issue of incomplete literature reviews: it's already happened multiple times that an AI "solved" a problem listed as open, only for someone to discover the solution had already been published. And since failures rarely get reported, Tao warns against drawing conclusions about these tools' actual success rates.
Harder problems still need human guidance
In a follow-up Mastodon post, Tao points to a pattern: the more AI involvement, the simpler the solution tends to be. That's partly a selection effect, he explains. Autonomous AI workflows scale well, so they're better suited for tackling the long tail of obscure problems that often have straightforward solutions.
More complex problems still require humans and AI working together. The model handles key calculations or pieces of the proof, while humans map out the overall strategy. Tao estimates that only around one to two percent of currently open Erdős problems are simple enough for today's AI tools to solve with minimal human help.