Terence Tao, widely regarded as a mathematical prodigy, says that AI still lacks what he calls a mathematical "sense of smell."
According to Tao, even when generative AI produces flawed proofs, the results often look perfect on the surface. However, Tao points out, "the errors are often really subtle, and then when you spot them, they're really stupid. No human would have actually made that mistake."
What's missing, he argues, is what he terms a "metaphorical mathematical smell" – the human intuition that warns you when something just doesn't add up, a faculty for which, he notes, "it's not clear how to get the AI to duplicate that eventually."
This kind of gut feeling still can't be reproduced by AI, according to Tao, which is why human judgment remains crucial in mathematics. Generative models in particular tend to get stuck once they take the wrong approach. Tao observes that "where the AI really struggles right now is knowing when it's made a wrong turn." Hybrid AI systems that combine neural networks with symbolic reasoning are a different matter.
"AlphaZero and similar systems made progress on go and chess. In some sense, they have developed a 'sense of smell' for go and chess positions, knowing that a position is good for white or for black. They can't articulate why. But just having that 'sense of smell' lets them strategize. So, if AIs gain that ability to have a sense of the viability of certain proof strategies, you could propose to break up a problem into two smaller subtasks, and they might say, 'Well, this looks good; these two tasks look like they're simpler than your main task, and they still have a good chance of being true.'"
Terence Tao
AlphaZero uses Monte Carlo Tree Search (MCTS) as a "symbolic framework" to select moves during play and training. This search algorithm explores possible game paths as symbolic states. Still, AlphaZero is fundamentally a deep reinforcement learning system powered by neural networks, learning through self-play and capturing its "knowledge" in millions of parameters.
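The search loop described above can be sketched in miniature. The following is an illustrative Monte Carlo Tree Search over a toy take-away game (not AlphaZero's actual implementation: the game, the UCB exploration constant, and the use of random rollouts instead of a learned value network are all simplifying assumptions):

```python
import math
import random

class NimState:
    """Toy game: players alternately remove 1 or 2 stones; taking the last stone wins."""
    def __init__(self, stones, player=1):
        self.stones, self.player = stones, player  # player is 1 or -1

    def moves(self):
        return [m for m in (1, 2) if m <= self.stones]

    def play(self, m):
        return NimState(self.stones - m, -self.player)

    def winner(self):
        # After the last stone is taken, the previous player (the taker) has won.
        return -self.player if self.stones == 0 else None

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}          # move -> Node
        self.visits, self.value = 0, 0.0

def ucb(child, parent_visits, c=1.4):
    # Upper Confidence Bound: balances exploiting good moves and exploring rare ones.
    if child.visits == 0:
        return float("inf")
    return child.value / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def rollout(state):
    # Random playout to a terminal state; AlphaZero replaces this with a neural value estimate.
    while state.winner() is None:
        state = state.play(random.choice(state.moves()))
    return state.winner()

def mcts(root_state, iterations=3000):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes via UCB.
        while node.state.winner() is None and len(node.children) == len(node.state.moves()):
            node = max(node.children.values(),
                       key=lambda ch, p=node: ucb(ch, p.visits))
        # 2. Expansion: add one untried move, if the node is not terminal.
        if node.state.winner() is None:
            m = random.choice([m for m in node.state.moves() if m not in node.children])
            node.children[m] = Node(node.state.play(m), node)
            node = node.children[m]
        # 3. Simulation.
        w = rollout(node.state)
        # 4. Backpropagation: credit a win to the player who moved into each node.
        while node is not None:
            node.visits += 1
            if node.parent is not None and w == node.parent.state.player:
                node.value += 1
            node = node.parent
    # Return the most-visited move from the root.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

The tree of "symbolic" game states is what the article calls the symbolic framework; the rollout function stands in for the learned evaluation that gives AlphaZero its "sense of smell" for positions.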
Some believe that combining the strengths of large language models with symbolic reasoning could lead to major advances in AI, since pure LLMs – even those with some reasoning abilities – may run into dead ends.
Tao has previously described OpenAI's reasoning model o1 as "mediocre, but not completely incompetent" – a research assistant capable of handling routine tasks but still lacking creativity and flexibility. He is also involved in developing the FrontierMath benchmark, which sets especially challenging math problems for AI systems.