Fields Medalist says ChatGPT 5.5 Pro delivered "PhD-level" math research in under two hours with zero human help

Fields Medalist Timothy Gowers had ChatGPT 5.5 Pro tackle open problems in number theory. The model improved an exponential bound to a polynomial one in under an hour. An MIT researcher involved calls the key idea “completely original.” Gowers’ takeaway: the bar for mathematical contributions is now proving something LLMs can’t.

Google Deepmind takes a stake in EVE Online studio to test AI models

Google Deepmind is acquiring a minority stake in the studio behind the space MMO EVE Online and will use the game as a testing ground for AI models. At the same time, developer CCP Games is buying itself back from South Korean owner Pearl Abyss for $120 million, a significant discount from the $225 million Pearl Abyss paid in 2018, and is rebranding as Fenris Creations.

Deepmind is running an offline version of EVE Online on a local server to study how models handle long-term planning, memory, and continuous learning. The live server, Tranquility, is unaffected by the research.

Deepmind has a long history of using games as AI testing environments, from AlphaGo to AlphaStar to Atari benchmarks. According to Deepmind director Alexandre Moufarek, EVE's complexity makes it a strong sandbox for general AI research. Fenris CEO Hilmar Veigar Pétursson plans to share more details in mid-May.

Source: EVE

Same prompt, different morals: how frontier AI models diverge on ethical dilemmas

A new benchmark puts leading language models through 100 everyday ethical scenarios, from data misuse in sales to protocol violations in oncology. Behind the results lies a bigger question: who decides what an AI is allowed to do, and whose ethics does it follow?

Google Deepmind's "AI co-clinician" beats GPT-5.4 in blind doctor tests but still trails experienced physicians

Google Deepmind is building an “AI co-clinician” to help doctors care for patients. The system shows promising results in simulation studies but still trails experienced physicians. The research also shows why ChatGPT’s voice mode isn’t ready for serious tasks, let alone medical consultations.