
OpenAI doesn't see AI as a replacement for doctors, but as a replacement for not going to the doctor at all.


"I really don’t think you end up displacing doctors," said Nick Turley, who heads up ChatGPT at OpenAI, on the company's official podcast. “You end up displacing not going to the doctor.”

Turley argues that AI systems like ChatGPT aren't meant to take jobs away from medical professionals, but to empower patients - especially in places where access to care is limited. "You end up democratizing the ability to get a second opinion," he said. "Very few people have that resource or know to take advantage of a resource like that."

ChatGPT in the hands of medical professionals

This kind of support isn't just for patients. Doctors themselves are already using ChatGPT to double-check their thinking or gain new perspectives. But for AI to truly earn trust in medicine, Turley says it's not enough for the models to simply be good: "There's work to make the model really, really good - and there's also work to prove that the model is really good."


Both users and, especially, professionals need a clear understanding of where these models are reliable and where they're not. Until there's solid proof and systematic testing, trust will remain a major challenge for AI-powered medicine. As models get better, Turley warns, it actually becomes harder to spot and communicate their limits. "Once it gets to human and then superhuman level performances, it's hard to frame exactly where it will fall short," he said.

Still, Turley sees enormous potential: "That opportunity is one of the things that gets me up in the morning." Alongside education, he believes healthcare is where AI could have the biggest impact on society.

Benchmarks are one thing - real-world medicine is another

OpenAI reports in a recent benchmark that its latest models, GPT-4.1 and o3, outperform doctors' responses in medical dialogue scenarios. At the same time, new systems like Microsoft's MAI-DxO show that orchestrated AI models can even surpass experienced physicians in complex diagnoses - both in accuracy and cost efficiency.

But the tests themselves are highly controlled, and direct comparisons to real clinical settings are limited. Just because AI systems perform well in benchmarks doesn't mean they're proven in real-world interactions. For example, a study from the University of Oxford found that people sometimes made worse medical decisions with AI assistance than a control group using a search engine, often because the conversation with the chatbot broke down. Yet there are also regular reports from users who say ChatGPT helped diagnose a rare disease after years of searching for answers.

Summary
  • OpenAI's Nick Turley says AI systems like ChatGPT are not intended to replace doctors, but to help people who might otherwise avoid seeking medical care, by making second opinions and basic support more accessible.
  • Both patients and doctors are already using ChatGPT in medical contexts, but Turley stresses that earning trust will depend on thorough proof and systematic testing, since it's difficult to define the limits of AI as it approaches or surpasses human-level performance.
  • While OpenAI and Microsoft report that their latest AI models outperform doctors in controlled benchmarks, real-world results are mixed, with some studies showing users sometimes make worse decisions with AI, even as others credit tools like ChatGPT with helping diagnose rare conditions.
Max is the managing editor of THE DECODER, bringing his background in philosophy to explore questions of consciousness and whether machines truly think or just pretend to.