Summary

A Stanford study shows that ChatGPT outperforms medical students on complex case-based questions, prompting a rethink of medical education.

Researchers at Stanford have found that ChatGPT can outperform first- and second-year medical students in answering complex clinical care questions.

The study, published in JAMA Internal Medicine, highlights the growing influence of AI on medical education and practice and suggests that adjustments in teaching methods may be needed for future physicians.

"We don't want doctors who were so reliant on AI at school that they failed to learn how to reason through cases on their own," says co-author Alicia DiGiammarino, education manager at the School of Medicine. "But I'm more scared of a world where doctors aren't trained to effectively use AI and find it prevalent in modern practice."


AI beats medical students

Recent studies have demonstrated ChatGPT's ability to handle multiple-choice questions on the United States Medical Licensing Examination (USMLE). But the Stanford authors wanted to examine the AI system's ability to handle more difficult, open-ended questions used to assess clinical reasoning skills.

The study found that, on average, the AI model scored more than four points higher than medical students on the case report portion of the exam. This result suggests the potential for AI tools like ChatGPT to disrupt traditional teaching and testing of medical reasoning through written text. The researchers also noted a significant jump from GPT-3.5, which was "borderline passing" on the questions.

ChatGPT and other programs like it are changing how we teach and ultimately practice medicine.

Alicia DiGiammarino

Despite its impressive performance, ChatGPT is not without its shortcomings. The biggest danger is invented facts, so-called hallucinations or confabulations. These have been significantly reduced in OpenAI's latest model, GPT-4, which is available to paying customers and via API, but they are still very much present.

You can imagine how even sporadic errors can have dramatic consequences when it comes to medical topics. Embedded in an overall curriculum with multiple sources of truth, however, this seems like a much smaller problem.

Stanford's School of Medicine cuts students' access to ChatGPT in exams

Concerns about exam integrity and ChatGPT's influence on curriculum design are already being felt at Stanford's School of Medicine. Administrators have switched from open-book to closed-book exams to ensure that students develop clinical reasoning skills without relying on AI. But they have also created an AI working group to explore the integration of AI tools into medical education.


Beyond education, there are other areas where AI can have a significant impact on healthcare. For example, medical AI startup Insilico Medicine recently administered the first dose of a generative AI drug to patients in a Phase II clinical trial.

Google is field-testing Med-PaLM 2, a version of its large language model PaLM 2 fine-tuned to answer medical questions. Another study suggests that GPT-4 can help doctors answer patients' questions with more detail and empathy. Yes, you read that right: more empathy.

Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.