Summary

Google is rolling out Med-PaLM 2 on a limited basis for initial testing.

Update May 17, 2023:

Google has published the Med-PaLM 2 paper.

Update April 14, 2023:

Google Cloud announces that Med-PaLM 2 will be rolled out to select Google Cloud customers for a "limited test" in the coming weeks. The goal, the company says, is to explore safe, responsible and meaningful use scenarios.


The medical language model could "facilitate rich, informative discussions, answer complex medical questions, and find insights in complicated and unstructured medical texts," according to Google. It can also generate short and long answers to medical questions and create summaries from internal documentation and data sets, as well as from scientific sources.

According to Google, Med-PaLM 2 is the first language model to achieve expert-level performance on U.S. Medical Licensing Examination (USMLE)-style questions with more than 85 percent accuracy. In the MedMCQA dataset, which includes questions from India's AIIMS and NEET medical exams, it achieved a "pass rate" of 72.3 percent.

Original article, March 18, 2023:

Google's medical language model Med-PaLM 2 passes exam questions

Med-PaLM is Google's variant of the PaLM language model optimized for medical questions. The latest version is designed to answer medical questions reliably at an expert level.

In December 2022, Google unveiled Med-PaLM, a version of its giant PaLM (Pathways Language Model) optimized for answering medical questions. Med-PaLM was developed using a special soft prompting method combined with responses to medical prompts written by four clinicians.


Med-PaLM performed at the level of medical professionals in most of the benchmarks tested. Potentially harmful responses were generated 5.9 percent of the time, compared with 5.7 percent for human experts, the research team said.

Med-PaLM was also the first AI model to potentially pass the U.S. Medical Licensing Examination (67.2 percent correct when tested with "licensing-style questions," 60 percent required), correctly answering multiple-choice and open-ended questions and reasoning about its answers.

Med-PaLM 2 is even more accurate - but still has gaps

As part of Google Health's "The Check Up" event, Google announced the continued development of Med-PaLM. The current version, Med-PaLM 2, can answer medical exam questions at an "expert doctor level" and is accurate 85 percent of the time.

Med-PaLM 2's performance on medical licensing questions compared to other language models. Med-PaLM 2 achieves over 85 percent correct answers, while all other language models remain below the critical 60 percent mark. | Image: Google AI

This means that Med-PaLM 2 scores about 18 percentage points higher than its predecessor (85 percent versus 67.2 percent) and is well above the level of comparable language models in medical tasks. However, the team still sees significant room for improvement to ensure that Med-PaLM 2 meets Google's quality standards. As for technical changes from the first Med-PaLM, the team is tight-lipped.
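The size of that gain can be checked from the two scores reported in this article: 67.2 percent for Med-PaLM and 85 percent for Med-PaLM 2. The short sketch below is only an illustration of the arithmetic. The function names are made up for this example, and 85.0 is used as a lower bound because Google says "more than 85 percent" without giving an exact figure here.

```python
# Illustrative arithmetic only; the function names are hypothetical helpers,
# not part of any Google tool or paper.

def percentage_point_gain(old_pct: float, new_pct: float) -> float:
    """Absolute difference between two accuracy scores, in percentage points."""
    return new_pct - old_pct


def relative_gain(old_pct: float, new_pct: float) -> float:
    """Improvement relative to the old score, expressed in percent."""
    return (new_pct - old_pct) / old_pct * 100


MED_PALM_SCORE = 67.2    # reported in this article for the first Med-PaLM
MED_PALM_2_SCORE = 85.0  # assumption: lower bound of "more than 85 percent"

points = percentage_point_gain(MED_PALM_SCORE, MED_PALM_2_SCORE)
relative = relative_gain(MED_PALM_SCORE, MED_PALM_2_SCORE)

print(f"Gain: {points:.1f} percentage points")      # Gain: 17.8 percentage points
print(f"Relative improvement: {relative:.1f} %")    # Relative improvement: 26.5 %
```

Under these assumptions, the roughly 18-point figure is an absolute difference in percentage points. Measured relative to the older score, the improvement would be closer to 26 percent.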


Med-PaLM 2 was tested against 14 criteria, including scientific factuality, accuracy, medical consensus, reasoning, bias, and harm, evaluated by clinicians and non-clinicians from diverse backgrounds and countries. The team found "significant gaps when it comes to answering medical questions," without elaborating on the shortcomings.

Working with research teams, Google plans to further develop Med-PaLM to address these gaps and understand how the language model can improve healthcare. The video below shows the announcement of Med-PaLM 2 starting at approximately 16:30.

  • Med-PaLM is a version of Google's large PaLM language model optimized for medical questions.
  • According to Google, the latest version, Med-PaLM 2, provides reliable answers at an "expert doctor level".
  • However, the model still has "significant gaps" that Google plans to fill by working with experts and exploring use cases.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.