Google's medical language model "Med-PaLM 2" enters pilot phase with first customers

Update

Google is rolling out Med-PaLM 2 on a limited basis for initial testing.

Update May 17, 2023:

Google has published the Med-PaLM 2 paper.

Update April 14, 2023:

Google Cloud announces that Med-PaLM 2 will be rolled out to select Google Cloud customers for a "limited test" in the coming weeks. The goal, the company says, is to explore safe, responsible and meaningful use scenarios.

The medical language model could "facilitate rich, informative discussions, answer complex medical questions, and find insights in complicated and unstructured medical texts," according to Google. It can also generate short and long answers to medical questions and create summaries from internal documentation and data sets, as well as from scientific sources.

According to Google, Med-PaLM 2 is the first language model to achieve expert-level performance on U.S. Medical Licensing Examination (USMLE)-style questions with more than 85 percent accuracy. In the MedMCQA dataset, which includes questions from India's AIIMS and NEET medical exams, it achieved a "pass rate" of 72.3 percent.

Original article, March 18, 2023:

Google's medical language model Med-PaLM 2 passes exam questions

Med-PaLM is Google's variant of the PaLM language model optimized for medical questions. The latest version is designed to answer medical questions reliably at an expert level.

In December 2022, Google unveiled Med-PaLM, a version of its large PaLM (Pathways Language Model) optimized for answering medical questions. Med-PaLM was developed using a special soft prompting method combined with responses to medical prompts written by four clinicians.

Med-PaLM performed at the level of medical professionals in most of the benchmarks tested. Potentially harmful responses were generated 5.9 percent of the time, compared with 5.7 percent for human experts, the research team said.

Med-PaLM was also the first AI model to potentially pass the U.S. Medical Licensing Examination (67.2 percent correct when tested with "licensing-style questions," 60 percent required), correctly answering multiple-choice and open-ended questions and reasoning about its answers.

Med-PaLM 2 is even more accurate - but still has gaps

As part of Google Health's "The Check Up" event, Google announced the continued development of Med-PaLM. The current version, Med-PaLM 2, can answer medical exam questions at an "expert doctor level," getting 85 percent of them right.

A chart shows Med-PaLM 2's performance on medical licensing questions compared to other language models. Med-PaLM 2 achieves over 85 percent correct answers, while all other language models remain below the critical 60 percent mark. | Image: Google AI

This means that Med-PaLM 2 scores about 18 percentage points higher than its predecessor (85 percent versus 67.2 percent) and is well above the level of comparable language models in medical tasks. However, the team still sees significant room for improvement to ensure that Med-PaLM 2 meets Google's quality standards. The team has not disclosed what changed technically compared to the first Med-PaLM.
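The jump from 67.2 percent (Med-PaLM) to roughly 85 percent (Med-PaLM 2) can be read either as a difference in percentage points or as a relative improvement. Below is a minimal sketch of that arithmetic. It assumes exactly 85.0 for the "over 85 percent" figure, and the variable and function names are purely illustrative.

```python
# Scores as reported in the article. "Over 85 percent" is treated as
# exactly 85.0 here, which is an assumption made for illustration.
med_palm_1 = 67.2
med_palm_2 = 85.0


def point_gain(old: float, new: float) -> float:
    """Absolute difference in percentage points."""
    return new - old


def relative_gain(old: float, new: float) -> float:
    """Improvement relative to the old score, in percent."""
    return (new - old) / old * 100.0


points = point_gain(med_palm_1, med_palm_2)
relative = relative_gain(med_palm_1, med_palm_2)
print(f"{points:.1f} percentage points, {relative:.1f}% relative")
```

Under these assumptions the gain is about 17.8 percentage points, which matches the reported figure of 18 when rounded; measured relative to the older score, the improvement is larger.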

Med-PaLM 2 was tested against 14 criteria, including scientific factuality, accuracy, medical consensus, reasoning, bias, and harm, evaluated by clinicians and non-clinicians from diverse backgrounds and countries. The team found "significant gaps when it comes to answering medical questions," without elaborating on the shortcomings.

Working with research teams, Google plans to further develop Med-PaLM to address these gaps and understand how the language model can improve healthcare. The video below shows the announcement of Med-PaLM 2 starting at approximately 16:30.
