Companies like Google are developing language models optimized for medical purposes. Microsoft believes that GPT-4 is sufficient.
According to Microsoft, large language models can help speed up medical processes by, for example, structuring "large unstructured data" that currently requires time-consuming manual processing.
As an example, Microsoft cites the development of cancer drugs, where many clinical trials have to be abandoned due to insufficient recruitment and billions of dollars are wasted in lengthy processes.
Large language models such as GPT-4 could significantly accelerate such processes by efficiently abstracting patient information from large clinical texts. Their impact here could be as transformative as it has been for programming and productivity applications.
GPT-4 achieves SOTA results without special medical training
Although GPT-4 was trained only on generic Internet data and not on specific medical data, it was able to structure complex clinical studies according to specified criteria. In this respect, it outperforms current systems such as Criteria2Query, even though they were developed specifically for this task.
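To make "structuring according to specified criteria" more concrete, here is a minimal Python sketch of what such a step might look like. It is an illustration only, not Microsoft's method: the schema, the prompt wording, and the names `Criterion`, `build_prompt`, and `parse_criteria` are all hypothetical, and the model reply is a hand-written stand-in rather than real GPT-4 output.

```python
import json
from dataclasses import dataclass


@dataclass
class Criterion:
    kind: str    # "inclusion" or "exclusion" (hypothetical schema)
    entity: str  # e.g. a condition, drug, or measurement
    text: str    # the original sentence, kept for traceability


def build_prompt(criteria_text: str) -> str:
    """Build an instruction asking a model to return criteria as a JSON list."""
    return (
        "Extract each eligibility criterion from the trial text below.\n"
        'Return a JSON list of objects with keys "kind" '
        '("inclusion" or "exclusion"), "entity", and "text".\n\n'
        f"Trial text:\n{criteria_text}"
    )


def parse_criteria(model_output: str) -> list[Criterion]:
    """Parse the model's JSON reply, dropping items that break the schema."""
    parsed = []
    for item in json.loads(model_output):
        if item.get("kind") in {"inclusion", "exclusion"} and item.get("entity"):
            parsed.append(Criterion(item["kind"], item["entity"], item.get("text", "")))
    return parsed


# Hand-written stand-in for a model reply; no model is called here.
sample_reply = json.dumps([
    {"kind": "inclusion", "entity": "non-small cell lung cancer", "text": "Diagnosed with NSCLC."},
    {"kind": "exclusion", "entity": "pregnancy", "text": "Pregnant patients are excluded."},
    {"kind": "unknown", "entity": "age", "text": "Malformed item."},
])
criteria = parse_criteria(sample_reply)
```

The malformed third item is discarded, so `criteria` holds two structured entries. Validating the reply against a fixed schema is one simple way to keep free-form model output usable by downstream systems.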
OpenAI's large language model can also achieve expert-level performance on medical question-answering datasets such as MedQA (USMLE exam) without requiring "costly task-specific fine-tuning or intricate self-refinement", according to the report.
Microsoft has also introduced language models such as BioGPT specifically for medical tasks, but is now making it clear that it will rely primarily on GPT-4 in the future.
GPT-4 could also structure patient data sets on a large scale, for example in cancer research. The model could act as a kind of super-organizer, enabling the use of real-world data on an unprecedented scale.
"Although pretrained on general web content, GPT-4 has demonstrated impressive competence in biomedical tasks straightaway and has the potential to perform previously unseen natural language processing (NLP) tasks in the biomedical domain with exceptional accuracy."
Microsoft
Toward evidence-based precision medicine
LLMs could also serve as universal annotators, supporting the training of other models by generating labeled examples from unstructured data or finding cause-and-effect relationships.
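The annotator idea can be sketched in a few lines of Python. Everything here is a toy under stated assumptions: `stub_llm_labeler` is a keyword rule standing in for an LLM, the label names and clinical notes are invented, and the word-count "model" is only a placeholder for whatever smaller model would actually be trained on the generated labels.

```python
from collections import Counter


def annotate(texts, labeler):
    """Label each text with the annotator; keep only items that received a label."""
    labeled = []
    for text in texts:
        label = labeler(text)
        if label is not None:
            labeled.append((text, label))
    return labeled


def train_keyword_model(labeled):
    """Count word frequencies per label: a toy stand-in for a downstream model."""
    counts = {}
    for text, label in labeled:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def predict(model, text):
    """Pick the label whose training texts share the most words with the input."""
    words = text.lower().split()
    return max(model, key=lambda label: sum(model[label][w] for w in words))


def stub_llm_labeler(text):
    """Hypothetical stand-in for an LLM annotator (a simple keyword rule)."""
    if "nausea" in text:
        return "adverse_event"
    if "reduced" in text:
        return "response"
    return None


# Invented example notes, not real patient data.
notes = [
    "patient reports severe nausea after chemotherapy",
    "tumor size reduced on follow-up scan",
    "nausea and vomiting since second cycle",
    "routine appointment scheduled",
]
labeled = annotate(notes, stub_llm_labeler)
model = train_keyword_model(labeled)
prediction = predict(model, "persistent nausea today")
```

The unlabeled fourth note is skipped, and the remaining three become training examples for the smaller model. In a real pipeline, the stub would be replaced by a call to a language model and the labels would need spot checks by human experts.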
Another application for AI in medicine would be multimodal models that can process medical images as well as genetic, protein, and other types of biological data in addition to text.
Microsoft is developing LLaVA-Med, a kind of chatbot for biomedical imaging data that is available to medical professionals. Google also recently unveiled Med-PaLM M, a multimodal medical model that can solve medical tasks in many domains and offers a chat mode.
The ultimate goal, according to Microsoft's research team, is "precision health copilots" that can assist anyone involved in biomedical processes. They would provide a real-time view of large amounts of health data, accelerate care and new discoveries, and ensure a closer connection between clinical research and care.
Any clinical observation could be used immediately to update the patient's health status. This would enable physicians and caregivers to make decisions based on the latest and most comprehensive evidence.
"This vision embodies the dream of evidence-based precision health. Generative AI, including large language models, will play a pivotal role in propelling us towards this exciting and transformative future."