A field study by researchers at Cambridge and Harvard Universities explores whether large language models (LLMs) democratize access to dual-use biotechnologies, meaning research and technologies that can serve both beneficial and harmful purposes.
The research team's central thesis is that language models make expert knowledge more accessible. Such an AI tutor certainly has many advantages. In the study, however, the team focuses on a negative scenario: whether LLMs enable individuals without formal training to identify, acquire, and release viruses capable of causing catastrophic harm.
Classroom Exercise: Design a Pandemic Virus
As part of a classroom exercise at MIT, the research team asked non-scientist students to use large language models to obtain information about potential pandemic agents and their characteristics, sources for samples of infectious viruses, whether those viruses could be replicated, and how to obtain the necessary equipment and resources.
Students used popular chatbots such as ChatGPT with GPT-4 and GPT-3.5, Bing Chat, and Bard, as well as a number of other chatbots and open-source models, including FreedomGPT. They were given one hour to complete the task.
According to the research team, within an hour, the chatbots suggested four potential pandemic pathogens. They explained how these could be made from synthetic DNA using reverse genetics, and named DNA synthesis companies that were unlikely to verify orders.
They also provided detailed protocols, pointed out potential mistakes, and explained how to correct them. For those unfamiliar with reverse genetics, one piece of advice was to hire a contract research organization.
Inadequate LLM safeguards lead to a dystopian outlook
At the same time, students were asked to find ways to circumvent the safeguards built into some language models using appropriately crafted text prompts.
Two groups found a workaround in the "Do Anything Now" (DAN) jailbreak, in which the chatbot is led to believe the request serves a positive purpose and is told that humanity faces an existential risk if it does not answer. A third group simply framed its questions as expressions of concern and got all the answers it wanted without much trickery.
These results strongly suggest that the existing evaluation and training process for LLMs, which relies heavily on reinforcement learning with human feedback (RLHF), is inadequate to prevent them from providing malicious actors with accessible expertise relevant to inflicting mass death. New and more reliable safeguards are urgently needed.
From the paper
The researchers' conclusion could hardly be more dystopian: if chatbots give people without bioscience training access to pandemic pathogens, the number of individuals capable of killing tens of millions would increase dramatically. But the research team also proposes ways to mitigate this risk.
Possible solutions: Clean datasets, independent testing, and universal DNA screening
To mitigate these risks, the authors suggest several strategies, including curating the training datasets of LLMs and having new LLMs that are at least as large as GPT-3 evaluated by independent third parties. Open-source teams should adopt these safeguards as well; otherwise, their raison d'être could be called into question.
If biotechnology and information security experts were to identify the set of publications most relevant to causing mass death, and LLM developers curated their training datasets to remove those publications and related online information, then future models trained on the curated data would be far less capable of providing anyone intent on harm with conceptual insights and recipes for the creation or enhancement of pathogens.
From the paper
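In practice, the curation step described above amounts to filtering the training corpus against an expert-maintained blocklist before training begins. The sketch below illustrates the idea in Python; the file names, the JSONL corpus format, and the "source_id" field are assumptions for illustration, not details from the paper.

```python
# Minimal sketch of the proposed curation step: filtering an LLM training
# corpus against a blocklist of publications flagged by domain experts.
# The blocklist file, corpus format, and "source_id" field are hypothetical
# placeholders; the paper does not specify an implementation.
import json


def load_blocklist(path: str) -> set[str]:
    """Load identifiers (e.g. DOIs) of publications flagged for removal."""
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def curate_corpus(in_path: str, out_path: str, blocklist: set[str]) -> None:
    """Copy a JSONL corpus, dropping documents whose source ID is flagged."""
    kept = dropped = 0
    with open(in_path, encoding="utf-8") as src, open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            doc = json.loads(line)
            if doc.get("source_id") in blocklist:
                dropped += 1
                continue
            dst.write(line)
            kept += 1
    print(f"kept {kept} documents, removed {dropped} flagged documents")


if __name__ == "__main__":
    blocklist = load_blocklist("flagged_publications.txt")
    curate_corpus("corpus.jsonl", "curated_corpus.jsonl", blocklist)
```

The filtering itself is trivial; the hard part the authors point to is the expert work of deciding which publications and related online material belong on the blocklist in the first place.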
Another proposed safeguard is universal screening of DNA synthesis orders. But not all companies in the field screen orders, and those that do may not use up-to-date databases or robust screening methods, the researchers say. Better and more widely adopted DNA screening methods are therefore needed.
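At its simplest, such screening means comparing an incoming synthesis order against a database of sequences of concern. The sketch below shows the basic idea with naive k-mer matching in Python; the placeholder database, the 30-base window, and the function names are illustrative assumptions, and production screening systems rely on far more robust methods such as curated hazard databases and homology search.

```python
# Minimal sketch of screening a DNA synthesis order against a database of
# flagged sequences, as a rough illustration of universal order screening.
# The sequences and the 30-mer matching rule are placeholders, not real
# data or a vetted screening method.

KMER_SIZE = 30  # window length used for matching; an arbitrary choice here


def kmers(sequence: str, k: int = KMER_SIZE) -> set[str]:
    """Return all k-length windows of a DNA sequence, uppercased."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(hazard_db: dict[str, str], k: int = KMER_SIZE) -> dict[str, str]:
    """Map every k-mer of every flagged sequence to the entry it came from."""
    index: dict[str, str] = {}
    for name, seq in hazard_db.items():
        for kmer in kmers(seq, k):
            index[kmer] = name
    return index


def screen_order(order_seq: str, index: dict[str, str], k: int = KMER_SIZE) -> set[str]:
    """Return the names of flagged entries that share a k-mer with the order."""
    return {index[kmer] for kmer in kmers(order_seq, k) if kmer in index}


if __name__ == "__main__":
    # Placeholder database: the name and sequence are synthetic examples only.
    hazard_db = {"example_flagged_sequence": "ATGC" * 20}
    index = build_index(hazard_db)
    hits = screen_order("TTTT" + "ATGC" * 10 + "GGGG", index)
    print("flag for human review:" if hits else "no matches:", hits or "order looks clean")
```

A matched order would then be escalated to human review rather than rejected automatically, since legitimate research can also involve flagged sequences.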