A field study by Cambridge and Harvard Universities explores whether large language models (LLMs) democratize access to dual-use biotechnologies, technologies that can be used for both beneficial and harmful purposes.

The research team's basic thesis is that language models make expert knowledge more accessible. Such a personal tutor certainly has many advantages. In the study, however, the team focuses on a negative scenario: whether LLMs enable individuals without formal training to identify, acquire, and release viruses capable of causing catastrophic harm.

Classroom Exercise: Design a Pandemic Virus

As part of a classroom exercise at MIT, the research team tasked non-scientist students with using large language models to obtain information about potential pandemic agents and their characteristics, sources of samples of infectious viruses, the replicability of those viruses, and how to obtain the necessary equipment and resources.

Students used popular chatbots such as ChatGPT with GPT-4, GPT-3.5, Bing, Bard, and a number of other chatbots and open-source models, including FreedomGPT. They were given one hour to complete the task.

According to the research team, within an hour, the chatbots suggested four potential pandemic pathogens. They explained how these could be made from synthetic DNA using reverse genetics, and named DNA synthesis companies that were unlikely to verify orders.

They also provided detailed protocols, listed potential mistakes, and explained how to fix them. For anyone unfamiliar with reverse genetics, one piece of advice was to hire a contract research organization.

Inadequate LLM safeguards lead to a dystopian outlook

At the same time, the students were asked to find ways to get around the safety guardrails built into some language models using suitable text prompts.

Two groups found a solution in the "Do Anything Now" jailbreak, in which the chatbot is led to believe the request has a positive intent and is told that an existential risk to humanity looms if it does not respond. A third group simply led the chatbots to believe the students were asking out of genuine concern and got all the answers they wanted without much trickery.

These results strongly suggest that the existing evaluation and training process for LLMs, which relies heavily on reinforcement learning with human feedback (RLHF), is inadequate to prevent them from providing malicious actors with accessible expertise relevant to inflicting mass death. New and more reliable safeguards are urgently needed.

From the paper

The researchers' conclusion could hardly be more dystopian: if chatbots give people without bioscience training access to pandemic pathogens, the number of individuals capable of killing tens of millions would increase dramatically. But the research team also proposes possible solutions to this risk.


Possible solutions: Clean datasets, independent testing, and universal DNA screening

To mitigate these risks, the authors suggest several strategies, including curating the training datasets of LLMs and having new LLMs evaluated by third parties if they are at least as large as GPT-3. Open-source teams should adopt these safety measures as well, or their raison d'être could be called into question.

If biotechnology and information security experts were to identify the set of publications most relevant to causing mass death, and LLM developers curated their training datasets to remove those publications and related online information, then future models trained on the curated data would be far less capable of providing anyone intent on harm with conceptual insights and recipes for the creation or enhancement of pathogens.

From the paper
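A minimal sketch of what such dataset curation could look like in practice is shown below, assuming a hypothetical blocklist of document identifiers compiled by biosecurity experts. The file names, record format, and matching logic are illustrative assumptions, not the method prescribed in the paper.

```python
# Hypothetical sketch: filtering a pretraining corpus against a curated
# blocklist of documents flagged by biosecurity experts. Identifiers,
# file names, and the JSONL record format are assumptions for illustration.
import json


def load_blocklist(path: str) -> set[str]:
    """Load a set of document IDs (e.g. DOIs) flagged as high-risk."""
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def curate_corpus(corpus_path: str, curated_path: str, blocklist: set[str]) -> None:
    """Copy a JSONL corpus, dropping any record whose ID is on the blocklist."""
    kept = dropped = 0
    with open(corpus_path, encoding="utf-8") as src, \
         open(curated_path, "w", encoding="utf-8") as dst:
        for line in src:
            record = json.loads(line)
            if record.get("doc_id") in blocklist:
                dropped += 1
                continue
            dst.write(line)
            kept += 1
    print(f"kept {kept} documents, removed {dropped}")


if __name__ == "__main__":
    blocklist = load_blocklist("hazardous_doc_ids.txt")
    curate_corpus("corpus.jsonl", "corpus_curated.jsonl", blocklist)
```

Any real curation pipeline would be considerably more involved; the point here is only the basic filtering step the authors describe, applied before training rather than after deployment.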

The authors also call for universal screening of synthetic DNA orders to catch dangerous sequences before they are synthesized. But not all companies in the field screen orders, and those that do may not use up-to-date databases or robust screening methods, the researchers say. Better DNA screening methods are therefore needed.
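To make the idea of sequence screening concrete, here is a deliberately simplified toy sketch that flags an order if it shares exact subsequences with a hypothetical database of sequences of concern. Real screening pipelines rely on curated databases and homology search and are far more sophisticated; every name and parameter below is an assumption for illustration.

```python
# Illustrative toy only: flag a synthesis order if it shares any exact
# k-mer (subsequence of length K) with a hypothetical set of flagged
# sequences. Real screening systems use curated databases and homology
# search, not this simplistic exact matching.
K = 31  # window length; an arbitrary choice for this sketch


def kmers(seq: str, k: int = K) -> set[str]:
    """Return all length-k windows of a DNA sequence (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(sequences_of_concern: list[str]) -> set[str]:
    """Index every k-mer that appears in any flagged sequence."""
    index: set[str] = set()
    for seq in sequences_of_concern:
        index |= kmers(seq)
    return index


def order_is_flagged(order_seq: str, index: set[str]) -> bool:
    """Flag an order if any of its k-mers matches the index."""
    return any(kmer in index for kmer in kmers(order_seq))
```

In practice, a flagged order would be escalated for human review rather than rejected automatically, and exact matching of this kind would miss deliberately altered sequences, which is one reason the researchers call for more robust screening methods.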

Summary
  • A study from Cambridge and Harvard Universities shows that large language models such as GPT-4 can make potentially dangerous knowledge, including instructions on how to develop pandemic viruses, accessible to those without formal training in the life sciences.
  • The study identifies weaknesses in the security mechanisms of current language models and shows that malicious actors can circumvent them to obtain information that could be used for mass harm.
  • As solutions, the authors propose the curation of training datasets, independent testing of new LLMs, and improved DNA screening methods to identify potentially harmful DNA sequences before they are synthesized.