UK startup turns planetary biodiversity into AI-generated drug candidates
Key Points
- Basecamp Research, working with Nvidia, Microsoft, and researchers at UPenn, Johns Hopkins, Oxford, Stanford, UC Berkeley, and CRG Barcelona, developed "Eden," a family of AI models trained on evolutionary data from over one million microbial species to generate potential gene therapies and antibiotics.
- The system designed enzymes capable of precise DNA insertions that are potentially safer than CRISPR, as well as antimicrobial peptides that proved effective against drug-resistant bacteria in 97 percent of tests.
- While the initial results show high functional hit rates, the researchers emphasize that these are early-stage candidates that still need significant optimization for stability and toxicity before they can be used clinically.
UK company Basecamp Research has developed AI models together with researchers from Nvidia and Microsoft that generate potential new therapies against cancer and multidrug-resistant bacteria from a database of over one million species.
An international research team including Nvidia, Microsoft, and the University of Pennsylvania has applied AI to a biological collection of more than one million species to generate potential new gene therapies. The AI models, named "Eden" (Environmentally-Derived Evolutionary Network), use evolutionary information from microbial samples collected worldwide by UK company Basecamp Research.
The largest model in the Eden family comprises 28 billion parameters and was trained on 9.7 trillion nucleotide tokens, according to the accompanying research paper. Notably, no human, lab, or clinical data was included in the pre-training dataset.
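To put the two reported figures in relation, the following minimal sketch computes the number of training tokens per model parameter. Only the two inputs come from the paper as cited above; the function and constant names are illustrative assumptions, and the ratio is a back-of-the-envelope derivation, not a figure the researchers report.

```python
# Figures reported for the largest Eden model (from the article).
EDEN_PARAMS = 28_000_000_000          # 28 billion parameters
EDEN_TOKENS = 9_700_000_000_000       # 9.7 trillion nucleotide tokens


def tokens_per_parameter(tokens: int, params: int) -> float:
    """Return how many training tokens were seen per model parameter."""
    return tokens / params


if __name__ == "__main__":
    ratio = tokens_per_parameter(EDEN_TOKENS, EDEN_PARAMS)
    print(f"about {ratio:.0f} nucleotide tokens per parameter")
```

By this rough measure, the largest model saw several hundred nucleotide tokens for every parameter.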
"What we're mapping here is organisms all over the planet [and] how they've evolved," John Finn, Basecamp's chief scientific officer, told the Financial Times. Machine learning models could pick out "very, very hidden relationships between all these different species and 4 billion years of evolution," he said.
AI-programmable gene insertion for thousands of diseases
The research includes what Basecamp describes as the first demonstration of AI-designed enzymes, called Large Serine Recombinases (LSRs), capable of performing precise large gene insertions in human cells.
Unlike CRISPR-based methods, which cause DNA double-strand breaks and thus carry potential risks, LSRs can insert large DNA sequences of over 30,000 base pairs without damaging the DNA. This makes them potentially safer for therapeutic applications.
The research team tested the model on disease-associated genomic loci, meaning locations in the genome linked to conditions such as muscular dystrophy (DMD), hemophilia (F9), and Fanconi anemia (FANCC). For all tested sites, Eden generated multiple active enzymes when prompted with only 30 base pairs of DNA as input. The functional hit rate was 63.2 percent.
In primary human T cells, immune cells taken directly from the body, the AI-generated enzymes achieved therapeutically relevant integration levels. The researchers inserted so-called CAR constructs, which are used in cancer therapies. Of the tested variants, 50 percent were active.
New antibiotics against multidrug-resistant pathogens
Additionally, Eden generated a library of antimicrobial peptides, short protein chains capable of killing bacteria. These target WHO critical-priority multidrug-resistant pathogens, against which conventional antibiotics no longer work.
Of the 33 tested peptides, 97 percent showed activity, with top candidates achieving single-digit micromolar potency against pathogens such as Acinetobacter baumannii. The figure is a concentration: the lower the concentration needed, the more potent the antibiotic. The researchers emphasize that, to their knowledge, "this marks the first instance a DNA foundation model has been used directly for peptide and antibiotics design with proven potency in ground-truth experiments against targets of interest."
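The article gives a percentage and a sample size but not the underlying count. As a minimal sketch, the snippet below finds the whole number of active peptides that best matches the rounded figure. The helper name is a hypothetical one chosen for illustration, and the result is an inference from the rounded numbers, not a count stated by the researchers.

```python
def nearest_count(n_tested: int, reported_percent: float) -> int:
    """Return the whole-number count whose share of n_tested is closest
    to the reported (rounded) percentage."""
    return min(
        range(n_tested + 1),
        key=lambda k: abs(100 * k / n_tested - reported_percent),
    )


if __name__ == "__main__":
    active = nearest_count(33, 97)
    print(f"{active} of 33 peptides, or {100 * active / 33:.1f} percent")
```

Under this reading, all but one of the tested peptides showed activity.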
Researchers urge caution
The paper's authors themselves point to significant limitations. While the AI-generated enzymes are "potent functional hits," they acknowledge that these designs "will require downstream optimization before becoming clinic-ready medicines." Future work will need to "integrate reinforcement learning to refine control over activity and specificity, alongside comprehensive assessment of integration efficiencies and off-target profiles in relevant human cell populations."
Regarding the antimicrobial peptides, the researchers write that "these de novo peptides exhibit potent activity" but concede that they "remain early-stage candidates requiring further optimization for stability, toxicity, and pharmacokinetic properties before clinical application." Pharmacokinetics describes how a drug is absorbed, distributed, metabolized, and excreted by the body.
Recently, Nvidia announced plans to invest $1 billion over five years in a new laboratory with Eli Lilly to accelerate AI deployment in the pharmaceutical industry.