Deepmind has introduced AlphaGenome, a new AI model designed to predict how even small changes in DNA can influence gene activity. The model focuses on the non-coding regions of DNA - stretches that do not contain direct blueprints for proteins but instead act as regulatory control centers, determining when and how genes are switched on or off. These regions make up the bulk of the human genome and have long been difficult to interpret.

AlphaGenome analyzes up to a million DNA letters in one pass. Non-coding segments account for about 98 percent of human DNA and are packed with disease-related variants, but until now they have been notoriously hard to decode.

The model predicts a range of molecular properties for every position in a DNA sequence, including where genes start and end, how much RNA is produced, and where certain proteins are likely to bind. It also identifies splice sites - points where RNA is cut and rejoined during gene expression. Errors in splicing can lead to serious disease.

AlphaGenome makes its predictions at single-base resolution, covering hundreds of cell types and tissues. Deepmind combined several AI techniques to achieve this: convolutional layers spot short DNA patterns, transformers handle long-range dependencies, and additional layers bring everything together to generate predictions.
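To make the layering concrete, here is a toy sketch of that general pattern in Python with NumPy: a convolution over one-hot encoded DNA, one self-attention step across positions, and a linear layer that emits one value per position and track. This is not AlphaGenome's actual architecture. The function names, layer sizes, and random untrained weights are all assumptions chosen for illustration.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq: str) -> np.ndarray:
    """Encode a DNA string as a (length, 4) one-hot matrix."""
    return np.eye(4)[[BASES.index(b) for b in seq]]

def conv1d(x: np.ndarray, kernels: np.ndarray) -> np.ndarray:
    """Convolution with zero padding and ReLU (odd kernel width assumed).

    x: (L, C_in), kernels: (K, C_in, C_out) -> (L, C_out).
    Each kernel responds to a short local pattern of K bases.
    """
    k = kernels.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    windows = np.stack([xp[i:i + x.shape[0]] for i in range(k)])  # (K, L, C_in)
    return np.maximum(np.einsum("klc,kco->lo", windows, kernels), 0.0)

def self_attention(h: np.ndarray, wq, wk, wv) -> np.ndarray:
    """Single-head self-attention: every position can attend to every other."""
    q, k, v = h @ wq, h @ wk, h @ wv
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return h + weights @ v                        # residual connection

def toy_forward(seq: str, n_tracks: int = 3, d: int = 8, k: int = 7,
                seed: int = 0) -> np.ndarray:
    """Hypothetical forward pass with random weights; returns (L, n_tracks)."""
    rng = np.random.default_rng(seed)
    h = conv1d(one_hot(seq), rng.normal(size=(k, 4, d)) * 0.1)
    h = self_attention(h, *(rng.normal(size=(d, d)) * 0.1 for _ in range(3)))
    return h @ (rng.normal(size=(d, n_tracks)) * 0.1)  # per-position output head

tracks = toy_forward("ACGT" * 8)
print(tracks.shape)  # (32, 3): one value per base and per track
```

The point of the sketch is the division of labor: the convolution only sees a few neighboring bases, while the attention step lets distant positions influence each other before the final layer produces per-base predictions.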

One model, many tasks

According to Deepmind, AlphaGenome outperforms the best existing models on 22 of 24 benchmarks for predictions from a single DNA sequence, and matches or exceeds specialized tools for predicting the regulatory effects of genetic variants in 24 of 26 evaluations. It's currently the only model that can forecast all tested molecular properties at once. Training data comes from large public research projects like ENCODE, GTEx, FANTOM5, and 4D Nucleome, which provide experimental data on gene regulation across different cell types.

A key feature is how efficiently AlphaGenome assesses genetic variants: it predicts each molecular property for both the mutated and the non-mutated sequence, then summarizes the difference between the two for each property. The model can also predict the location of splice junctions directly from DNA, which could help researchers study diseases caused by splicing errors.
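The compare-and-summarize idea can be sketched in a few lines of Python. The example below does not use the AlphaGenome API: `toy_predict` is a hypothetical stand-in model with two made-up tracks, the motif is arbitrary, and taking the largest absolute change per track is just one assumed way to summarize the difference.

```python
import numpy as np

MOTIF = "AACG"  # arbitrary illustrative motif, not a real binding site

def toy_predict(seq: str) -> dict[str, np.ndarray]:
    """Hypothetical stand-in for a sequence model: one track per property."""
    gc = np.array([1.0 if b in "GC" else 0.0 for b in seq])
    k = len(MOTIF)
    motif = np.array([
        sum(a == b for a, b in zip(seq[i:i + k], MOTIF)) / k
        for i in range(len(seq) - k + 1)
    ])
    return {
        # smoothed GC content, one value per base
        "gc_signal": np.convolve(gc, np.ones(5) / 5, mode="same"),
        # fraction of motif letters matched by each window
        "motif_signal": motif,
    }

def apply_snv(seq: str, pos: int, alt: str) -> str:
    """Return the sequence with the base at 0-based position pos replaced."""
    return seq[:pos] + alt + seq[pos + 1:]

def score_variant(seq: str, pos: int, alt: str,
                  predict=toy_predict) -> dict[str, float]:
    """Predict on reference and mutated sequence, then summarize per track."""
    ref_tracks = predict(seq)
    alt_tracks = predict(apply_snv(seq, pos, alt))
    return {
        name: float(np.abs(alt_tracks[name] - ref_tracks[name]).max())
        for name in ref_tracks
    }

scores = score_variant("AAAAAAAAAA", pos=5, alt="G")
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

Because scoring needs only two forward passes and a subtraction, the same routine can be applied to any model that maps a sequence to per-position tracks by passing it in as `predict`.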

Applications in disease and basic research

Deepmind says AlphaGenome could help researchers better understand the genetic roots of disease. In one example, the model analyzed a mutation seen in T-cell acute lymphoblastic leukemia (T-ALL) and correctly predicted that the mutation would create a new binding site for the MYB protein, activating the nearby cancer gene TAL1 - a known disease mechanism.

Beyond disease research, AlphaGenome could be useful in synthetic biology, such as designing DNA sequences for targeted gene regulation. It can also help pinpoint functional genome elements that control specific cell types.

Not a clinical tool - yet

For now, AlphaGenome is only available for non-commercial research via an API. Deepmind stresses that the model was not developed or validated for clinical use. It cannot fully capture complex disease processes shaped by development or environment, and its ability to predict effects from distant regulatory elements - more than 100,000 DNA bases away - is still limited.

Still, Deepmind sees room for growth: with more training data, AlphaGenome could expand to cover additional species, cell types, or molecular processes. The architecture is flexible and scalable, according to the research team.

AlphaGenome predicts how changes in non-coding DNA affect gene regulation, offering insight into regions long considered a mystery. | Image: Deepmind

Summary
  • Deepmind has introduced AlphaGenome, an AI model that predicts how changes in non-coding regions of DNA affect gene activity, offering insight into parts of the genome that have been difficult to interpret and are packed with disease-related variants.
  • AlphaGenome analyzes up to a million DNA bases in one pass, predicts a range of molecular properties at single-base resolution across hundreds of cell types, and outperforms existing models in nearly all benchmarks, using a mix of AI techniques including convolutional layers and transformers.
  • While AlphaGenome is not a clinical tool and is currently only available for non-commercial research, Deepmind says it could help in studying genetic diseases, designing synthetic DNA, and identifying functional genome elements, with potential for further expansion as more data becomes available.
Max is the managing editor of THE DECODER, bringing his background in philosophy to explore questions of consciousness and whether machines truly think or just pretend to.