Artificial intelligence can accelerate processes in diagnostics. Dr. Ansgar Lange explains the role AI plays in genomics and sequencing.
The cost of sequencing a genome is falling fast. So fast that within the last few years, the balance in genomics has shifted: for a long time, hardware and the sequencing run itself drove the price of a sequenced genome, but the main cost driver is quickly becoming the human labor needed to analyze the data.
In this article, we look at developments in AI and genetics and at a new generation of software that uses AI and machine learning to ease this analysis bottleneck in genomics. We also describe a study we recently carried out to validate the performance of AION, our platform for so-called variant interpretation.
Some background: recent developments in genetic testing and the need for AI
Genetic diseases are caused by mutations, or variants, in the human genome. To diagnose a genetic disease, a geneticist must find the causative (or pathogenic) variant. The human genome consists of about 3 billion base pairs, and causative variants can be found in any part of it. For a long time, sequencing genomes was a costly process: the first draft of a human genome, produced as part of the Human Genome Project, cost an estimated 300 million dollars.
In 2006, the cost of sequencing a human genome was estimated at 20-25 million dollars, although this was a hypothetically calculated cost. Since the first draft of the human genome, the cost of sequencing has declined steadily, with the most recent pricing, in 2022, at $200 per genome. It's not hard to imagine what this means for market demand: the steeply falling costs have led to strong growth of the Next Generation Sequencing (NGS) market, with an estimated compound annual growth rate of over 18% between 2022 and 2030. In other words, genetic testing is becoming more widely available.
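To put that growth rate in perspective, the compounding can be worked through directly. The short Python sketch below assumes a flat 18% annual rate over the eight years from 2022 to 2030. The function name is ours and purely illustrative, and because the cited estimate is "over 18%", the result should be read as a lower bound.

```python
def compound_growth_factor(annual_rate: float, years: int) -> float:
    """Return the total growth multiple after compounding annual_rate for `years` years."""
    return (1.0 + annual_rate) ** years


# Assumption: 2022 to 2030 counts as eight annual compounding steps.
factor = compound_growth_factor(0.18, 2030 - 2022)
print(f"Growth multiple after 8 years at 18% per year: {factor:.2f}x")
```

Under these assumptions, the market would be a little under four times its 2022 size by 2030.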
These developments mean that sequencing itself is no longer the main cost driver of genetic testing. Increasingly, the analysis of all this data is becoming the bottleneck for labs. Sequencing a genome or exome generates a long list of variants, many of them benign (not harmful), and each needs to be interpreted by a specialist. In simple terms, generating the data is becoming cheaper every year, whereas interpreting it, that is, finding the pathogenic variant, remains just as labor- and cost-intensive.
This is where a new generation of AI-driven software comes in. It supports the variant scientists who interpret the data and speeds up the analysis of sequenced genomes or exomes. As more and more countries introduce genetic testing at scale, for instance through newborn screening programs, AI-driven tools that address the current interpretation bottleneck are needed.
The clinical validation of AI-driven variant interpretation tools: our study
AION is one of these platforms: a variant interpretation platform for rare diseases, supported by machine learning algorithms. It aims to support variant scientists in the interpretation process that is currently the bottleneck in the rapidly growing sequencing market. Before turning to its validation study, let's briefly look at how it works.
AION comprises two main components. The first is an algorithm that classifies variants as pathogenic or benign. This is difficult because the causal impact of less than 1% of all variants is known; a larger number are so-called VUS, variants of uncertain significance. The algorithm supports categorization by extrapolating from public data sets and the latest research to identify potentially pathogenic variants.
The second component is an algorithm that matches these potentially harmful variants with the symptoms a patient experiences and ranks the most relevant variants for the human expert to review. Because the vast majority of variants are VUS, this ranking is particularly useful: it provides a direction for further clinical investigation by a human expert.
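The two steps can be illustrated with a deliberately simplified Python sketch. It is not AION's implementation: the `Variant` fields, the 0.5 threshold, the Jaccard-overlap phenotype match and the multiplication of the two scores are all assumptions chosen for illustration, and the variants and symptoms are made-up toy values.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Variant:
    variant_id: str
    pathogenicity: float              # hypothetical step-one score in [0, 1]
    gene_phenotypes: FrozenSet[str]   # phenotype terms linked to the affected gene


def phenotype_match(gene_phenotypes: FrozenSet[str],
                    patient_phenotypes: FrozenSet[str]) -> float:
    """Jaccard overlap between gene-linked and patient phenotype terms."""
    union = gene_phenotypes | patient_phenotypes
    if not union:
        return 0.0
    return len(gene_phenotypes & patient_phenotypes) / len(union)


def rank_variants(variants: List[Variant],
                  patient_phenotypes: FrozenSet[str],
                  min_pathogenicity: float = 0.5) -> List[Variant]:
    """Step 1: keep potentially pathogenic variants. Step 2: rank them by symptom match."""
    candidates = [v for v in variants if v.pathogenicity >= min_pathogenicity]
    return sorted(
        candidates,
        key=lambda v: v.pathogenicity * phenotype_match(v.gene_phenotypes, patient_phenotypes),
        reverse=True,
    )


# Toy inputs, invented for this example only.
patient = frozenset({"seizures", "hypotonia"})
variants = [
    Variant("v1", 0.90, frozenset({"seizures", "hypotonia", "ataxia"})),
    Variant("v2", 0.95, frozenset({"hearing loss"})),
    Variant("v3", 0.20, frozenset({"seizures", "hypotonia"})),  # filtered out in step 1
    Variant("v4", 0.60, frozenset({"seizures"})),
]

ranked = rank_variants(variants, patient)
print([v.variant_id for v in ranked])  # ['v1', 'v4', 'v2']
```

Note how `v2` has the highest pathogenicity score but ends up last because it does not match the patient's symptoms, while `v3` matches the symptoms perfectly but never reaches the ranking step. A real system would have to weigh such trade-offs far more carefully than this toy scoring does.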
The study aims to validate the clinical performance of our platform. To do this, Nostos Genomics' computational genomics team ran rare disease cases from the Genomics England 100,000 Genomes Project on AION and analyzed its performance. The goal was not to find new diagnoses but to measure how the tool performs on already diagnosed cases, and so to see how an AI-driven tool compares to a human expert.
The results are impressive: AION identified the causative pathogenic variant in more than 91% of cases, and in over 93% of cases when parental data was available. This means that in more than 9 out of 10 cases, the causative variant was found in the ranked list of prioritized variants, across ages and ethnicities. These results indicate that AI-driven, automated variant interpretation offers clinical performance comparable to that of a human expert. It can support the analysis of clinical genetic tests, which may reduce the time and costs that the interpretation bottleneck described earlier adds to this crucial process.
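A figure of this kind corresponds to what is often called a top-k hit rate: the share of already diagnosed cases in which the known causative variant appears among the first k entries of the tool's ranked list. The Python sketch below shows how such a metric could be computed. The cutoff `k`, the function name and the four toy cases are our assumptions for illustration; they are not the study's data or its exact evaluation protocol.

```python
from typing import List, Sequence, Tuple

# Each case pairs a tool's ranked variant IDs with the known causative variant ID.
Case = Tuple[Sequence[str], str]


def top_k_hit_rate(cases: List[Case], k: int) -> float:
    """Fraction of cases whose known causative variant is in the top k of the ranking."""
    if not cases:
        raise ValueError("at least one diagnosed case is required")
    hits = sum(1 for ranked, causative in cases if causative in ranked[:k])
    return hits / len(cases)


# Toy cases, invented for this example only.
toy_cases: List[Case] = [
    (["a", "b", "c"], "a"),  # causative variant ranked first
    (["d", "e", "f"], "f"),  # causative variant ranked third
    (["g", "h"], "x"),       # causative variant missing from the ranking
    (["i", "j", "k"], "j"),  # causative variant ranked second
]

rate_top2 = top_k_hit_rate(toy_cases, k=2)
rate_top3 = top_k_hit_rate(toy_cases, k=3)
print(f"top-2 hit rate: {rate_top2:.2f}, top-3 hit rate: {rate_top3:.2f}")
```

The choice of k matters: a larger cutoff raises the hit rate but also means the human expert has more variants to review per case.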
Future potential for AI in genetics
In the previous paragraphs, we've covered why the rapid growth of the sequencing industry makes AI-driven decision-support tools necessary, and how their performance can be validated. Further adoption of these tools can lead to wider availability of genetic testing globally. Decreasing sequencing costs can open the door for global healthcare markets to invest in rare disease diagnostics.
However, many low- and middle-income countries do not yet have the clinical genetics expertise to analyze the multitude of variants that would result from sequencing their populations. AI-driven tools must also be simple to integrate, given that many laboratories do not have the technical expertise to implement complex computational pipelines and interfaces. If they meet that requirement, AI decision-support tools can democratize genetics expertise, giving laboratories worldwide equitable access to high-quality genome interpretation.
In addition, AI algorithms enable laboratories to push the boundaries of their diagnostic practice by giving variant scientists more time to focus on complex cases. Although variant scientists supported by AI-driven interpretation can solve rare disease cases more quickly, a number of patients may not show a clear causative variant.
These complex cases are often characterized by many variants of uncertain significance (VUS). In-depth investigations into these VUS are performed using functional screens or additional familial sequencing, which may provide sufficient evidence for reclassification. By prioritizing these VUS with a pathogenicity score, AI tools add nuance to the VUS category and indicate the most valuable variants for in-depth investigation, allowing genetics departments to make the best use of their resources. The better trained and more sophisticated the algorithm, the better these rankings will be, enabling variant scientists, supported by AI, to make even more diagnoses in the future.
- Berger, Bonnie, and Yun William Yu. 2022. “Navigating Bottlenecks and Trade-Offs in Genomic Data Analysis.” Nature Reviews. Genetics, December, 1–16.
- Wetterstrand, Kris A. 2019. “The Cost of Sequencing a Human Genome.” Genome.gov. NHGRI. March 13, 2019. https://www.genome.gov/about-genomics/fact-sheets/Sequencing-Human-Genome-cost.
- “Next-Generation Sequencing Services Market Report, 2030.” n.d. Accessed January 9, 2023. https://www.grandviewresearch.com/industry-analysis/next-generation-sequencing-ngs-services-market.
- “Press Release.” n.d. Accessed January 9, 2023. https://www.illumina.com/company/news-center/press-releases/press-release-details.html?newsid=8d04df3f-d9c1-4c85-8177-6ea604627ccd.
- Richards, S., N. Aziz, S. Bale, D. Bick, S. Das, J. Gastier-Foster, W. W. Grody, et al. 2015. “Standards and Guidelines for the Interpretation of Sequence Variants: A Joint Consensus Recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology.” Genetics in Medicine: Official Journal of the American College of Medical Genetics 17 (5). https://doi.org/10.1038/gim.2015.30.