Two years after AlphaFold, researchers in CASP15 reveal the capabilities and limitations of the AI system.
In 1994, Professors John Moult and Krzysztof Fidelis founded the Critical Assessment of Protein Structure Prediction (CASP), a biennial competition to predict protein folds.
Participating research groups use a variety of methods to predict protein structures that other scientists have already determined experimentally but not yet published.
In December 2020, the CASP organizers published the results of CASP14: Deepmind's AlphaFold 2.0 predicted the structure of 70 of the nearly 100 protein sequences to be solved in the competition as accurately as experimental methods. This achievement was a breakthrough. Some researchers called the artificial intelligence a solution to the 50-year-old problem of protein folding.
CASP15 poses new challenges for artificial intelligence
The impressive 2020 success presented a challenge to biologist John Moult of the CASP team and his colleagues. "People say, ‘Oh, we don’t need CASP anymore, the problem was solved.’ And I think that’s exactly the wrong way round," Moult said, according to a report in Nature.
Instead of simply continuing to collect the best scores for a problem that has already been solved, the team added new challenges and modified old ones. These include, for example, predictions about the interactions between proteins and other molecules such as drugs, or about the different shapes some proteins can take.
AlphaFold dominates CASP15 even without Deepmind
The results of CASP15 have now been published. Two years later, AlphaFold still dominates the competition.
Deepmind itself did not participate in this round, but AlphaFold has been open source since 2021 and the most successful participants have integrated Deepmind's AI system into their approaches.
In predicting the shape of individual proteins, participating teams achieved moderate improvements in accuracy. "The accuracy is already so high that it's hard to improve on it," Moult said.
Aside from AlphaFold's proven capabilities, several teams this year also demonstrated how the AI system can be used with modifications to predict protein interactions. Compared to CASP14, systems using such AlphaFold variants have made significant improvements and are slowly approaching the accuracy of experimental methods.
AlphaFold needs to get better and language models could help
But to match the accuracy of the time-consuming experimental methods, further refinements are needed, say researchers involved. One possible source of innovation is language models like Meta's ESMFold, which predict protein structures instead of words.
In a direct comparison with AlphaFold, these methods lagged well behind, but according to researchers they could be useful for predicting how mutations change a protein's structure.
"The low-hanging fruit has been picked," says Mohammed AlQuraishi, a computational biologist at Columbia University in New York City. "Some of the next problems are going to be harder."