Predicting the pathogenicity of a (missense) mutation


Abhishek Sharma

Abhishek is currently a senior research scientist in the AI department of Illumina, Cambridge. He received his PhD in computer science from Ecole Polytechnique and his master's in Applied Math from Ecole Centrale Paris. Before that, he spent time as a researcher at EPFL, INRIA and MPI. He is particularly interested in ML breakthroughs on real-world problems with major societal impact.


Accurately predicting the effect of a (missense) mutation is one of the central problems in biology. Currently, two models dominate benchmarks for missense variant pathogenicity prediction: PrimateAI-3D and AlphaMissense [1]. Interestingly, they take very different approaches to training. While PrimateAI-3D is trained from scratch, AlphaMissense [2] follows a pretraining-finetuning strategy. Both approaches take multiple sequence alignments (MSAs) and protein 3D structures as input and use attention mechanisms in their model architectures. The goal of the project is to explore the AlphaMissense pretraining-finetuning strategy and find simpler or better alternatives. For example, [3] outlines a simpler alternative to the pretraining part (AlphaFold2).

1. PrimateAI-3D outperforms AlphaMissense in real-world cohorts (under review).

2. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Cheng et al. Science, 2023.

3. Efficient and accurate prediction of protein structure using RoseTTAFold2. Minkyung Baek, Ivan Anishchenko, Ian Humphreys, Qian Cong, David Baker, Frank DiMaio. bioRxiv, 2023.