Link to Pubmed [PMID] – 31406981
Mol. Biol. Evol. 2019 Aug;
The systematic and accurate description of protein mutational landscapes is a question of utmost importance in biology, bioengineering and medicine. Recent progress has been achieved by leveraging the increasing wealth of genomic data and by modelling inter-site dependencies within biological sequences. However, state-of-the-art methods remain time-consuming. Here, we present GEMME (www.lcqb.upmc.fr/GEMME), an original and fast method that predicts mutational outcomes by explicitly modelling the evolutionary history of natural sequences. This makes it possible to account for all positions in a sequence when estimating the effect of a given mutation. GEMME uses only a few biologically meaningful and interpretable parameters. Assessed against 50 high- and low-throughput mutational experiments, it performs overall similarly to or better than existing methods. It accurately predicts the mutational landscapes of a wide range of protein families, including viral ones and, more generally, very conserved families. Given an input alignment, it generates the full mutational landscape of a protein in a matter of minutes. It is freely available as a package and a webserver at: www.lcqb.upmc.fr/GEMME/.