21-BBSRC/NSF-BIO: Modeling of protein interactions to predict phenotypic effects of genetic mutations

Lead Research Organisation: Imperial College London
Department Name: Life Sciences

Abstract

Proteins are biological molecules that are the machinery of life. Individually a protein typically consists of hundreds of atoms that fold into a complicated 3D shape. Often proteins perform their function not in isolation but by docking to other proteins in a protein-protein complex. Changes in the DNA can result in an alteration of a few atoms in the protein (known as a missense variant), and these sometimes disrupt the structure and thus the function of the molecule. Understanding the 3D shape of docked proteins, and using this information to interpret missense variants, provides major insights into fundamental biology and, in humans, can explain diseases caused by changes to a person's genes.

Although experimental methods have revealed the structure of several protein complexes, computational approaches can markedly extend the number of models that can be examined. Furthermore, computational approaches that use the structural information can help to evaluate whether a missense variant is likely to affect the structure, and hence the function, of the protein complex.

These considerations have led to a collaboration between Imperial College London and the University of Kansas to develop a web-based resource, GWYRE. The development of GWYRE involves (i) computational prediction by Imperial of the shape of an individual protein, (ii) computational docking into a complex at Kansas, (iii) evaluation of the impact of a missense variant at Imperial and (iv) joint development and dissemination of the web-based resource GWYRE for community use.

Under this grant we will enhance the above approach. In particular, there have been major developments in machine learning that have markedly improved the computational prediction of protein structures and complexes. We will build upon these developments and also apply machine learning to improve prediction of the impact of missense variants. The user interface to GWYRE will be extended. The GWYRE database will be extended to provide information for several model organisms that are widely studied by the bioscience community.

Technical Summary

The ability to model protein-protein interactions (PPIs) is key to understanding biology at the molecular level, with structures of protein complexes being essential for the interpretation of genetic variants. The aim of this grant is to build upon our current BBSRC/NSF collaboration between Imperial College London and the University of Kansas. This collaboration has led to the development of the web-accessible GWYRE database, which provides experimental and predicted 3D structures for binary protein complexes onto which missense variants are mapped. The approach followed will be:
1) The high-throughput structural modeling of the protein interactome. We will continue development of computational approaches to the structural characterization of PPIs. The approaches will incorporate advances in the application of Deep Learning to structure prediction. We will develop an approach to integrate predictions from AlphaFold with our methods. Our modeling approach will include multimeric complexes, which are essential for characterization of the interactome and assessment of genetic variants.
2) The assessment of phenotypic effects of genetic variation. We will further develop and implement a pipeline for mapping amino acid variants onto protein structures and complexes to assess their impact on protein interactions. A new approach to predict the impact of a missense variant on protein structure, including PPIs, will be developed.
3) The development and dissemination of a public resource (GWYRE) for structural characterization of the protein interactome and the phenotypic effects of genetic variants. The resource will incorporate structurally refined protein complexes for model organisms, annotated by the phenotypic effects of missense variants. The user interface will allow search by various descriptors of the scope, biological nature, molecular mechanisms, and modeling characteristics. The resource population and updates will run at The University of Kansas (structural characterization of PPI)
