Statistical design of interactions between proteins that are both novel and specific

Lead Research Organisation: University of Cambridge
Department Name: Chemistry

Abstract

The ability of proteins to associate with each other in specific interactions is crucial to the functioning of a vast number of biological processes. In the crowded molecular soup that makes up the intracellular environment, a single protein will encounter many other proteins with which it could potentially interact. The interactions that help cells reproduce and survive may range from fleeting connections to the formation of long-lasting complexes; however, it is the specificity of these interactions that allows the exquisite levels of regulation observed in so many biological systems. I will focus on how proteins evolve specific interactions, and how we can elucidate interaction partners from the vast amounts of genomic data currently being generated.
Each and every cell is packed with proteins, and the interactions between these proteins form the foundation of almost all cellular processes. I am interested in the ways that proteins fit together to form complexes, and the constraints that complex formation imposes on the evolution of the sequences of interaction partners. If an amino acid on the surface of protein A mutates to one with a different shape, an amino acid on the surface of protein B may also have to change for B to keep binding to A; this is an example of a compensatory mutation. I will develop statistical and mathematical analyses that extract correlations between different residues of a protein of interest from alignments of orthologous and paralogous proteins. Detecting compensatory mutations in sequence data provides information about protein structure and function, and about the specificity of protein-protein interactions. This specificity is crucial to the proper functioning of cellular processes, and determining the evolutionary rules that govern it will inform the rational design of novel specific interactions between proteins.
Our ability to understand and thus engineer the molecular determinants of specificity is vital to efforts to design effective drugs and other biomolecules. There are numerous potential applications across a range of industries, for example, synthetic biology approaches to energy production and the development of biomimetic materials.
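
As a rough illustration of the kind of residue-residue correlation analysis described in the abstract, the sketch below scores every pair of alignment columns by mutual information, one of the simplest covariation statistics. It is not the method developed in this project (the outcomes recorded further down mention a Bayesian inference approach); the toy alignment, the function names and the choice of mutual information are all assumptions made purely for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length strings, one per aligned sequence.
    """
    n = len(alignment)
    pair_counts = Counter((seq[i], seq[j]) for seq in alignment)
    counts_i = Counter(seq[i] for seq in alignment)
    counts_j = Counter(seq[j] for seq in alignment)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        # p(a,b) / (p(a) * p(b)) simplifies to count * n / (count_a * count_b)
        ratio = count * n / (counts_i[a] * counts_j[b])
        mi += (count / n) * math.log2(ratio)
    return mi


def rank_covarying_pairs(alignment):
    """Return all column pairs sorted by mutual information, highest first."""
    length = len(alignment[0])
    scores = {
        (i, j): column_mutual_information(alignment, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy alignment: columns 1 and 3 always change together
# (K pairs with E, R pairs with D), mimicking a compensatory mutation.
toy_alignment = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
    "AKVE",
    "ARVD",
]

if __name__ == "__main__":
    for (i, j), score in rank_covarying_pairs(toy_alignment):
        print(f"columns {i} and {j}: {score:.3f} bits")
```

In this toy alignment the pair of columns 1 and 3 ranks first, because knowing the residue at one position fully determines the residue at the other. Real alignments would additionally need gap handling, sequence reweighting and a way of separating direct from indirect correlations, none of which this sketch attempts.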

Planned Impact

I will make every effort to disseminate both the nature and results of my work beyond the confines of the international academic community. There is likely to be a high level of interest from industry in the ability to use statistical analyses to design novel interaction specificities. The Technology Transfer Group in the MRC Head Office is directly responsible for managing the exploitation of results from the council's laboratories, and works in partnership with scientists to identify opportunities and to develop and execute exploitation strategies. I would work closely with this group to ensure the protection of intellectual property. This might involve the patent process, to secure a proprietary position and thereby enhance the value of the opportunity for a prospective industrial partner. However, this process can be long, expensive and often arduous, and so will not be embarked upon lightly. I would also be interested in embarking on industrially funded collaborations. I am also happy to ensure that novel materials produced by this research are distributed to fellow academics under an appropriate Materials Transfer Agreement, ensuring that the recipient does not exploit them commercially without reference to the originator. The MRC-LMB has extensive experience of managing and exploiting results from the laboratory, and I would seek to benefit from this expertise. For example, antibody engineering technology was pioneered at the MRC-LMB, and the method of humanization of monoclonal antibodies through engineering the proteins was subsequently protected by a patent. Start-up companies have also been founded by academics at the MRC-LMB in order to fully realize the potential of their scientific results. For example, Cambridge Antibody Technology Ltd. (CAT) has disseminated to the wider public a phage antibody screening technology that allows in vitro selection of novel antibodies from human repertoires.

Publications

 
Description Working as part of a team of international collaborators, I found that we are able to identify statistical patterns in the mutations of amino acids within protein sequences, and to use this information to predict the three-dimensional structure of the folded molecule. We use a Bayesian inference approach to analyze large sequence alignments. We also find that amino acid residues whose mutation patterns are correlated provide information about alternative conformations of individual proteins, in addition to information about residues involved in dictating the specificity of protein-protein interactions.
Exploitation Route My findings will be used by researchers both within academia and industry. I was fortunate to collaborate with a team from Roche Pharmaceuticals who are keen to use the outcomes of our research to analyze large biomedical datasets. In addition, I co-founded a small company that specializes in analyzing large, high-dimensional datasets for commercial organizations in different contexts.
Sectors Agriculture, Food and Drink; Digital/Communication/Information Technologies (including Software); Financial Services, and Management Consultancy; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology

 
Description My findings will be used by researchers both within academia and industry. I was fortunate to collaborate with a team from Roche Pharmaceuticals who are keen to use the outcomes of our research to analyze large biomedical datasets. In addition, I co-founded a small company that specializes in analyzing large, high-dimensional datasets for commercial organizations in different contexts. More recently, DeepMind researchers used findings from this EPSRC-funded research to build their AlphaFold protein structure prediction platform, which dominated the 2018 CASP protein structure prediction contest.
First Year Of Impact 2013
Sector Creative Economy; Digital/Communication/Information Technologies (including Software); Environment; Financial Services, and Management Consultancy; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology
Impact Types Societal,Economic

 
Description Marie Curie fellowship
Amount € 100,000 (EUR)
Organisation Marie Sklodowska-Curie Actions 
Sector Charity/Non Profit
Country Global
Start 01/2015 
End 12/2018
 
Title EVfold 
Description Online server that allows users to predict tertiary protein structure 
Type Of Material Improvements to research infrastructure 
Year Produced 2012 
Provided To Others? Yes  
Impact None yet noted 
URL http://evfold.org/evfold-web/evfold.do
 
Title EVfold 
Description Uses analysis of the correlation structure of large sets of protein sequences to predict tertiary protein structure from protein sequence data alone. 
Type Of Material Computer model/algorithm 
Year Produced 2011 
Provided To Others? Yes  
Impact Widely used by other research groups 
URL http://evfold.org/evfold-web/evfold.do
 
Description Predicting functionally important residues 
Organisation Princeton University
Country United States 
Sector Academic/University 
PI Contribution Original research ideas, mathematical calculations and numerical simulations.
Collaborator Contribution Research ideas, funding of research assistants, funding of laboratory experiments to test the hypotheses generated by the calculations.
Impact Publication in Genetics.
Start Year 2012
 
Description Predicting tertiary protein structure 
Organisation Harvard University
Department Harvard Medical School
Country United States 
Sector Academic/University 
PI Contribution Original research ideas, algorithms and expertise.
Collaborator Contribution Provided funding to enable other members to join the research team and to supply computational equipment and working facilities.
Impact Publications in PLoS ONE and Cell.
Start Year 2011
 
Description RMT 
Organisation Harvard University
Country United States 
Sector Academic/University 
PI Contribution Original research ideas, mathematical calculations and numerical simulations.
Collaborator Contribution Original research ideas, mathematical calculations and numerical simulations. The collaboration also received additional funding via a grant made by Roche Pharmaceuticals to Harvard University.
Impact Publication in PRX.
Start Year 2011
 
Company Name Aptamex Limited 
Description Aptamex Limited is a company that develops biological sequence informatics tools and pipelines. 
Year Established 2015 
Impact Not yet available
 
Company Name Onto.it Holdings Ltd 
Description Provides analysis of large, high-dimensional datasets for commercial organisations. 
Year Established 2014 
Impact Not yet available
Website http://www.ontoitsoftware.com