Unlocking the chemical potential of plants: Predicting function from DNA sequence for complex enzyme superfamilies
Lead Research Organisation: University College London
Department Name: Structural Molecular Biology
Technical Summary
Our strategy is to integrate powerful data-driven computational approaches with experimental investigation of enzyme function, in order to understand the functions and kingdom-specific expansion of an exemplar complex enzyme superfamily, the triterpene synthases (TTSs). The TTS superfamily is an ideal test case for our purposes, since these enzymes generate an enormous diversity of cyclized triterpene scaffolds from a single common precursor molecule. Through iterative cycles of computational and experimental investigation we aim to develop sophisticated predictive analytic approaches that relate DNA sequence to enzyme function with ever-increasing power and resolution, and in so doing to generate and test hypotheses about enzyme function, mechanisms and evolution.
Our aims are to:
(1) experimentally determine the chemical diversity encoded by diverse members of the TTS superfamily, selected on the basis of our initial CATH-FunFam classification;
(2) expand the sequence data for the CATH TTS superfamily and integrate sequence- and structure-based computational approaches, both to refine our strategies for identifying TTS features implicated in determining product specificity and for functional classification, and to test TTS function predictions;
(3) exploit a novel machine learning approach to predict known and novel TTSs;
(4) understand TTS function and diversification by determining the product specificities of natural and engineered TTS variants, guided by computational predictions from (1)-(3).
People
Christine Orengo (Principal Investigator)
Publications
Bordin N (2023) AlphaFold2 reveals commonalities and novelties in protein structure space for 21 model organisms. In: Communications Biology
Goldtzvik Y (2023) Protein diversification through post-translational modifications, alternative splicing, and gene duplication. In: Current Opinion in Structural Biology
Nallapareddy V (2023) CATHe: detection of remote homologues for CATH superfamilies using embeddings from protein language models. In: Bioinformatics (Oxford, England)