Evolutionary and thermodynamical features of musculoskeletal disease mutations in human intrinsically disordered protein regions

Lead Research Organisation: University of Sheffield
Department Name: Animal and Plant Sciences

Abstract

Many proteins are well characterized with regard to their three-dimensional structure. In particular, parts of them are known to act as "functional" units, e.g. active and binding sites or functional domains. However, proteins also contain intrinsically disordered regions, which are characterized by the lack of a well-defined three-dimensional structure. Although these disordered regions do not adopt a stable higher-order conformation, they are known to be functionally important, for example through their involvement in protein-protein interactions or DNA/RNA binding. In recent work we have shown that ongoing positive selection contributes substantially to the evolution of long intrinsically disordered regions in human proteins. Furthermore, these protein regions are enriched in post-translational modification sites as well as regions and motifs (annotated sequence stretches of biological importance). Surprisingly, however, disease mutations tend to occur much less frequently in disordered regions (Uversky et al., 2014), with the exception of mutations associated with musculoskeletal diseases.

This timely project aims to understand why disease mutations tend to be less frequent in disordered protein regions. It will focus on the exceptional group of mutations involved in musculoskeletal diseases, which are enriched in disordered protein regions, by comparing them to mutations involved in other disease groups. Fundamental to this will be a novel comparative genomic approach, currently being developed in the Gossmann lab, that targets the identification of the evolutionary properties of disease-associated sites. In collaboration with Daniel Rigden from the University of Liverpool, we will conduct molecular dynamics simulations to investigate the three-dimensional features of disease-associated mutations and their effects on protein flexibility and protein-ligand interactions. We will also exploit machine learning approaches to predict protein disorder at the level of single-residue effects. Candidate disease sites and their underlying mechanisms can then be tested functionally in a fly model in collaboration with the Mirre Simons lab at the University of Sheffield. Ultimately, this will yield insights into whether disease mutations are genuinely less likely to occur in disordered protein regions or whether our limited understanding of disease properties associated with disordered protein regions has led to under-annotation in the respective databases.

For this highly innovative, collaborative and interdisciplinary PhD, the candidate should have a strong background in biology, molecular biology and genetics, as well as basic programming knowledge or at least a strong interest in computational approaches to investigating fundamental biological problems. A background in bioinformatics is an advantage; however, all necessary approaches will be taught during the PhD. This project will take advantage of multiple biological "big" data sets, such as the 1000 Genomes Project, UniProt, large-scale mammalian phylogenies and their respective whole-genome information, as well as the PDB and several secondary databases.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
MR/S502546/1                                   01/10/2018  30/09/2023
2114913            Studentship   MR/S502546/1  01/10/2018  28/11/2022  Alex Bardill