Understanding genome function and dysfunction through the evolutionary analysis of intermolecular interactions

Lead Research Organisation: MRC Clinical Sciences Centre


The human genome contains many sites that look like they disrupt important genes and might therefore lead to disease. Yet typically, they do not. One important reason is that, looking at the genome sequence on a computer screen, we can see the entirety of genetic information, whereas inside a living cell, a particular portion of the genome can be packaged away into hard-to-access structures or covered by proteins so that the deleterious information cannot be read by the cell.
Our research aims to understand which parts of the genome are visible to the cell and which are not, and how we can use this information to discriminate mutations that only look dangerous from those that are actively harmful to the cell.
In order to see what the cell sees and thereby understand how some harmful-looking bits of DNA fail to exert their disruptive effects, we first look at where and when different proteins bind to DNA in different cell types. We then link this to genetic differences within the population to reconstruct how the presence of these binding events has allowed deleterious signals to survive in some regions of the genome but not others. As we aim to understand general principles of how this masking works, we consider not only humans but also species such as yeast and the bacterium E. coli, where we can test predictions experimentally.

Technical Summary

Our aim is to understand how intermolecular interactions inside the cell bias the incidence of mutations, affect their persistence, and ultimately shape patterns of natural variation within and between species.
Specifically, our research focuses on two main areas:
First, we study how the binding of structural chromatin proteins such as histones affects the evolution of the underlying sequence through biasing mutation and repair dynamics. For example, are some genomic regions particularly prone to mutations by virtue of their chromatin architecture? To address these issues, we combine genome-wide data on genetic variation with protein-DNA interaction data (from MNase-Seq and ChIP-Seq assays) in an evolutionary framework to characterize how genome architecture has shaped evolutionary processes over different time scales (within and between species). Importantly, by integrating further knowledge of cellular processes, such as when certain repair pathways are active, our research is geared towards providing explicit pointers to the molecular mechanisms that link chromatin organization and mutation. In the longer term, this will help us understand and eventually interfere with specific mutational processes, including those operating in cancer genomes. As we aim to understand fundamental principles of chromatin-mediated effects, we analyze data from a variety of organisms, including humans, bacteria, and archaea, in a comparative manner.

Second, complementing our studies of the biased origin of genetic novelty, we investigate the biased maintenance of genetic variants. Specifically, we focus on how intermolecular interactions can facilitate the maintenance of ostensibly deleterious states. This includes, for example, the capacity of chaperones to buffer the effect of mutations that destabilize protein structure and of RNA-binding proteins to prevent detrimental use of cryptic processing sites such as cryptic polyadenylation signals. Understanding how interactions can mask otherwise deleterious sequence signals is critical for identifying potential disease variants from a frequently vast pool of candidates, because signals that are effectively invisible to the cell (and hence not exposed to selection) can easily be misidentified as likely causal variants underlying an observed phenotypic or disease state. Our approach here mirrors our earlier strategy of combining evolutionary sequence data with interaction data (e.g. RNA-protein interaction data from CLIP-Seq experiments) to understand how binding affects the visibility of sequence signals to the cellular machinery and thus mediates whether deleterious signals can persist over time. Currently, our main focus is on characterizing the mutation buffering effect of DEAD-box helicases, a class of RNA chaperones, in E. coli. We do this through a combination of genetic manipulations (gene deletion and overexpression), competitive fitness assays (conducted in collaboration with Anita Krisko from the Mediterranean Institute for Life Sciences in Split), RNA structural modelling, and CLIP-Seq experiments.
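
As a purely illustrative sketch of the general strategy described above (combining sequence variation data with interaction data), the following Python snippet compares variant density inside and outside protein-bound intervals on a single toy sequence. It is not the project's actual pipeline: the function names, the half-open coordinate convention, and the toy data are all assumptions made for this example, and a real analysis would also need to control for factors such as base composition and mappability.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping or touching half-open [start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def variant_density(variants, bound, region_length):
    """Return variants per base pair inside and outside bound intervals.

    Assumes (hypothetically) that all coordinates are 0-based, that
    intervals are half-open and lie within [0, region_length), and that
    each variant is a single position within the region.
    """
    merged = merge_intervals(bound)
    starts = [s for s, _ in merged]
    bound_bp = sum(e - s for s, e in merged)
    unbound_bp = region_length - bound_bp

    in_bound = 0
    for pos in variants:
        # Find the last interval starting at or before this position.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < merged[i][1]:
            in_bound += 1
    in_unbound = len(variants) - in_bound

    return {
        "bound": in_bound / bound_bp if bound_bp else float("nan"),
        "unbound": in_unbound / unbound_bp if unbound_bp else float("nan"),
    }


# Toy data (invented for illustration only).
toy_bound = [(100, 250), (200, 300), (600, 700)]
toy_variants = [50, 120, 290, 300, 650, 900, 950]
toy_density = variant_density(toy_variants, toy_bound, region_length=1000)
```

A difference between the two densities would only be a starting point: on its own it cannot distinguish a mutational bias from a difference in selection, which is why the summary above stresses integrating further knowledge of cellular processes.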

Description: Imperial College Junior Research Fellowship
Amount: £150,000 (GBP)
Organisation: Imperial College London
Sector: Academic/University
Country: United Kingdom of Great Britain & Northern Ireland (UK)
Start: 10/2014
End: 09/2017

Description: Integrative Experimental and Computational Biology Studentship
Amount: £69,125 (GBP)
Organisation: Imperial College London
Sector: Academic/University
Country: United Kingdom of Great Britain & Northern Ireland (UK)
Start: 10/2014
End: 03/2018

Description: Mutation buffering in RNA
Organisation: Mediterranean Institute for Life Sciences (MedILS)
Country: Croatia, Republic of
Sector: Charity/Non Profit
PI Contribution: Sequencing of E. coli mutator strain and bioinformatic analysis of mutations
Collaborator Contribution: Conducted various experiments to characterize the phenotypic impact of mutations found in two mutator strains, including fitness assays.
Impact: Manuscript currently in revision at eLife
Start Year: 2013