Bioinformatics and Regulatory Genomics

Lead Research Organisation: University of Edinburgh

Abstract

The most basic level of the genomic landscape in human cells is made up of DNA wound around repeating arrays of nucleosome proteins, like beads on a string. These arrays are sequentially compressed and packaged into higher order structures, up to the massively compacted structure of the chromosome. The various layers of packaging, or chromatin structure, vary dynamically across the genome to regulate critical functions such as replication and the transcriptional activity of genes. Consequently, chromatin structure plays an important part in human development and in diseases such as cancers. Recent developments in high-throughput sequencing have led to huge volumes of new data, and we study these data computationally to learn more about chromatin structure and the complex cellular networks regulating genes. We want to discover how these structures and networks relate to complex genetic traits, and how they change during disease processes. We also want to understand their roles in the evolutionary history of humans and other mammals.

Technical Summary

Recent advances in high-throughput sequencing efficiency have begun to revolutionize our view of vertebrate genome structure. It is now possible to probe the intricacies of chromatin structure and function across the genome at high resolution, and to study their dynamics across development, disease and evolutionary time. We intend to strengthen our position at the intersections between comparative genomics, transcriptomics and epigenomics, exploiting the flood of new sequence data from human and model organism genomes to investigate vertebrate evolution, development and human disease. We want to understand the impact of chromatin structure on the functional repertoire and evolution of the vertebrate genome. We aim to develop novel computational approaches to high-throughput data analysis, and to provide new insights into transcriptional regulation. We will continue our established studies of promoter activities and evolution, and of chromatin structure and mutation rate. We have extended our work into novel areas such as divergence in higher order chromatin structure and the relationships between chromatin structure and complex trait loci. We are well placed to test hypotheses generated by our investigations, through established and new collaborations with experimental scientists, both within and outside the HGU and IGMM.

We have three specific research objectives.

(1) We will extend our studies of promoter function and evolution to novel analyses of promoter activity dynamics between cell types and species. We hope to discover new constellations of loci under common regulatory influences, and to define regulatory landscapes playing important roles during evolution or disease.

(2) We will explore the divergence of chromatin structure across cell types and over mammalian evolution. We will continue our work examining the interplay between structural and underlying sequence level divergence, to shed light on the variability in mutation rates found across the vertebrate genome. In parallel we will investigate the relationships between the panoply of lower level chromatin variants (histone modifications and variants, DNA methylation etc.) and higher levels of structural organization (chromatin domains, replication timing and nuclear lamina association).

(3) We will probe the relationships between chromatin structure and the genetic architecture of complex traits. We will conduct searches for heritable human sequence variants associated with allele-specific chromatin structure and seek to validate their effects on function. We will also interrogate the results of GWAS to identify examples of disease risk mediated by chromatin structure.
 
Description CSO/MRC/Scottish Enterprise joint funding of Scottish Genomes Partnership (Co-PI)
Amount £9,500,000 (GBP)
Organisation Chief Scientist Office 
Sector Public
Country United Kingdom
Start 03/2016 
End 12/2019
 
Description MRC Edinburgh-St Andrews Molecular Pathology Node (Co-PI)
Amount £2,100,000 (GBP)
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 10/2015 
End 10/2018
 
Description NHS/MRC IGMM Translational Fund/Wellcome Trust ISSF joint funding for translational bioinformatics post (PI)
Amount £150,000 (GBP)
Organisation Wellcome Trust 
Department Wellcome Trust Institutional Strategic Support Fund
Sector Charity/Non Profit
Country United Kingdom
Start 04/2016 
End 04/2019
 
Description AZ/CSO HGSOC project 
Organisation AstraZeneca
Country United Kingdom 
Sector Private 
PI Contribution I am informatics lead on this project and my group provides storage, processing and computational analyses of tumour sequencing data (WGS, RNA-seq) for high grade serous ovarian cancer, using MRC IGMM/HGU computing infrastructure. We also provide intellectual input in experimental design, statistics and manuscript writing.
Collaborator Contribution Provision of high grade serous ovarian cancer samples, generation of raw sequencing data, management/supervision, manuscript writing etc.
Impact This is a multi-disciplinary collaboration between bioinformaticists, experimental biologists and clinicians.
Start Year 2016
 
Description AZ/CSO HGSOC project 
Organisation Cancer Research UK
Department Edinburgh Cancer Research UK Centre
Country United Kingdom 
Sector Academic/University 
PI Contribution I am informatics lead on this project and my group provides storage, processing and computational analyses of tumour sequencing data (WGS, RNA-seq) for high grade serous ovarian cancer, using MRC IGMM/HGU computing infrastructure. We also provide intellectual input in experimental design, statistics and manuscript writing.
Collaborator Contribution Provision of high grade serous ovarian cancer samples, generation of raw sequencing data, management/supervision, manuscript writing etc.
Impact This is a multi-disciplinary collaboration between bioinformaticists, experimental biologists and clinicians.
Start Year 2016
 
Description FANTOM6 
Organisation RIKEN
Country Japan 
Sector Public 
PI Contribution Analysis of RNA sequencing data
Collaborator Contribution Production of RNA sequencing data
Impact Multidisciplinary: molecular biology, bioinformatics
Start Year 2015
 
Description ICGC Pan-cancer analysis of whole genomes (PCAWG) 
Organisation Ontario Institute for Cancer Research (OICR)
Country Canada 
Sector Academic/University 
PI Contribution We participated in the ICGC pan-cancer analysis of whole genomes (PCAWG) consortium, contributing novel meta-analyses of cancer mutation data. This was led primarily from the WT Sanger Institute and OICR and was a large collaboration (the main paper has ~1400 co-authors). Although very few collaborators gained anything financially from the collaboration, the datasets produced will be the 'gold standard' in cancer genomics for many years to come.
Collaborator Contribution The PCAWG consortium provides central management and access to cancer mutation and expression data.
Impact Some manuscripts are in review, but preprints are available. The main paper was published in Nature in February 2020 (PMID: 32025007). The work is inherently multi-disciplinary, involving bioinformaticians, computer scientists, cancer biologists and clinicians.
Start Year 2016
 
Description ICGC Pan-cancer analysis of whole genomes (PCAWG) 
Organisation The Wellcome Trust Sanger Institute
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution We participated in the ICGC pan-cancer analysis of whole genomes (PCAWG) consortium, contributing novel meta-analyses of cancer mutation data. This was led primarily from the WT Sanger Institute and OICR and was a large collaboration (the main paper has ~1400 co-authors). Although very few collaborators gained anything financially from the collaboration, the datasets produced will be the 'gold standard' in cancer genomics for many years to come.
Collaborator Contribution The PCAWG consortium provides central management and access to cancer mutation and expression data.
Impact Some manuscripts are in review, but preprints are available. The main paper was published in Nature in February 2020 (PMID: 32025007). The work is inherently multi-disciplinary, involving bioinformaticians, computer scientists, cancer biologists and clinicians.
Start Year 2016
 
Description Liver Cancer Evolution Consortium 
Organisation EMBL European Bioinformatics Institute (EMBL - EBI)
Country United Kingdom 
Sector Academic/University 
PI Contribution Computational analysis of genomic, epigenomic and transcriptomic data
Collaborator Contribution Generation and analysis of genomic, epigenomic and transcriptomic data; pathology; funding; supervision/management
Impact This collaboration is multidisciplinary, involving experimental and computational biologists as well as pathologists.
Start Year 2017
 
Description Liver Cancer Evolution Consortium 
Organisation German Cancer Research Center
Country Germany 
Sector Academic/University 
PI Contribution Computational analysis of genomic, epigenomic and transcriptomic data
Collaborator Contribution Generation and analysis of genomic, epigenomic and transcriptomic data; pathology; funding; supervision/management
Impact This collaboration is multidisciplinary, involving experimental and computational biologists as well as pathologists.
Start Year 2017
 
Description Liver Cancer Evolution Consortium 
Organisation Institute for Research in Biomedicine (IRB)
Country Spain 
Sector Academic/University 
PI Contribution Computational analysis of genomic, epigenomic and transcriptomic data
Collaborator Contribution Generation and analysis of genomic, epigenomic and transcriptomic data; pathology; funding; supervision/management
Impact This collaboration is multidisciplinary, involving experimental and computational biologists as well as pathologists.
Start Year 2017
 
Description Liver Cancer Evolution Consortium 
Organisation University of Cambridge
Country United Kingdom 
Sector Academic/University 
PI Contribution Computational analysis of genomic, epigenomic and transcriptomic data
Collaborator Contribution Generation and analysis of genomic, epigenomic and transcriptomic data; pathology; funding; supervision/management
Impact This collaboration is multidisciplinary, involving experimental and computational biologists as well as pathologists.
Start Year 2017