Medical and Regulatory Genomics
Lead Research Organisation:
University of Edinburgh
Department Name: UNLISTED
Abstract
The genes embedded in your genome have complex patterns of activity, and particular constellations of genes must be active in particular cells and at particular times for biological processes, such as embryonic development, to conclude successfully. Our group is interested in the fundamental biology of gene regulation: when, how and why genes are turned on and off. We advance understanding of the mechanisms underlying gene regulation, using computational analyses of datasets measuring the activities of many thousands of genes. This can provide new insights into human evolution itself, and also helps us to interpret disease processes with disrupted regulation, such as cancers and developmental disorders. Human genomes vary at millions of DNA sites between individuals, but often we do not know which variants matter most to our biology. Ultimately we want to develop predictive models, based upon our knowledge of gene regulation, that help us to forecast the effects of variants in health and disease.
Technical Summary
We use computational approaches to reveal the interdependencies between transcription, chromatin and the underlying genomic sequence, and to investigate transcriptional control mechanisms and their disruption in disease. We have an established track record in studying the roles of chromatin structure in mutational spectra, and in how tumour genomes evolve during cancer progression. We will test hypotheses addressing three broad aims.
Our first aim is to reveal new functional interactions between chromatin structure and transcriptional activity. We still lack a convincing and quantitative account of the relationships between nuclear organisation, chromatin structure and expression dynamics over time. Developing such models will have important consequences for our understanding of noncoding variation in disease.
Our second aim is to understand the roles of chromatin structure and mutational bias in the evolution of regulatory sites, within the catastrophe-strewn genomes of tumours, and during the evolution of isolated human populations. Using a novel approach we have demonstrated unexpectedly high mutational loads at active regulatory sites in most cancers, relative to matched control sites. We are extending this approach to allow us to test the hypothesis that certain site classes accumulate lower levels of mutation than expected, suggesting purifying selection in tumours. Classes showing unexpected increases or decreases of mutational load, suggesting selection in specific tumour types, may point to new candidate targets for therapeutics. In parallel we will extend our regulatory mutational load metrics to comparisons between isolated human populations (e.g. from the Shetland archipelago) and cosmopolitan populations, to detect regulatory site classes with unexpected loads that correlate with phenotypic traits. We aim to gain novel insights into the impact of rare regulatory variants on quantitative traits relevant to disease predisposition, ultimately aiding the development of ‘personalised’ medicine based upon genome sequencing.
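The comparison of mutational load at regulatory sites against matched control sites can be illustrated with a minimal sketch. This is not the group's published method: the function names, the per-base-pair rate ratio and the exact binomial test are all assumptions chosen for illustration, and the counts are made up. The direct summation is only suitable for small counts; realistic data would need a log-space or library implementation.

```python
from math import comb


def load_ratio(reg_mut, reg_bp, ctrl_mut, ctrl_bp):
    """Mutations per base pair at regulatory sites divided by the control rate."""
    return (reg_mut / reg_bp) / (ctrl_mut / ctrl_bp)


def binomial_tail(k, n, p, upper=True):
    """Exact P(X >= k) if upper, else P(X <= k), for X ~ Binomial(n, p)."""
    ks = range(k, n + 1) if upper else range(0, k + 1)
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in ks)


def load_test(reg_mut, reg_bp, ctrl_mut, ctrl_bp):
    """Return (rate ratio, p-value for excess load, p-value for depleted load).

    Hypothetical null model: every mutation lands in the regulatory set
    with probability equal to that set's share of the total base pairs.
    """
    n = reg_mut + ctrl_mut
    p = reg_bp / (reg_bp + ctrl_bp)
    ratio = load_ratio(reg_mut, reg_bp, ctrl_mut, ctrl_bp)
    p_excess = binomial_tail(reg_mut, n, p, upper=True)
    p_depleted = binomial_tail(reg_mut, n, p, upper=False)
    return ratio, p_excess, p_depleted


# Made-up counts: 30 mutations in 1,000 bp of regulatory sites,
# 10 mutations in 1,000 bp of matched control sites.
ratio, p_excess, p_depleted = load_test(30, 1000, 10, 1000)
print(ratio, p_excess, p_depleted)
```

A small excess p-value would correspond to an unexpectedly high load, and a small depletion p-value to the lower-than-expected load that the text associates with purifying selection. The sketch ignores the matching of control sites itself, which would have to be done upstream.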
Finally, we aim to explore the frequencies and roles of regulatory domain lesions in developmental disorders and cancers. Structural variants (SVs) are known to play critical roles in tumourigenesis, but there are currently no methods to reliably disentangle mutational bias from selection operating on SVs in cancers. Such methods are essential to establish which SVs host potential new therapeutic targets, and which are simply well tolerated by tumours. We are developing new models based upon experimental measurements of the mutational spectra in relevant cell types, to estimate the expected spectra in tumours, and rigorously infer candidate breakpoints that may be under selection.
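The idea of estimating an expected breakpoint spectrum from measured mutational susceptibility, then flagging regions that exceed it, can be sketched as follows. Everything here is an assumption for illustration: the region names, the susceptibility weights, the Poisson null and the threshold are hypothetical and are not taken from the project's models.

```python
from math import exp, factorial


def expected_counts(susceptibility, total_breakpoints):
    """Distribute the total breakpoint count in proportion to susceptibility."""
    total_weight = sum(susceptibility.values())
    return {
        region: total_breakpoints * weight / total_weight
        for region, weight in susceptibility.items()
    }


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), by direct summation (small k only)."""
    if k <= 0:
        return 1.0
    return 1.0 - sum(exp(-lam) * lam**i / factorial(i) for i in range(k))


def excess_regions(observed, susceptibility, alpha=0.01):
    """Regions whose observed breakpoints exceed the mutational expectation.

    Simplifications: the total is taken from the observed data, and no
    multiple-testing correction is applied.
    """
    expected = expected_counts(susceptibility, sum(observed.values()))
    flagged = {}
    for region, obs in observed.items():
        p_value = poisson_upper_tail(obs, expected[region])
        if p_value < alpha:
            flagged[region] = (obs, expected[region], p_value)
    return flagged


# Made-up inputs: relative break susceptibility and observed breakpoints.
susceptibility = {"region_A": 1.0, "region_B": 1.0, "region_C": 2.0}
observed = {"region_A": 20, "region_B": 5, "region_C": 15}
print(excess_regions(observed, susceptibility))
```

Regions flagged this way carry more breakpoints than mutational bias alone would predict under the assumed null, which makes them candidates for selection rather than proof of it.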
Organisations
- University of Edinburgh (Lead Research Organisation)
- Cancer Research UK (Collaboration)
- AstraZeneca (Collaboration)
- EMBL European Bioinformatics Institute (EMBL - EBI) (Collaboration)
- RIKEN (Collaboration)
- German Cancer Research Center (Collaboration)
- Ontario Institute for Cancer Research (OICR) (Collaboration)
- NHS Lothian (Collaboration)
- The Wellcome Trust Sanger Institute (Collaboration)
- University of Cambridge (Collaboration)
- Institute for Research in Biomedicine (IRB) (Collaboration)
People
Colin Semple (Principal Investigator)
Publications

Aitken S
(2019)
Pervasive lesion segregation shapes cancer genome evolution


Aitken S
(2020)
Pervasive lesion segregation shapes cancer genome evolution


Aitken SJ
(2020)
Pervasive lesion segregation shapes cancer genome evolution.
in Nature

Anderson C
(2024)
Strand-resolved mutagenicity of DNA damage and repair
in Nature

Ballinger TJ
(2019)
Modeling double strand break susceptibility to interrogate structural variation in cancer.
in Genome biology

Bolado-Carrancio A
(2021)
ISGylation drives basal breast tumour progression by promoting EGFR recycling and Akt signalling.
in Oncogene

Bonneau M
(2021)
Functional brain defects in a mouse model of a chromosomal t(1;11) translocation that disrupts DISC1 and confers increased risk of psychiatric illness.
in Translational psychiatry

Ewing A
(2018)
Breaking point: the genesis and impact of structural variation in tumours.
in F1000Research
Description | Member of committee and writing group for SSAC report for Scottish Government on 'The Future of Genomics in Scotland' |
Geographic Reach | National |
Policy Influence Type | Participation in a guidance/advisory committee |
URL | https://www.scottishscience.org.uk/sites/default/files/article-attachments/Genomics%20Full%20Report.... |
Description | CSO/MRC/Scottish Enterprise joint funding of Scottish Genomes Partnership (Co-PI) |
Amount | £9,500,000 (GBP) |
Organisation | Chief Scientist Office |
Sector | Public |
Country | United Kingdom |
Start | 03/2016 |
End | 12/2019 |
Description | Defining the genetic and transcriptomic landscape of canine oral melanoma |
Amount | £115,000 (GBP) |
Organisation | The Kennel Club Charitable Trust |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 08/2021 |
End | 09/2024 |
Description | Genomic drivers and novel treatment strategies in low grade serous ovarian cancer |
Amount | £190,000 (GBP) |
Organisation | Target Ovarian Cancer |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 03/2019 |
End | 02/2021 |
Description | Precision medicine and the mutational landscape of high grade serous ovarian cancer |
Amount | £258,756 (GBP) |
Funding ID | MR/R026017/1 |
Organisation | Medical Research Council (MRC) |
Sector | Public |
Country | United Kingdom |
Start | 02/2018 |
End | 01/2021 |
Description | AZ/CSO HGSOC project |
Organisation | AstraZeneca |
Country | United Kingdom |
Sector | Private |
PI Contribution | I am informatics lead on this project and my group provides storage, processing and computational analyses of tumour sequencing data (WGS, RNA-seq) for high grade serous ovarian cancer, using MRC IGMM/HGU computing infrastructure. We also provide intellectual input in experimental design, statistics and manuscript writing. |
Collaborator Contribution | Provision of high grade serous ovarian cancer samples, generation of raw sequencing data, management/supervision, manuscript writing etc. |
Impact | This is a multi-disciplinary collaboration between bioinformaticists, experimental biologists and clinicians |
Start Year | 2016 |
Description | AZ/CSO HGSOC project |
Organisation | Cancer Research UK |
Department | Edinburgh Cancer Research UK Centre |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | I am informatics lead on this project and my group provides storage, processing and computational analyses of tumour sequencing data (WGS, RNA-seq) for high grade serous ovarian cancer, using MRC IGMM/HGU computing infrastructure. We also provide intellectual input in experimental design, statistics and manuscript writing. |
Collaborator Contribution | Provision of high grade serous ovarian cancer samples, generation of raw sequencing data, management/supervision, manuscript writing etc. |
Impact | This is a multi-disciplinary collaboration between bioinformaticists, experimental biologists and clinicians |
Start Year | 2016 |
Description | FANTOM6 |
Organisation | RIKEN |
Country | Japan |
Sector | Public |
PI Contribution | Analysis of RNA sequencing data |
Collaborator Contribution | Production of RNA sequencing data |
Impact | Multidisciplinary: molecular biology, bioinformatics |
Start Year | 2015 |
Description | ICGC Pan-cancer analysis of whole genomes (PCAWG) |
Organisation | Ontario Institute for Cancer Research (OICR) |
Country | Canada |
Sector | Academic/University |
PI Contribution | We participated in the ICGC pan-cancer analysis of whole genomes (PCAWG) consortium, contributing novel meta-analyses of cancer mutation data. This was led primarily from the WT Sanger Institute and OICR and was a large collaboration (the main paper has ~1400 co-authors). Although very few collaborators gained anything financially from the collaboration the datasets produced will be the 'gold standard' in cancer genomics for many years to come. |
Collaborator Contribution | The PCAWG consortium provides central management and access to cancer mutation and expression data. |
Impact | Some manuscripts are in review, but preprints are available. The main paper was published in Nature in February 2020 (PMID: 32025007). The work is inherently multi-disciplinary, involving bioinformaticians, computer scientists, cancer biologists and clinicians. |
Start Year | 2016 |
Description | ICGC Pan-cancer analysis of whole genomes (PCAWG) |
Organisation | The Wellcome Trust Sanger Institute |
Country | United Kingdom |
Sector | Charity/Non Profit |
PI Contribution | We participated in the ICGC pan-cancer analysis of whole genomes (PCAWG) consortium, contributing novel meta-analyses of cancer mutation data. This was led primarily from the WT Sanger Institute and OICR and was a large collaboration (the main paper has ~1400 co-authors). Although very few collaborators gained anything financially from the collaboration the datasets produced will be the 'gold standard' in cancer genomics for many years to come. |
Collaborator Contribution | The PCAWG consortium provides central management and access to cancer mutation and expression data. |
Impact | Some manuscripts are in review, but preprints are available. The main paper was published in Nature in February 2020 (PMID: 32025007). The work is inherently multi-disciplinary, involving bioinformaticians, computer scientists, cancer biologists and clinicians. |
Start Year | 2016 |
Description | Liver Cancer Evolution Consortium |
Organisation | EMBL European Bioinformatics Institute (EMBL - EBI) |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Computational analysis of genomic, epigenomic and transcriptomic data |
Collaborator Contribution | Generation and analysis of genomic, epigenomic and transcriptomic data; pathology; funding; supervision/management |
Impact | This collaboration is multidisciplinary, involving experimental and computational biologists as well as pathologists. |
Start Year | 2017 |
Description | Liver Cancer Evolution Consortium |
Organisation | German Cancer Research Center |
Country | Germany |
Sector | Academic/University |
PI Contribution | Computational analysis of genomic, epigenomic and transcriptomic data |
Collaborator Contribution | Generation and analysis of genomic, epigenomic and transcriptomic data; pathology; funding; supervision/management |
Impact | This collaboration is multidisciplinary, involving experimental and computational biologists as well as pathologists. |
Start Year | 2017 |
Description | Liver Cancer Evolution Consortium |
Organisation | Institute for Research in Biomedicine (IRB) |
Country | Spain |
Sector | Academic/University |
PI Contribution | Computational analysis of genomic, epigenomic and transcriptomic data |
Collaborator Contribution | Generation and analysis of genomic, epigenomic and transcriptomic data; pathology; funding; supervision/management |
Impact | This collaboration is multidisciplinary, involving experimental and computational biologists as well as pathologists. |
Start Year | 2017 |
Description | Liver Cancer Evolution Consortium |
Organisation | University of Cambridge |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Computational analysis of genomic, epigenomic and transcriptomic data |
Collaborator Contribution | Generation and analysis of genomic, epigenomic and transcriptomic data; pathology; funding; supervision/management |
Impact | This collaboration is multidisciplinary, involving experimental and computational biologists as well as pathologists. |
Start Year | 2017 |
Description | NHS/MRC HGU Genomic Data Analysis Centre |
Organisation | NHS Lothian |
Country | United Kingdom |
Sector | Public |
PI Contribution | We provide bioinformatics analysis services to the NHS Lothian Molecular Genetics service - which is the main centre for genetic testing in SE Scotland. The aim is to provide candidate variants for the diagnosis of developmental disorders based upon trio whole exome sequencing data. |
Collaborator Contribution | Our NHS collaborators collect relevant samples and generate trio whole exome sequencing data. |
Impact | This collaboration is multidisciplinary and involves NHS clinical staff, NHS laboratory staff and MRC HGU bioinformatics staff. |
Start Year | 2020 |
Description | Edinburgh University Public Seminar Series: "Know your enemy: unlocking the secrets of the tumour genome" |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Public/other audiences |
Results and Impact | The talk was given in collaboration with Prof Charlie Gourley in the university 'Let's Talk About Health' public lecture series, and focused on cancer genomics. Up to 100 people attended including parties from local secondary schools. |
Year(s) Of Engagement Activity | 2019 |