A Multi-Scale Approach to Classifying DNA Damage Repair Deficiencies and their Therapeutic Relevance
Lead Research Organisation:
University College London
Department Name: Pathology
Abstract
DNA damage is at the centre of cancer development and evolution. Owing to the uncontrolled proliferation that characterises cancer, initial driver mutations provide enhanced growth potential, leading to additional DNA damage [1]. Consequently, cancer cells rely on DNA damage repair (DDR) processes to prevent accumulated damage from causing catastrophic genomic instability [2]. This provides substantial therapeutic opportunities, whereby synthetically lethal treatments are applied to tumours with specific DDR deficiencies, inducing cancer cell death [3,4].
Signals of DDR deficiencies are conserved within the cancer genome as genomic alterations, such as single base substitutions (SBS). These alterations are not distributed randomly across the genome but follow specific patterns which can be described as 'mutational signatures' [5]. A number of signatures have been associated with deficiencies in specific DDR processes, providing insight into a cancer's intrinsic vulnerabilities. Work performed during the PhD rotation involved calculating enrichment of known SBS signatures in primary tumour samples obtained from The Cancer Genome Atlas, followed by clustering to identify samples with similar DDR deficiency profiles.
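As an illustration of what "calculating enrichment of known SBS signatures" can mean in practice, the sketch below fits non-negative exposures of fixed signatures to one sample's mutation counts using multiplicative updates. This is not necessarily the method used in the rotation work: the four-channel signatures, the counts and the function names are toy assumptions (real SBS signatures are typically defined over 96 trinucleotide contexts).

```python
def fit_exposures(counts, signatures, n_iter=500):
    """Estimate non-negative exposures e so that sum_j e[j] * signatures[j] ~ counts.

    Uses multiplicative updates for least squares with the signatures held
    fixed. Toy sketch: assumes each signature is a list of channel
    probabilities with the same length as `counts`.
    """
    n_channels = len(counts)
    k = len(signatures)
    exposures = [sum(counts) / k] * k  # start from an even split
    for _ in range(n_iter):
        # Reconstructed counts under the current exposures.
        recon = [
            sum(exposures[j] * signatures[j][c] for j in range(k))
            for c in range(n_channels)
        ]
        updated = []
        for j in range(k):
            num = sum(signatures[j][c] * counts[c] for c in range(n_channels))
            den = sum(signatures[j][c] * recon[c] for c in range(n_channels))
            updated.append(exposures[j] * num / den if den > 0 else 0.0)
        exposures = updated
    return exposures


def enrichment(exposures):
    """Convert exposures to the proportion of mutations assigned to each signature."""
    total = sum(exposures)
    return [e / total for e in exposures] if total else [0.0] * len(exposures)


# Hypothetical signatures over four made-up mutation channels.
SIG_A = [0.7, 0.1, 0.1, 0.1]
SIG_B = [0.1, 0.1, 0.1, 0.7]

# Synthetic sample built as 30 mutations from SIG_A plus 10 from SIG_B.
sample_counts = [22, 4, 4, 10]
sample_exposures = fit_exposures(sample_counts, [SIG_A, SIG_B])
sample_enrichment = enrichment(sample_exposures)
```

The per-sample enrichment vectors produced this way are the kind of input that a subsequent clustering step could group into DDR deficiency profiles.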
Following patient stratification, we analysed expression profiles to shed light on the DDR processes driving signature formation. We developed a random forest classifier using RNA-seq data for genes with known DDR functionality to classify samples based on DDR deficiencies. Mismatch repair (MMR) deficiency was predicted with high accuracy, and MLH1 was found to have considerable importance in MMR classification. Additionally, SPO11 and PRKCG contributed significantly to correct classification, despite having no known role in MMR. Further research is required to elucidate the functional pathways triggering the formation of these signatures.
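To make the idea of gene importance in a tree ensemble concrete, the sketch below bags single-split decision stumps and credits each gene with the Gini impurity decrease of the splits it wins. It is a much-reduced stand-in for a random forest (no deep trees, no per-split feature subsampling) and is not the classifier described above. The expression values are synthetic, and GENE_X is a hypothetical uninformative gene.

```python
import random


def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_stump(rows, labels, genes):
    """Return (gene, impurity decrease) for the best single-threshold split."""
    parent = gini(labels)
    n = len(labels)
    best_gene, best_gain = None, 0.0
    for gene in genes:
        for threshold in sorted({row[gene] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[gene] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[gene] > threshold]
            if not left or not right:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            gain = parent - child
            if gain > best_gain:
                best_gene, best_gain = gene, gain
    return best_gene, best_gain


def stump_importance(rows, labels, genes, n_stumps=50, seed=0):
    """Normalised impurity-decrease importance over bootstrapped stumps."""
    rng = random.Random(seed)
    totals = {gene: 0.0 for gene in genes}
    n = len(rows)
    for _ in range(n_stumps):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        gene, gain = best_stump(
            [rows[i] for i in idx], [labels[i] for i in idx], genes
        )
        if gene is not None:
            totals[gene] += gain
    total = sum(totals.values())
    return {g: (v / total if total else 0.0) for g, v in totals.items()}


# Synthetic toy data: label 1 = MMR-deficient, with low MLH1 expression.
genes = ["MLH1", "GENE_X"]
rows = [
    {"MLH1": 0.5, "GENE_X": 3.0}, {"MLH1": 0.8, "GENE_X": 7.0},
    {"MLH1": 0.4, "GENE_X": 5.0}, {"MLH1": 0.9, "GENE_X": 4.0},
    {"MLH1": 6.1, "GENE_X": 3.5}, {"MLH1": 5.7, "GENE_X": 6.5},
    {"MLH1": 6.4, "GENE_X": 4.5}, {"MLH1": 5.9, "GENE_X": 5.5},
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
importance = stump_importance(rows, labels, genes)
```

In this toy setting MLH1 separates the classes perfectly, so it collects the importance. With real RNA-seq data the ranking would depend on the cohort and should be checked with held-out samples.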
The aim of the PhD will be to expand this work by performing a multi-scale integration of mutational, expression and tumour architecture signals to infer the underlying mechanisms of DDR deficiency in cancer and identify new therapeutic opportunities. To achieve this, we will consider additional genomic features, such as copy number variants [6] and indels [7], and experiment with alternative clustering and dimensionality reduction techniques for patient stratification.
Another method for enhancing these findings will be to apply executable models. These are computational state-transition networks that model biological systems and can be applied to efficiently study the effects of perturbations, such as mutations, within the context of an overall system, such as cancer [8]. Core networks can be formed using single-cell gene expression data by converting profiles into a binary format and identifying logical changes that drive movement between various states [9,10]. By applying these techniques to single-cell expression profiles of cancer samples, grouped by DDR deficiencies, we aim to identify genes and networks whose expression drives movement between these groups, and can therefore be functionally implicated in DDR.
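The two ingredients described above, binarising expression and stepping a state-transition network, can be sketched as follows. This is a minimal illustration under stated assumptions: the per-gene median cut-off, the synchronous update scheme, and the three-node network (SENSOR, INHIBITOR, REPAIR) with its rules are hypothetical and are not taken from the cited methods.

```python
from statistics import median


def binarise(expression):
    """Map each gene's expression values (one per cell) to 0/1.

    Assumption: a value counts as 'on' if it exceeds that gene's median.
    """
    states = {}
    for gene, values in expression.items():
        cutoff = median(values)
        states[gene] = [1 if v > cutoff else 0 for v in values]
    return states


# Hypothetical Boolean rules: each gene's next state from the current state.
RULES = {
    "SENSOR": lambda s: s["SENSOR"],
    "INHIBITOR": lambda s: s["INHIBITOR"],
    "REPAIR": lambda s: s["SENSOR"] and not s["INHIBITOR"],
}


def step(state, rules, knockouts=()):
    """Synchronous update: every gene is recomputed from the same old state."""
    nxt = {gene: int(bool(rule(state))) for gene, rule in rules.items()}
    for gene in knockouts:
        nxt[gene] = 0  # a knocked-out gene is held off
    return nxt


def run_to_attractor(state, rules, knockouts=(), max_steps=50):
    """Iterate until a state repeats; return the repeating states (the attractor)."""
    state = dict(state)
    for gene in knockouts:
        state[gene] = 0
    seen = []
    for _ in range(max_steps):
        if state in seen:
            return seen[seen.index(state):]
        seen.append(state)
        state = step(state, rules, knockouts)
    return seen[-1:]


start = {"SENSOR": 1, "INHIBITOR": 0, "REPAIR": 0}
wild_type = run_to_attractor(start, RULES)
sensor_knockout = run_to_attractor(start, RULES, knockouts=("SENSOR",))
binary_cells = binarise({"G": [1, 5, 9, 2]})
```

Comparing the attractor reached with and without a knockout is one simple way to read off the effect of a perturbation within the modelled system.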
This project will involve integration of multi-omics data from bulk and single-cell pan-cancer cohorts, available publicly and from collaborators, to identify evidence of DDR deficiencies and shed light on the molecular processes driving them. Ultimately, these analyses will help in identifying cellular vulnerabilities that can be exploited for treatment selection, improved treatment efficacy, and biomarker identification.
Organisations
People | ORCID iD |
---|---|
Jasmin Fisher (Primary Supervisor) | |
Studentship Projects
Project Reference | Relationship | Related To | Start | End | Student Name |
---|---|---|---|---|---|
MR/N013867/1 | | | 30/09/2016 | 29/09/2025 | |
2227532 | Studentship | MR/N013867/1 | 30/09/2019 | 26/11/2023 | |