Applying machine learning models to genome data to understand the evolution of drug resistance from virus to cancer evolution

Lead Research Organisation: University of Glasgow
Department Name: College of Medical, Veterinary, Life Sci

Abstract

Studentship strategic priority area: Mathematics, statistics and computation
Keywords: Virus evolution, cancer evolution, machine learning, genomics

Treatment of cancer and chronic infectious diseases often fails due to the evolution of resistance to therapy. Underpinning this phenomenon is the generation of changes (mutations) in their genetic material. Mutations generate high levels of diversity in the genomes of cancer cells or intra-patient virus populations, which gives them the ability to evolve in response to drugs. Recent advances in genome sequencing have revealed genomic alterations that drive cancer progression and pathogen infection. These data give insight into the diseases' underlying evolutionary dynamics, which follow predictions of both Darwin's theory of evolution and Motoo Kimura's theory of molecular evolution. Yet how evolutionary dynamics interact with mutational processes, and whether these processes can predict clinical outcome, remain largely unknown. Due to the variety and complexity of genomic alterations observed across human, cancer and virus evolution, unified mathematical equations of evolution are often intractable. We propose to apply state-of-the-art machine learning methods to large-scale genome sequencing data sets to build biologically informed, data-driven models of evolutionary dynamics. These models permit efficient data analysis that accounts for this variety and complexity of genomic alterations. They will infer the life histories of disease processes and predict disease progression and the effects of interventions. Early prediction of resistance to therapies is essential to maximising the potency of interventions and switching treatments when necessary. The student will be trained in a combination of data science and bioinformatics, with substantial elements of computation, programming and statistics/machine learning.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
MR/N013166/1                                   01/10/2016  30/09/2025
2453134            Studentship   MR/N013166/1  16/09/2020  15/03/2024  Kieran Lamb