Transfer learning of pharmacogenomic information across disease types and preclinical models for drug sensitivity prediction.

Lead Research Organisation: University of Sheffield
Department Name: Neurosciences

Abstract

The failure rate for new drugs entering the clinic is in excess of 90%, with more than a quarter of drugs failing due to lack of efficacy. Historically, treatment decisions for complex diseases like lung cancer considered a small number of patient factors and led to a fixed treatment regimen for all patients, resulting in severe drug side effects for some and highly variable outcomes. Recently, personalised treatments have become popular through the discovery and use of genetic markers that can explain a patient's response to a drug. If the goal of personalised medicine is to give the right drug to the right patient, we may be able to combine pharmacogenomics with machine learning to help make better treatment decisions.

Because testing ineffective drugs on patient cells and animal models in the laboratory is potentially wasteful, we are motivated to use machine learning to predict drug response from a limited number of experiments. We and many others in drug development have used computational methods to learn from drug responses measured in vitro and provide evidence for clinical trials. However, existing machine learning methods perform poorly at predicting drug response in disease types with a limited number of samples. This situation arises often for rare cancers and for other diseases like motor neurone disease (also known as ALS), because there are few patients or their samples are difficult to collect. Overcoming this limitation by extending machine learning to learn across different disease contexts would reduce the time-consuming step of gathering biological resources and so accelerate drug development.

In this project, we will develop machine learning algorithms that take into account all of the dose-response data we have for each drug, even when it has been tested in only a few samples. To overcome the shortage of training cases in a disease, we will develop a transfer learning framework that uses knowledge from other diseases with more drug response data to address the problem in the disease with less data. The algorithms will be developed and tested in five stages: 1) develop a learning model that maps genomic information to drug response in both the disease with more data and the disease with limited data; 2) develop an inference model for predicting drug response in the disease with limited data; 3) apply the learning and inference models to predict drug response in bladder cancer from the relationships between genomic features and drug sensitivity learned in lung cancer; 4) learn from drug responses in cell lines and predict response in mouse tumour models; 5) learn and predict biomarkers that describe a particular drug's sensitivity in both lung cancer and motor neurone disease. Genomic information will be used as the input to the prediction algorithms because it can be reliably measured in the laboratory and in the clinic. The prediction test cases increase in difficulty, and successes in transferring pharmacogenomic information between diseases will highlight opportunities for scientists to leverage existing data sets when testing a drug in a new disease.
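
As a purely illustrative sketch of the idea of transferring knowledge from a data-rich disease to a data-poor one, the snippet below uses ridge regression whose weights are shrunk towards a model fitted on the larger data set. This is not the project's method; it is a simple stand-in technique (often called biased regularisation). All data are synthetic, and the names `fit_ridge`, `X_src`, `X_tgt` and the sample sizes are assumptions made for the example.

```python
import numpy as np


def fit_ridge(X, y, lam=1.0, prior=None):
    """Ridge regression that shrinks the weights towards `prior`.

    Minimises ||X w - y||^2 + lam * ||w - prior||^2.
    With prior=None this is ordinary ridge regression (shrinkage to zero).
    """
    n_features = X.shape[1]
    if prior is None:
        prior = np.zeros(n_features)
    residual = y - X @ prior  # what the prior model fails to explain
    A = X.T @ X + lam * np.eye(n_features)
    return prior + np.linalg.solve(A, X.T @ residual)


# Synthetic stand-ins (assumption): rows are samples, columns are genomic
# features, and y is a drug response summary such as log(IC50).
rng = np.random.default_rng(0)
n_features = 20
w_source_true = rng.normal(size=n_features)
# The target disease is assumed to share most of its biology with the source.
w_target_true = w_source_true + 0.1 * rng.normal(size=n_features)

X_src = rng.normal(size=(500, n_features))  # disease with more data
y_src = X_src @ w_source_true + 0.1 * rng.normal(size=500)
X_tgt = rng.normal(size=(8, n_features))    # disease with limited data
y_tgt = X_tgt @ w_target_true + 0.1 * rng.normal(size=8)
X_test = rng.normal(size=(200, n_features))  # held-out target samples
y_test = X_test @ w_target_true

# Learn on the source disease, then adapt to the target disease.
w_src = fit_ridge(X_src, y_src)
w_transfer = fit_ridge(X_tgt, y_tgt, lam=1.0, prior=w_src)
w_scratch = fit_ridge(X_tgt, y_tgt, lam=1.0)  # baseline: target data only


def mse(w):
    """Mean squared prediction error on the held-out target samples."""
    return float(np.mean((X_test @ w - y_test) ** 2))


mse_transfer = mse(w_transfer)
mse_scratch = mse(w_scratch)
print("target-only MSE:", mse_scratch)
print("transfer MSE:   ", mse_transfer)
```

In this toy setting the transferred model benefits only because the two synthetic diseases were constructed to share most of their feature weights; how far that assumption holds between real diseases or preclinical models is exactly what the staged test cases above are designed to probe.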

We are conducting this interdisciplinary study as a team of computer scientists, clinicians and cell biologists with expertise in machine learning, cancer and neuroscience. The end goal is to develop a suite of flexible software tools that the drug development community can readily use to apply transfer learning to many different problems.

Publications

 
Description Scientific Committee for Multiple Long Term Conditions (NIHR and MRC)
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
Impact Launched a unit within the Turing Institute to support data aggregation and training initiatives that encourage researchers to share data on treating multiple diseases. Research groups highlighted difficulties with drug coding, and my group and the Turing Institute supported them with this.
 
Description 100,000 Genomes Project 
Organisation Genomics England
Country United Kingdom 
Sector Public 
PI Contribution I have been a research consultant and a member of the Genomics England Clinical Interpretation Partnership (GeCIP) for the neurology, cancer and bioinformatics domains. My team has helped to assess the quality of variant identification by Genomics England in the 100,000 Genomes Project.
Collaborator Contribution Genomics England has provided >100,000 clinical-grade whole genomes across various diseases and a computing platform (Research Environment) to enable us to conduct analyses.
Impact We have analysed systematic sequencing biases in clinical whole genomes that would affect variants used for genetic diagnosis. This has been published in Freeman et al. https://genome.cshlp.org/content/early/2020/03/10/gr.255349.119
Start Year 2019
 
Description Genotype of Urothelial cancer: Stratified Treatment and Oncological outcomes (GUSTO): Phase II study. 
Organisation Leeds Teaching Hospitals NHS Trust
Country United Kingdom 
Sector Public 
PI Contribution Bioinformatics lead for a Phase II clinical trial using genomic characterisation of bladder cancer to determine treatment. I advised on the trial design and worked with the commercial company doing algorithmic diagnosis.
Collaborator Contribution My partners launched the clinical trial from Sheffield and Leeds Teaching Hospitals. They managed the recruitment of study participants, governance, ethics, etc.
Impact Contracts and collaboration agreements signed with AstraZeneca and Veracyte for drugs and diagnostics.
Start Year 2021
 
Description Genotype of Urothelial cancer: Stratified Treatment and Oncological outcomes (GUSTO): Phase II study. 
Organisation Sheffield Teaching Hospital
Country United Kingdom 
Sector Hospitals 
PI Contribution Bioinformatics lead for a Phase II clinical trial using genomic characterisation of bladder cancer to determine treatment. I advised on the trial design and worked with the commercial company doing algorithmic diagnosis.
Collaborator Contribution My partners launched the clinical trial from Sheffield and Leeds Teaching Hospitals. They managed the recruitment of study participants, governance, ethics, etc.
Impact Contracts and collaboration agreements signed with AstraZeneca and Veracyte for drugs and diagnostics.
Start Year 2021