The Automatic BioData Scientist: Neural approaches for stochastic disease modelling

Lead Research Organisation: University of Birmingham
Department Name: Cancer Sciences

Abstract

The Automatic BioData Scientist: Neural approaches for stochastic disease modelling

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/R513167/1                                   01/10/2018  30/09/2023
2278917            Studentship   EP/R513167/1  30/09/2019  29/09/2022  Dominic Danks
 
Description The aim of this project at its outset was to investigate and develop statistical and machine learning methods for disease progression modelling in a broad sense. This has been realised by undertaking three methodologically related pieces of work.

The first of these is published as "BasisDeVAE: Interpretable Simultaneous Dimensionality Reduction and Feature-Level Clustering with Derivative-Based Variational Autoencoders" in the Proceedings of the 38th International Conference on Machine Learning (ICML 2021). This work proposes a method relevant to the setting in which one is provided with cross-sectional data (i.e. measurements without associated time-values) from some temporal process and wishes to order the observations, in turn understanding the dynamics of the process. In a disease modelling context, this could correspond to examining biomarker measurements from a large number of patients and aiming to learn how the biomarkers change as the disease progresses. In many cases, biomarkers will behave in one of a few ways, e.g. increase over time, decrease over time, or increase and then decrease over a certain time window. Our method targets this particular situation and represents a way to both learn the overall dynamics of the process and to group biomarkers into their appropriate behaviour class, thus assisting practitioners when interpreting results, particularly when the cross-sectional data is high-dimensional, e.g. in a single-cell sequencing setting.
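To make the behaviour classes above concrete, the following is a minimal illustrative sketch, not the BasisDeVAE implementation. It assumes the observations have already been placed in disease-time order (the published method learns that ordering and the grouping jointly), and it labels a single biomarker trajectory by the sign pattern of its finite differences. The function name `classify_trajectory`, the tolerance `tol` and the label strings are hypothetical choices made for this example.

```python
def classify_trajectory(values, tol=1e-9):
    """Label one biomarker trajectory whose values are already ordered in time.

    Illustrative only: uses the signs of successive differences, which is an
    assumption of this sketch and not how BasisDeVAE assigns clusters.
    """
    diffs = [b - a for a, b in zip(values, values[1:])]
    # Keep only the direction of each step, ignoring changes smaller than tol.
    signs = [1 if d > tol else -1 if d < -tol else 0 for d in diffs]
    nonzero = [s for s in signs if s != 0]
    if not nonzero:
        return "flat"
    if all(s > 0 for s in nonzero):
        return "increasing"
    if all(s < 0 for s in nonzero):
        return "decreasing"
    # Count how many times the direction flips along the trajectory.
    flips = sum(1 for a, b in zip(nonzero, nonzero[1:]) if a != b)
    if flips == 1 and nonzero[0] > 0:
        return "transient"  # rises, then falls
    return "other"


if __name__ == "__main__":
    for name, series in {
        "marker_a": [0.1, 0.4, 0.9, 1.5],
        "marker_b": [2.0, 1.2, 0.7, 0.1],
        "marker_c": [0.0, 0.8, 1.4, 0.9, 0.2],
    }.items():
        print(name, classify_trajectory(series))
```

With real cross-sectional data the ordering itself is unknown, which is the harder part of the problem that the published method addresses; this sketch only shows what "grouping biomarkers by behaviour class" means once an ordering is available.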

The second block of work is due to be published as "Derivative-Based Neural Modelling of Cumulative Distribution Functions for Survival Analysis" at the 25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022). This work introduces a new method for survival analysis, that is, the prediction of a patient's survival time given information about their current health status. Historically, most survival analysis methods have assumed that the patients under consideration are at risk from only a single disease or cause of death. Our method is also applicable when patients are exposed to multiple "competing" risks, and, unlike many other models, it does not rely on restrictive assumptions such as proportional hazards.
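The relationship between a cumulative distribution function, its derivative and the competing-risks setting can be sketched as follows. This is a toy example under stated assumptions, not the published method: a hand-picked parametric curve stands in for a neural network, and a finite difference stands in for whatever differentiation the method actually uses. The names `cumulative_incidence`, `density` and `log_likelihood`, and the `(weight, rate)` parameters, are hypothetical.

```python
import math


def cumulative_incidence(t, weight, rate):
    """Toy monotone curve F_k(t) = weight * (1 - exp(-rate * t)) for cause k.

    Assumption of this sketch: a fixed parametric form replaces a learned model.
    """
    return weight * (1.0 - math.exp(-rate * t))


def density(t, weight, rate, eps=1e-6):
    """Approximate f_k(t) = dF_k/dt with a central finite difference."""
    upper = cumulative_incidence(t + eps, weight, rate)
    lower = cumulative_incidence(t - eps, weight, rate)
    return (upper - lower) / (2.0 * eps)


def log_likelihood(time, event, risks):
    """Log-likelihood contribution of one patient.

    time:  observed event or censoring time
    event: index of the observed cause, or None if the patient was censored
    risks: list of (weight, rate) pairs, one per competing cause,
           with weights summing to at most 1
    """
    if event is None:
        # Censored: the patient has experienced none of the causes by `time`.
        total = sum(cumulative_incidence(time, w, r) for w, r in risks)
        return math.log(1.0 - total)
    weight, rate = risks[event]
    return math.log(density(time, weight, rate))


if __name__ == "__main__":
    risks = [(0.6, 0.5), (0.4, 0.2)]  # two hypothetical competing causes
    print(log_likelihood(1.0, 0, risks))     # event from cause 0 at t = 1
    print(log_likelihood(1.0, None, risks))  # censored at t = 1
```

Working with the cumulative curves directly means an observed event contributes through a derivative and a censored patient through one minus the summed curves, with no proportional-hazards structure imposed on either term.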

The third is ongoing work associated with the OPTIMAL project (https://fundingawards.nihr.ac.uk/award/NIHR202632), within which we are developing machine learning methodology capable of predicting disease progression in the presence of multiple diseases, as well as understanding how treatments interact with this progression.
Exploitation Route The work published and due to be published is now in a state to be applied and developed by those working in the field of health data science, facilitated by open-sourced code. Valuable directions include applying the methods to more diverse datasets and developing the methodology via incorporation of ongoing efforts within deep learning (e.g. improved optimisation, neural network architectures or learning paradigms).
Sectors Digital/Communication/Information Technologies (including Software), Healthcare, Pharmaceuticals and Medical Biotechnology

 
Title BasisDeVAE 
Description Computational implementation of the method described in the publication "BasisDeVAE: Interpretable Simultaneous Dimensionality Reduction and Feature-Level Clustering with Derivative-Based Variational Autoencoders" (published at ICML 2021, https://proceedings.mlr.press/v139/danks21a.html). 
Type Of Material Computer model/algorithm 
Year Produced 2021 
Provided To Others? Yes  
Impact The associated work was presented at ICML 2021, one of the most important conferences in the field of machine learning.
URL https://github.com/djdanks/BasisDeVAE
 
Description PSA Density-based Joint Longitudinal-Survival Analysis 
Organisation University College London
Country United Kingdom 
Sector Academic/University 
PI Contribution I contributed to the statistical analysis of Prostate Cancer Active Surveillance data for the purposes of publication, leading to successful publication in the Prostate Cancer and Prostatic Diseases journal.
Collaborator Contribution Partners at UCL collected the data and defined the scope of the work.
Impact Publication in Prostate Cancer and Prostatic Diseases: "Mapping PSA density to outcome of MRI-based active surveillance for prostate cancer through joint longitudinal-survival models" (https://doi.org/10.1038/s41391-021-00373-w). Abstract at European Urology Conference: "PSA density and clinical outcome in MRI-based active surveillance for prostate cancer: A joint longitudinal-survival analysis".
Start Year 2020