Personalised Medicine through Learning in the Model Space

Lead Research Organisation: University of Birmingham
Department Name: School of Computer Science

Abstract

In order to achieve the goal of truly personalised healthcare and disease treatments tailored specifically for each individual patient, we should be able to understand why a disease appears or progresses, how it happens, where it would happen and how long it will take to happen. It is not an easy task.

Mathematics is playing an ever-increasing role in the area of health and medicine, through the use of predictive modelling, statistics, and virtual simulations. Such mathematical tools are becoming invaluable in testing the feasibility of therapeutic procedures and medical devices prior to clinical trials. Furthermore, over the coming years computer models coupled to patient-specific diagnostics will be used in real time in the clinical environment to directly advise on treatment strategies.

Given the wealth of (often) disconnected biological, epidemiological and environmental information on a disease, and adding on top of this the multiple paths that we as individuals can follow (a change in lifestyle, a geographical change, etc.) and our own individual characteristics (genes, anatomy, weight, age, etc.), it is not surprising that personalised models are difficult to achieve. There is data, information and knowledge that we must be able to connect via mathematical approaches in order to represent the mechanisms of the disease and the unique journey that we all follow. From a modeller's perspective, this is an incredible conundrum: what is important and what is not? How do I formulate the cause-effect relationships with this disparate data if I don't understand how one risk factor or variable relates to another?

The aim of this project is to 'guide' the modeller from the data and to provide personalised models for diagnosis and treatment. Starting from an already existing (partial) explanation of the disease constructed in a mechanistic mathematical way (explanation-based or hypothesis-driven), the information should lead the modeller. In order to do this in a systematic way, we propose that the information be built into so-called "data-driven" models, i.e., models that fit the data but don't explain why. These "data-driven" models are "intelligent": they learn from the data and information that they have. If these "data-driven" models could learn in the same space that the mechanistic models try to explain, a path towards a common understanding of the two approaches could exist. This is the path that we intend to explore and define.

The different levels in personalised medicine that will be considered in this project are the following:
- Cell & organ level: in the context of this project, by 'cell & organ level' we mean the behaviour of individual cells (cell level), the joint behaviour of all cells in a tissue (tissue level) and the combined behaviour of the tissues in an organ (organ level).
- Patient level: by 'patient level' we mean the properties and processes of organs and patients, part of which can be observed through online monitoring, visual inspection, therapy records, etc.
- Care level: by 'care level' we mean the whole of the actions of nurses and doctors, the behaviour of the support systems, the applicable guidelines and policies, etc., which are external to the patient but have a significant impact on the patient's condition.
The developed methods will allow one to perform the following prediction and inference tasks:
- Assessment of risk of a range of potential complications.
- Early warning for and diagnosis of such conditions.
- Simulation of effects of possible treatments for individual patients.

Planned Impact

This project directly addresses the strategic priority of novel treatment and therapeutic technologies by developing new patient-centred model-based predictive and diagnostic tools. It will have an impact on health and quality of life by allowing personalised healthcare for chronic health conditions. In Great Britain, around 17.5 million adults may be living with chronic disease, and 45% of those will suffer from more than one condition. 80% of GP consultations relate to chronic disease, and patients with chronic disease and complications use over 60% of hospital bed days. Chronic disease costs the NHS £7 for every £10 spent on patient care, and the incidence of chronic disease in the over 65s is predicted to double by 2030; the potential social and financial impact of reduced medication, reduced unplanned admissions and reduced hospital stays is clear. Programmes of work on early intervention in both physical and mental health conditions, and modelling processes that accurately categorise patient health states, would enhance efforts to detect illness at the 'sensitive' period in the evolution of the disorder. Our system will aid clinical decision making in our increasingly overstretched healthcare services.

The potential beneficiaries of this work encompass academia, clinicians and patients: the work will deliver substantial mathematical novelty while also being of high clinical value. This will be achieved in a highly multidisciplinary and complementary team across several institutions that specialise not only in cutting-edge methodologies but also in solving real clinical problems. While we will perform the groundwork in the context of two exemplar applications, the general methodology has the potential to serve a broad range of clinical (or indeed non-clinical) applications.

To achieve impact, we will exploit our local links across our institutions, enabling us to foster strong collaborations between the mathematical, machine learning and clinical communities. These collaborations are vital to ensure the translational value of this work. Clinicians will be involved throughout the whole project, from the early phases (including collection of the appropriate data), through the refinement of mechanistic models and learning methods in the model space, up to the evaluation of the overall system. Specific translational links include the University of Birmingham's links to the Queen Elizabeth Hospital and UCL's collaboration with University College Hospitals.

Publications

AL Otaibi S (2016) Kernel regression estimates of time delays between gravitationally lensed fluxes in Monthly Notices of the Royal Astronomical Society

Alimohammadi M (2017) A multiscale modelling approach to understand atherosclerosis formation: A patient-specific case study in the aortic bifurcation. in Proceedings of the Institution of Mechanical Engineers. Part H, Journal of engineering in medicine

Alimohammadi M (2014) Evaluation of the hemodynamic effectiveness of aortic dissection treatments via virtual stenting. in The International journal of artificial organs

Andersson R (2016) Dose-response-time modelling: Second-generation turnover model with integral feedback control. in European journal of pharmaceutical sciences : official journal of the European Federation for Pharmaceutical Sciences

 
Description There are different strands of work within this multi-institution project.

On the fundamental methodology front, we have developed a general framework for classifying partially observed dynamical systems. The dynamical systems of particular interest in this work are biological pathway/mechanistic models, but the framework is general. The key ingredient of this framework is to use the posterior distribution over model parameters to represent the given observation set "through the lenses of the underlying dynamical system".
We developed a distributional classifier for classifying posterior distributions over the models, given the observed data. Crucially, this classifier primarily operates in the model space. We evaluated our framework on two test beds, namely a biological pathway model and stochastic double systems.
The first key finding is that our classifier clearly outperforms the classifier based on the probability product kernel - a state-of-the-art method in the literature - and shows comparable performance to a very recent approach based on Kernel Mean Embedding.
More importantly, we also gained in-depth understanding of the connection between these three quite different approaches. Further, we established a qualitative relation between model uncertainty and classification performance.
Finally, we investigated the scenarios in which the level of model uncertainty could shift from the training phase to the testing phase. This is crucial because, for example, patients in the hospital will typically have more data associated with them, and consequently more certain (peaked) posterior distributions, than outpatients. The framework has to deal with such situations consistently. Our results showed that our classifier could cope with such scenarios better than the two other state-of-the-art classifiers.
Crucially, we found that the performance of our classifier is not impaired when the model used for inferring posterior distributions is not the same as the observation-generating model. This still holds when the inferential model is a significantly reduced form of the true underlying model. This is very important because, in practice, the available patient data is often limited and complex mechanistic models will simply not be constrained enough to allow for any reasonable inference.
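
To make the idea of classifying in the model space concrete, the following is a minimal sketch rather than the project's classifier. It assumes a toy one-parameter system x(t) = exp(-theta * t), a flat prior on a fixed parameter grid and Gaussian observation noise, and it compares posteriors with the probability product kernel at exponent 1/2 (the Bhattacharyya kernel), i.e. the baseline family named above. All function names, the grid and the parameter values are hypothetical.

```python
import math
import random

def simulate(theta, times, noise_sd, rng):
    """Noisy observations of the toy system x(t) = exp(-theta * t)."""
    return [math.exp(-theta * t) + rng.gauss(0.0, noise_sd) for t in times]

def grid_posterior(obs, times, grid, noise_sd):
    """Posterior over theta on a fixed grid (flat prior, Gaussian noise)."""
    log_lik = []
    for theta in grid:
        sse = sum((y - math.exp(-theta * t)) ** 2 for y, t in zip(obs, times))
        log_lik.append(-0.5 * sse / noise_sd ** 2)
    top = max(log_lik)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in log_lik]
    total = sum(weights)
    return [w / total for w in weights]

def bhattacharyya_kernel(p, q):
    """Probability product kernel with exponent 1/2 on discrete distributions."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def classify(posterior, train):
    """Pick the label whose training posteriors are most similar on average."""
    scores = {
        label: sum(bhattacharyya_kernel(posterior, q) for q in posts) / len(posts)
        for label, posts in train.items()
    }
    return max(scores, key=scores.get)

rng = random.Random(0)
times = [0.5 * i for i in range(1, 9)]
grid = [0.05 * i for i in range(1, 61)]      # candidate theta values in (0, 3]
noise_sd = 0.05
true_theta = {"slow": 0.5, "fast": 1.5}      # hypothetical class prototypes

# Each training item is a posterior, not a raw time series.
train = {
    label: [grid_posterior(simulate(theta, times, noise_sd, rng), times, grid, noise_sd)
            for _ in range(5)]
    for label, theta in true_theta.items()
}
test_post = grid_posterior(simulate(1.5, times, noise_sd, rng), times, grid, noise_sd)
predicted = classify(test_post, train)
```

The classifier never sees the observations directly: every subject is represented by a distribution over model parameters, so subjects with different numbers of observations or different sampling times remain comparable.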

The work on atherosclerosis has produced excellent results (as evidenced by the research outputs submitted). As part of the work submitted, there are key findings on haemodynamic and biochemical markers for atherosclerotic plaque location that are completely novel in the literature. These markers have been integrated into a multiscale model for plaque location that includes patient-specific data, which is directly relevant to the Sandpit theme 'Mathematics for Healthcare' that funded this project. They provide strong evidence that complex mathematical relationships amongst key variables/markers are necessary to unravel disease mechanisms. Moreover, work currently in preparation for publication, set in the context of 'virtual populations', provides strong results to explain the effect of pharmacological intervention (statins) in these patients. Current work in progress evaluates the effect of patient adherence to statins. This exemplar is currently being implemented in the mathematical framework of learning in the model space developed by colleagues at the Universities of Birmingham and Exeter, in order to establish whether a classifier built using this framework can distinguish between non-responders and non-adherent patients. This task is currently impossible to achieve using available tools or knowledge, and it *requires* underlying mechanistic information supplied by the model. This key finding has important implications for the NHS and for the pharmaceutical industry, as this technique might be used to understand (and possibly design better) clinical trials.

Our work on learning in the model space in the context of attention-deficit hyperactivity disorder (ADHD) data has produced further validation of the approach. ADHD is a behavioural disorder that affects 5-7% of school-aged children. Several types of medication are frequently prescribed. A clinician is often interested in the efficacy and side effects of the drugs for a given patient, in order to deliver the best treatment, minimise drop-out and maximise medication adherence. Numerous studies have tried, unsuccessfully, to predict treatment response and adverse drug reactions. This is due to three main reasons: 1) Despite decades of research, the underlying pathological mechanism that causes ADHD is not known in detail. 2) Information regarding the disorder and the patients is mostly in the form of subjective questionnaire ratings, clinical notes and qualitative psychometric data; as such, it is not readily analysed by conventional data mining techniques. 3) As is the case with much clinical research, the available dataset is usually small, with a lot of missing data, especially in the temporal domain.
These difficulties have made successful modelling of the disorder an elusive task. We used a novel learning in the model space paradigm to tackle the problems in predicting patient treatment response and, at the same time, to allow one to make a foray into personalised medicine.
We can report several key findings:
1. Our systematic literature review concluded that the research evidence relating to factors associated with treatment response in childhood Attention Deficit Hyperactivity Disorder (ADHD) was inconsistent.
2. Prediction of treatment response was very challenging and, to date, accurate classification of treatment responders at baseline from routine measures had not been achieved. The literature currently contains very few attempts to use machine learning methods to predict ADHD treatment response. We developed a probabilistic model that reflected, in a principled manner, the uncertainty in our current knowledge.
3. Our learning in the model space approach to treatment response prediction in childhood ADHD outperformed conventional statistical approaches in classification of remission, but performed equally or slightly worse in regression (severity prediction). The approach outperformed another study which used a comparable type of information on a different population.
4. Such predictions, based on symptom profile, demographic factors and other data readily available in good routine practice, are currently not accurate enough to be acceptable in clinical practice. The positive predictive value (the proportion of true remission cases among the cases predicted by the method to be in remission) is 36%. This is considerably higher than the 26% achieved by the best performing conventional classification methods, but not yet accurate enough for clinical use. Including neurobiological data (such as genetic and functional brain imaging data) may improve predictive accuracy in the future.
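
For readers less familiar with the metric in finding 4, the positive predictive value can be computed as below. This is only an illustration of the definition: the counts are hypothetical and were chosen solely so that the ratios reproduce the reported 36% and 26%. They are not the study's confusion matrix.

```python
def positive_predictive_value(true_positives, false_positives):
    """Share of predicted remission cases that are true remission cases."""
    predicted_positive = true_positives + false_positives
    if predicted_positive == 0:
        raise ValueError("no cases were predicted positive")
    return true_positives / predicted_positive

# Hypothetical counts per 100 predicted remission cases (not the study data).
ppv_model_space = positive_predictive_value(true_positives=36, false_positives=64)
ppv_conventional = positive_predictive_value(true_positives=26, false_positives=74)
```

At a positive predictive value of 36%, roughly two out of three patients predicted to remit would not do so, which is why the finding stresses that the predictions are not yet fit for clinical use.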

The work was further extended in a completely new direction, in conjunction with the PI's previous BBSRC grant on unified probabilistic modelling of fMRI signals across a variety of spatial and temporal scales. The idea of learning in the model space was employed on two very different data types measured on the same set of subjects trained to perform a cognitive task - namely behavioural and brain imaging (fMRI) data. The subjects were trained on a hierarchy of increasingly complex temporal tasks. We gathered behavioural data, as well as fMRI signals before and after training at each level of task complexity. The learning in the model space framework developed within this grant enabled us to fuse the behavioural and fMRI data in a principled manner - the extracted behavioural models and probabilistic spatial-temporal fMRI models were used in conjunction to construct signatures of neuro-cortical activations distinguishing slow and fast learners.

Recently we have further improved the developed models in several important directions (manuscripts in writing-up stage). First, we no longer make the (rather strong and in general difficult to justify) assumption that the complex posterior distribution over the inferential models can be represented through a discretised distribution over a grid in the model space. The machine learning models we have developed are now capable of processing arbitrary posterior samples over the model space. Second, to enhance model interpretability we have included a principled sparsity imposition mechanism. This makes the resulting machine learning models more transparent and easier to understand. Third, the techniques developed in the healthcare scenario were transferred to a seemingly unrelated scientific discipline - astrophysics. In particular, we have applied the ideas of learning in the model space to the difficult task of galaxy group detection by formulating machine learning directly in the space of galaxy group models. This demonstrates the generality of the mathematics involved in developing such techniques.
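
One generic way to compare posteriors that are available only as samples, with no grid, is the kernel mean embedding mentioned earlier as a comparison method. The sketch below is an assumption-laden illustration of that idea, not the project's models: it uses a Gaussian kernel with a hand-picked bandwidth, two-dimensional synthetic "posterior samples", and the biased estimate of the squared maximum mean discrepancy (MMD). All names and numbers are hypothetical.

```python
import math
import random

def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two parameter vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def embedding_inner_product(xs, ys, bandwidth):
    """Inner product of the empirical kernel mean embeddings of two sample sets."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(xs, ys, bandwidth):
    """Biased estimate of the squared MMD between two sample sets."""
    return (embedding_inner_product(xs, xs, bandwidth)
            + embedding_inner_product(ys, ys, bandwidth)
            - 2.0 * embedding_inner_product(xs, ys, bandwidth))

def draw_samples(mean, sd, n, rng):
    """Stand-in for posterior samples over two model parameters."""
    return [(rng.gauss(mean[0], sd), rng.gauss(mean[1], sd)) for _ in range(n)]

rng = random.Random(1)
subject_a = draw_samples((0.5, 1.0), 0.1, 200, rng)
subject_b = draw_samples((0.5, 1.0), 0.1, 200, rng)   # same region of model space
subject_c = draw_samples((1.5, 0.2), 0.1, 200, rng)   # different region

near = mmd_squared(subject_a, subject_b, bandwidth=0.3)
far = mmd_squared(subject_a, subject_c, bandwidth=0.3)
```

Because the computation uses only pairwise kernel evaluations between samples, it places no requirement on how the posterior samples were produced or on their number per subject.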
Exploitation Route In the atherosclerosis strand of work, the key finding may have important implications for the NHS and for the pharmaceutical industry, as this technique might be used to understand (and possibly design better) clinical trials. Our recent model extensions will be able to unify in a principled way behavioural and brain imaging data collected while subjects are in the process of learning a cognitive task - thus potentially helping us to better understand the spatial-temporal changes of cortical representations underpinning learning.

The learning in the model space framework, which enabled us to unify in a principled way behavioural and brain imaging data collected while subjects are in the process of learning a cognitive task, can provide a powerful new way of understanding the spatial-temporal changes of cortical representations underpinning learning.
Sectors Healthcare,Other

 
Description In an era in which automated data-driven technologies are increasingly applied in healthcare, this project had at its heart the need for transparency, explainability and mechanistic understanding of the underlying biological processes as integral parts of predictive modelling and decision making. It extended the vision of machine learning in healthcare with the option of learning from the available data but in the space where it matters - the space of mechanistic, understandable models of the processes behind the observed data. Indeed, the emphasis on interpretability and transparency in automated learning has grown immensely since then. Learning in the model space has been taken up as a research topic in various research groups around the world (e.g. Bielefeld University, Germany; Politecnico di Milano, Italy; Groningen University, Holland) and has led to applications beyond healthcare (e.g. in detection of non-standard events in Barcelona's water system). The idea is now being developed in the context of multi-morbidity within an NIHR-funded multi-institutional project. The full development of the seeds sown by the project will require further intensive work, but is sorely needed as an alternative to, and a modelling companion of, deep learning approaches in healthcare.
First Year Of Impact 2017
Sector Healthcare,Other
Impact Types Policy & public services

 
Description A UK Quantitative Systems Pharmacology Network
Amount £159,456 (GBP)
Funding ID EP/N005481/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 12/2015 
End 11/2018
 
Description CoEvolFramework - Unified Framework for the Analysis of Co-evolutionary
Amount € 195,455 (EUR)
Funding ID 657027 
Organisation European Commission 
Department Horizon 2020
Sector Public
Country European Union (EU)
Start 02/2016 
End 01/2018
 
Description Cross-scale prediction of Antimicrobial Resistance: from molecules to populations
Amount £501,000 (GBP)
Funding ID EP/M027503/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 01/2016 
End 12/2017
 
Description EPSRC Impact Acceleration Account (IAA) Impact & Knowledge Exchange Award
Amount £30,000 (GBP)
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 03/2018 
End 08/2018
 
Description LeSoDyMAS - Learning in the Space of Dynamical Models of Adrenal Steroidogenesis.
Amount € 185,455 (EUR)
Funding ID 659104 
Organisation European Commission 
Department Horizon 2020
Sector Public
Country European Union (EU)
Start 07/2015 
End 06/2017
 
Description MRC Clinical Research Training Fellowship
Amount £100,000 (GBP)
Funding ID MR/R017913/1 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 09/2018 
End 08/2020
 
Description Marie Sklodowska-Curie Innovative Training Networks
Amount € 3,700,000 (EUR)
Organisation European Commission 
Department Horizon 2020
Sector Public
Country European Union (EU)
Start 05/2017 
End 04/2021
 
Description Marie Sklodowska-Curie Individual Fellowships
Amount € 195,455 (EUR)
Funding ID 657027 
Organisation European Commission 
Department Horizon 2020
Sector Public
Country European Union (EU)
Start 02/2016 
End 01/2018
 
Description ProMoS Probabilistic Models in Pseudo-Euclidean Spaces.
Amount € 221,606 (EUR)
Funding ID 327791 
Organisation European Commission 
Department Seventh Framework Programme (FP7)
Sector Public
Country European Union (EU)
Start 01/2014 
End 12/2015
 
Title Detailed mechanistic model of atherosclerosis 
Description Several mathematical models were developed through a refinement process together with the groups at King's College, the University of Birmingham and Exeter. A research protocol was created and ethics approval granted to gather real data to test our models; however, data collection was delayed due to the relocation of researchers assigned to this task. To address this, two 'complex', mechanistic, multiscale models of plaque and of the effect of drugs (in this case statins) on plaque development were developed to provide 'synthetic data' for testing another, reduced model that still represents key biological mechanisms and the effect of the drug on plaque development. This reduced model was then introduced as part of the mathematical framework (i.e., 'classifier learning in the model space'). The modelling approach uses different mathematical modelling paradigms in order to describe the relevant dynamic behaviours observed in the biological/physiological data and clinical trials. A combination of continuous models (e.g. differential equations) with discrete event models (e.g. Markov chains) makes it possible to simulate the pharmacokinetics of statins and their effect on the dynamics of lipoproteins (e.g. LDL) and the inflammatory pathway, while simultaneously exploring the effect of flow-related variables (e.g. wall shear stress) on atherosclerosis progression under different patient scenarios in which the effect of medication adherence can be analysed. The combination of the mechanistic model and the classifier will address the 'task' of separating non-responders to statins from non-adherent patients. This relies heavily on the ability of this novel type of classifier to 'gather' underlying mechanistic information from the mathematical model. A further (final) model will be tuned in the last stage of the project with the data collected from the protocol designed by UCL, in order to complete the same task using real data from a small cohort of patients.
Type Of Material Model of mechanisms or symptoms - human 
Provided To Others? No  
Impact The tool is still being extended and refined. After the development process is fully finished, we will make it publicly available. 
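
As a rough illustration of how a continuous model can be combined with a discrete event model for adherence, the sketch below couples a two-state Markov chain for daily dose-taking with a one-compartment drug model and a simple LDL turnover equation, integrated with explicit Euler steps. It is a toy sketch under stated assumptions and not the project's multiscale model: the equations, the saturating inhibition term, the function name and every parameter value are hypothetical.

```python
import random

def simulate_patient(p_stay_adherent, p_resume, days, rng, steps_per_day=24):
    """Return (mean LDL over the run, fraction of daily doses taken)."""
    # Hypothetical parameters in arbitrary units.
    k_elim = 1.0                  # drug elimination rate, per day
    dose = 1.0                    # concentration jump per dose taken
    k_out = 0.2                   # LDL turnover rate, per day
    baseline_ldl = 4.0            # untreated steady-state LDL
    k_in = k_out * baseline_ldl   # LDL production giving that steady state
    e_max, ec50 = 0.5, 0.3        # saturating inhibition of LDL production
    dt = 1.0 / steps_per_day

    conc, ldl = 0.0, baseline_ldl
    adherent = True
    doses_taken = 0
    ldl_sum = 0.0
    for _ in range(days):
        # Discrete event part: two-state Markov chain for today's dose.
        p_take = p_stay_adherent if adherent else p_resume
        adherent = rng.random() < p_take
        if adherent:
            conc += dose
            doses_taken += 1
        # Continuous part: explicit Euler steps of the two ODEs.
        for _ in range(steps_per_day):
            inhibition = e_max * conc / (ec50 + conc)
            d_conc = -k_elim * conc
            d_ldl = k_in * (1.0 - inhibition) - k_out * ldl
            conc += dt * d_conc
            ldl += dt * d_ldl
            ldl_sum += ldl
    return ldl_sum / (days * steps_per_day), doses_taken / days

rng = random.Random(42)
ldl_adherent, taken_adherent = simulate_patient(1.0, 1.0, 180, rng)
ldl_erratic, taken_erratic = simulate_patient(0.5, 0.2, 180, rng)
```

In a simulation of this kind, a non-responder could be represented by a lower value of e_max with full adherence. Poor response and poor adherence then both raise the average LDL, which is the ambiguity the classifier described above is meant to resolve.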
 
Description Classification in Riemannian model space 
Organisation Chinese Academy of Sciences
Country China 
Sector Public 
PI Contribution We initiated the project by outlining the need and basic structure of prototype based classification in complex model spaces that have Riemannian structure.
Collaborator Contribution Our partner (a team led by Dr Fengzhen Tang, Shenyang Institute of Automation, CAS) has provided a detailed derivation of the model and run its verification on synthetic and real data.
Impact First accepted journal paper at IEEE Transactions on Neural Networks and Learning Systems - F. Tang, F. Mengling, P. Tino: "Generalized Learning Riemannian Space Quantization: a Case Study on Riemannian Manifold of SPD Matrices"
Start Year 2018
 
Description New project funded by NWO (Netherlands Organisation for Scientific Research) - direct follow-up on this project. 
Organisation University of Groningen
Country Netherlands 
Sector Academic/University 
PI Contribution New project funded by NWO (Netherlands Organisation for Scientific Research) awarded to my collaborator Dr Kerstin Bunte (VIDI, EURO 800k, PI Dr Kerstin Bunte, Groningen University). The project is built on the findings of the EPSRC project - Personalised Medicine through Learning in the Model Space and will further develop the ideas of formulating machine learning in the space of dynamical models.
Collaborator Contribution The Dutch partners explore the ideas of automated task-driven inferential model simplification - this was not fully developed in the original project, but is needed if the holistic vision of the project is to be fully realised.
Impact Journal publication: K. Bunte, D.J. Smith, M.J. Chappell, Z.K. Hassan-Smith, J.W. Tomlinson, W. Arlt, P. Tino: Learning Pharmacokinetic Models for in vivo Glucocorticoid Activation. Journal of Theoretical Biology, 455, pp. 222-231, DOI:10.1016/j.jtbi.2018.07.025, 2018. This is a highly multidisciplinary project involving mathematical modelling (DJS, University of Birmingham; MJC, Warwick University), machine learning (PT, University of Birmingham; KB, Groningen University), and bio-medical sciences (WA, Z.K.H-S, JWT, University of Birmingham).
Start Year 2017
 
Description Invited Talk at IJCAI-15 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Invited Talk as part of the Machine Learning Track at IJCAI-15
Year(s) Of Engagement Activity 2015
 
Description Interview for the Euro News channel 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact During the final workshop for the FP7 funded EU project "AlterEgo", the Euronews channel filmed this interview. Interview for the Euro News channel feature: http://www.euronews.com/2016/10/31/avatars-help-schizophrenia-patients-silence-tormenting-voices
Year(s) Of Engagement Activity 2016
 
Description POEMS Workshop
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The main aim of the event was to stimulate new collaborations within the POEMS network. In the morning of each day, there were plenary talks from invited speakers including Philip Maini (Oxford), Stafford Lightman (Bristol), and Shervanthi Homer-Vanniasinkam (Leeds). These talks were followed by a programme centred around five topic group themes, which will identify opportunities for new collaborative activity within each theme.
Year(s) Of Engagement Activity 2015
URL https://www.eventbrite.co.uk/e/poems-workshop-a-vision-for-mathematics-in-healthcare-registration-15...
 
Description Tutorial at IJCNN 2015 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact 2 hour Tutorial on Learning in Indefinite Proximity Spaces at IJCNN 2015 (Int Joint Conf on Neural Networks), Killarney, Ireland.
Year(s) Of Engagement Activity 2015
 
Description Tutorial at IJCNN 2015 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact 2 hour Tutorial on Dynamical Systems and Learning in the Model Space at IJCNN 2015, Killarney, Ireland.
Year(s) Of Engagement Activity 2015
 
Description Tutorial at AI 2013 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact 4 hour Tutorial on Theory and Applications of State Space Models for Time Series Data at AI 2013 (Australasian Conference on Artificial Intelligence), 3-6 December, 2013, Dunedin, New Zealand
Year(s) Of Engagement Activity 2013