Reproducible development of mathematical models of cardiac electrophysiology

Lead Research Organisation: University of Oxford
Department Name: Computer Science

Abstract

The aim of "systems biology" is to understand how biological systems (e.g. cells, organs, people) work as a collection of parts by using mathematical modelling. We describe the behaviour of the parts in the form of mathematical equations using the laws of physics and chemistry, and then see how the behaviour of larger systems emerges from this. Many systems biology models for specific components have been published, but there remain significant challenges in exploiting them to understand systems as a whole. Which existing models (if any) are most appropriate for a new scientific question? How does each model behave in different situations? How well do they capture what the real systems do? At present it is very difficult to answer these questions without first downloading each model, writing programs to perform different simulated experiments, and then writing more code to compare and visualize the results. There has been nowhere to look up even simple properties for different models. We propose to build upon a pilot implementation of a system that enables such tasks to be done automatically, with results published on a website.

Our approach will be demonstrated in perhaps the most mature area of systems biology: the electrical activity of heart cells, for which the first model was published in 1960 and well over 100 models are now available in public databases. The models have been hugely important in giving insights into how the heart works (and what can go wrong due to disease, age or drug side effects) and have helped in developing new treatments.

The first step is to compare the different model predictions to measurements from real cells, to tell us whether we really understand how the heart's cells work. We will link real measurements to our recipes for performing equivalent experiments on the computer models, and provide an interface to display the results, indicating how well they agree both qualitatively (general appearance) and quantitatively (how well the numbers match).
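A quantitative agreement score of the kind described can be as simple as a root-mean-square error between a recorded trace and a simulated one. The sketch below is purely illustrative: the voltage values are made up, and real comparisons in the Web Lab would involve full simulated experiments rather than hard-coded lists.

```python
import math

# Hypothetical traces: a measured membrane-voltage recording and a model
# prediction sampled at the same time points (values are invented).
measured = [-85.0, 20.0, 5.0, -10.0, -40.0, -85.0]
predicted = [-85.0, 25.0, 2.0, -12.0, -45.0, -84.0]

def rmse(a, b):
    """Root-mean-square error: one simple quantitative agreement score."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

score = rmse(measured, predicted)  # smaller means closer agreement
```

In practice such a score would be computed automatically for every model/protocol pair and displayed alongside qualitative plots of the traces.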

Cardiac models often have dozens of equations containing hundreds of parameters - key numbers governing how the models behave. How these parameters were worked out from experimental recordings (data) is, more often than not, unclear. Since many models reuse components from previous models, the original methods and data may no longer be available to anyone. This causes big problems for building on these models - if for instance we want to adapt a model to a new cell type, there is no record of which experiments were performed, or how these were analysed to produce the parameters and equations in the final model. We will extend our recipes to capture this information as well, and so be able automatically to re-calibrate a model to a given set of experiments.

Crucially, our tools will use the variability in experimental measurements to calculate how model parameters are likely to vary between different cells in a heart, or between different people, and to explain how this variation affects predictions.

Three case studies will drive development, looking at different kinds of model to give a broad picture of needs. Feedback from the wider community and an external advisory board of experts in cardiac electrophysiology will also be incorporated, building on the success of our first user workshop in September 2015.

The final output will be a user-friendly online system - a "Cardiac Electrophysiology Web Lab" - providing an open community resource for researchers to use. We will also write training materials and run further workshops to help these researchers use it. Our resource will make it easier to reuse or extend existing models in appropriate ways, to develop new models, and to understand differences between heart cells. The tools will increase the impact of modelling for replacing animal experimentation and testing, e.g. in drug trials.

Technical Summary

It is said that all models of biology are wrong, but some may still be useful. A key challenge for systems biology is to characterise how wrong different models are, and what they are useful for. Models are developed to answer specific scientific questions, and the process of model selection, parameterisation and evaluation is typically manual and laborious. There is no straightforward means to determine which (if any) model is the most appropriate to answer a new question, or be used as a component in a larger model. We aim to make the process of model development documented, automated and repeatable, so that models can easily be tested and updated to incorporate new data. This will be demonstrated in the systems biology sub-domain of cardiac electrophysiology.

We have prototyped a fully-open online resource that can automatically apply any experimental protocol to any myocyte model, making it easy to assess what behaviours a model is capable of producing (or what responses can be elicited given relevant stimuli) and so compare different models (i.e. hypotheses). Here we propose to take this to full maturity by linking experimental data to protocol descriptions, enabling researchers to assess the extent to which different models can capture experimentally observed properties, both qualitatively and quantitatively depending on available data. We will then automate the process of parameterisation of cardiac cell models from data by using state-of-the-art Bayesian inference methods. The result will be a community resource for cardiac researchers to develop, reproduce, and compare mathematical models of cardiac cell electrophysiology with full confidence in the phylogeny and robustness of those models.
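The parameterisation step can be illustrated with a minimal, self-contained sketch: fitting the rate constant of an exponentially decaying current to noisy synthetic observations with a random-walk Metropolis sampler. Everything here is hypothetical (the model, the data, the sampler settings) and this is not the PINTS API; it only shows the shape of the Bayesian inference task.

```python
import math
import random

random.seed(1)

# Hypothetical "experiment": a current decaying as I(t) = exp(-k * t),
# observed at discrete times with Gaussian measurement noise.
TRUE_K, NOISE_SD = 2.0, 0.05
times = [i * 0.05 for i in range(40)]
data = [math.exp(-TRUE_K * t) + random.gauss(0.0, NOISE_SD) for t in times]

def log_likelihood(k):
    """Gaussian log-likelihood of the data given decay rate k."""
    if k <= 0:
        return -math.inf
    sse = sum((d - math.exp(-k * t)) ** 2 for t, d in zip(times, data))
    return -sse / (2 * NOISE_SD ** 2)

def metropolis(n_iter=5000, k0=1.0, step=0.1):
    """Random-walk Metropolis sampling of the posterior over k (flat prior)."""
    k, ll = k0, log_likelihood(k0)
    samples = []
    for _ in range(n_iter):
        k_new = k + random.gauss(0.0, step)
        ll_new = log_likelihood(k_new)
        if math.log(random.random()) < ll_new - ll:  # accept/reject step
            k, ll = k_new, ll_new
        samples.append(k)
    return samples

samples = metropolis()
burned = samples[1000:]            # discard burn-in
k_hat = sum(burned) / len(burned)  # posterior mean estimate of k
```

The posterior spread over `k`, not just its mean, is what makes this approach useful for capturing cell-to-cell variability as described above.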

Work will be based around the needs of three case studies: models of human cardiac contraction, the fast and late sodium currents, and hERG channel currents. We will incorporate feedback from an expert panel (including Denis Noble and Yoram Rudy) and the wider community.

Planned Impact

Addressing the lack of reproducibility of both wet and dry biomedical research is of vital importance to the pharmaceutical, biotechnology and biomedical device industries, and hence to society. Lack of reproducibility has been highlighted in many recent studies (with special issues in Science [1] and Nature [2]) and perhaps most graphically in the reported failure by the biotechnology company Amgen to reproduce the results of 47 out of 53 papers describing landmark cancer studies [3]. It is estimated that each failed attempt by a pharmaceutical company to reproduce the results in a published paper costs that company between $500K and $2M [4]. Such failures have led some industry figures to suggest that if industry-sponsored research within universities turns out not to be reproducible, the funding for that research should be returned to the industry sponsor [5].

As illustrated in [6], lack of reproducibility also afflicts cardiac electrophysiology models. Our new approach will address this issue by bringing together (for the first time) the experimental data used to build and validate these models within a fully-open community platform, allowing models to be "engineered" in a fully rigorous and reproducible manner. Since these models are about to be incorporated into the regulatory framework for the safety testing of all new drugs via the FDA's Comprehensive in-vitro Proarrhythmia Assay (CiPA) initiative, this work is particularly timely and will have enormous societal and economic impact.

This impact extends to all uses of cardiac models that depend on reliable predictions. For instance, clinical electrophysiology laboratories routinely use simulation studies to understand and optimise novel methods and devices. Patient-specific computational meshes can now be derived from MRI and CT imaging, and regions of myocardial infarction with damaged electrophysiological properties can be incorporated into the resulting models. Trials are underway at Johns Hopkins University to see whether simulations can predict the best sites for ablation treatments.

We will build on existing links with the pharmaceutical industry and the FDA to achieve direct impact during the project. Our previous work with the FDA influenced the introduction of the new CiPA initiative, and the Web Lab will support its implementation. Our work with GSK and Roche has direct impact on drug development, streamlining safety assessment by providing more accurate assessment of pro-arrhythmic risk earlier in the drug discovery process.

Three complementary approaches will be taken to the long-term sustainability and impact of this resource. Firstly, we will facilitate the user community in becoming self-supporting. Secondly, the Web Lab will be fully integrated with existing resources such as the Physiome Model Repository, becoming part of their curation process and funded through long-term research infrastructure support. Thirdly, we will work with publishers to provide added value for papers describing models. The latter two options are being investigated with Peter Hunter at the University of Auckland, who is a member of our advisory board. We are also discussing the publishing option with Oxford University Press.

More broadly, the Web Lab approach will provide an illustrative exemplar of how rigorous and automated model engineering can enable, support and sustain the reproducibility of computational models in systems biology, one that can be built upon in other application domains.

[1] http://www.sciencemag.org/site/special/data-rep
[2] http://www.nature.com/news/reproducibility-1.17552
[3] Raise standards for preclinical cancer research. Nature 483:531-533 (2012)
[4] The Economics of Reproducibility in Preclinical Research. PLoS Biol 13(6):e1002165 (2015)
[5] An incentive-based approach for improving data reproducibility. Sci Transl Med 8:336ed5 (2016)
[6] The Cardiac Electrophysiology Web Lab. Biophys J 110:292-300 (2016)
 
Description The project is in its second year. We have established a blueprint for the project, held initial discussions with our potential collaborators, and written a first paper (which is under review). The publication describing our open-source software package (Probabilistic Inference on Noisy Time Series - PINTS) - see URL below - is nearing completion.
Exploitation Route We will build on the work ourselves initially, but it should then be of interest to the research community in general.
Sectors Digital/Communication/Information Technologies (including Software); Healthcare; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology

URL https://github.com/pints-team/pints
 
Title PINTS 
Description PINTS (Probabilistic Inference on Noisy Time Series) is an open-source software package being developed for embedding within the Web Lab; it is also freely available on GitHub.
Type Of Material Improvements to research infrastructure 
Year Produced 2018 
Provided To Others? Yes  
Impact Impact to date has mainly been on our own work. We came second in a machine-learning conference competition using the software. We anticipate greater impact next year.
URL https://github.com/pints-team/pints
 
Description University College London. 
Organisation University College London
Department Joint Research Office
Country United Kingdom 
Sector Academic/University 
PI Contribution Prior to the commencement of the grant, the Researcher Co-Investigator (Dr Jonathan Cooper) was appointed to a post in the Research Software Development Group at UCL. With EPSRC's permission we therefore set up a collaboration with UCL. The Computational Biology Group in Oxford is undertaking the scientific research side of the project.
Collaborator Contribution The Research Software Development Group at UCL is undertaking the infrastructure research and development side of the project.
Impact A first paper is under review and a workshop will take place in June 2018.
Start Year 2017
 
Title PINTS 
Description PINTS (Probabilistic Inference on Noisy Time Series) is an open-source software package being developed for embedding within the Web Lab; it is also freely available on GitHub.
Type Of Technology Software 
Year Produced 2018 
Open Source License? Yes  
Impact Mainly on our own work to date. 
URL https://github.com/pints-team/pints
 
Description HARMONY Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The workshop has been organised and will take place in June 2018.
Year(s) Of Engagement Activity 2018
URL http://co.mbine.org/events/HARMONY_2018