The Parameter Optimisation Problem: Addressing a Key Challenge in Computational Systems Biology

Lead Research Organisation: University of Exeter
Department Name: Engineering Computer Science and Maths

Abstract

In recent years, advances in both the physical and life sciences have increasingly come from collaboration between researchers across disciplines, and from the development and use of tools from a range of areas. A prototypical example of this interdisciplinary approach to science is systems biology, the field concerned with quantifying how the interaction of individual system components controls biological function and behaviour. Systems biology has become increasingly quantitative, with a shift from diagrammatic representations of interaction networks to sets of mathematical equations that model (i.e. simulate) how the concentrations of molecular species vary with time. A key advantage of such models is that they can be used to predict how the networks they represent will respond to specific perturbations, such as changes in environmental conditions (e.g. temperature) or the addition of pharmacological agents. The ability to easily generate such predictions reduces the need for large numbers of expensive and time-consuming experiments.

However, the more complex a biological network is, the more complex the corresponding model needs to be, and the greater the range of possible biological behaviours that can be exhibited. This means that extensive computer simulations are needed to adjust the parameters controlling the model so as to accurately reproduce (i.e. fit) the experimental behaviour observed. For biologically realistic models which can involve hundreds of different molecular species, the number of simulations required to adjust the parameters of a given model to achieve the optimal fit to data can be prohibitively large, far exceeding that which is possible on practical timescales. Thus, for the predictive power of mathematical models to be fully realised in the systems biology domain, methods are required that allow this parameter optimisation procedure to be carried out in a computationally efficient manner.
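As a minimal illustration of the fitting procedure described above (not the project's own method), the sketch below adjusts two parameters of a toy one-species model, dx/dt = k_syn - k_deg * x, so that its simulated time course matches synthetic data. The model, the names `simulate`, `cost` and `fit`, and the simple mutate-and-keep-if-better search are all assumptions chosen for brevity. Every candidate parameter set costs one full simulation, which is why the number of simulations becomes the bottleneck for realistic models.

```python
import random


def simulate(k_syn, k_deg, x0=0.0, dt=0.05, t_end=5.0, n_samples=11):
    """Euler-integrate dx/dt = k_syn - k_deg * x; return n_samples evenly spaced values."""
    steps = int(round(t_end / dt))
    every = steps // (n_samples - 1)
    x = x0
    samples = [x]
    for step in range(1, steps + 1):
        x += dt * (k_syn - k_deg * x)
        if step % every == 0:
            samples.append(x)
    return samples


def cost(params, data):
    """Sum of squared differences between the simulated and observed time course."""
    sim = simulate(*params)
    return sum((s - d) ** 2 for s, d in zip(sim, data))


def fit(data, start=(1.0, 1.0), n_iter=2000, seed=0):
    """Hypothetical mutate-and-keep-if-better search over (k_syn, k_deg)."""
    rng = random.Random(seed)
    best = list(start)
    best_cost = cost(best, data)
    step = 0.5
    for _ in range(n_iter):
        # Perturb both parameters, keeping them strictly positive.
        cand = [max(1e-6, p + rng.gauss(0.0, step)) for p in best]
        cand_cost = cost(cand, data)  # one full simulation per candidate
        if cand_cost < best_cost:
            best, best_cost = cand, cand_cost
        else:
            step = max(1e-4, step * 0.995)  # shrink the mutation size after a failure
    return best, best_cost


# Synthetic "experimental" data generated from assumed true parameters.
data = simulate(2.0, 0.5)
initial_cost = cost((1.0, 1.0), data)
fitted, fitted_cost = fit(data)
print("fitted parameters:", fitted, "cost:", fitted_cost)
```

With two parameters a few thousand simulations are affordable; the same loop applied to a model with hundreds of species and parameters is what motivates more efficient optimisers.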

The proposed project will address this need by bringing to bear on the problem state-of-the-art methods from computer science that have previously been applied successfully to highly parametrised problems such as aircraft conflict alert systems, design optimisation of lightweight materials and routing of mesh sensor networks (amongst others). In addition, we propose to develop new methods specifically engineered for the systems biology domain that can provide insight into model behaviour, beyond simply returning a single estimate of the best-fit parametrisation (e.g. methods for identifying parameters yielding equally good fits to data, and also parameters which simultaneously fit the model to data generated in diverse experimental conditions). As part of this, we will develop a package of open source software tools that will be embedded within a software infrastructure designed for systems biologists, enabling the methods developed in this work to be readily applied to problems in the field that are currently computationally intractable.
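The two ideas mentioned above, parameter sets that fit equally well and fitting several datasets at once, can be illustrated with a small sketch. It is an assumption-laden toy, not the methods the project proposes: the model is the closed-form solution of dx/dt = k_syn - k_deg * x with x(0) = 0, the data are synthetic, and the hypothetical helper `near_optimal` simply scans a parameter grid. Data recorded only near steady state constrain the ratio k_syn / k_deg, so many parameter pairs fit equally well; adding a second dataset covering the early transient removes the ambiguity.

```python
import math


def model(k_syn, k_deg, t):
    """Closed-form solution of dx/dt = k_syn - k_deg * x with x(0) = 0."""
    return (k_syn / k_deg) * (1.0 - math.exp(-k_deg * t))


def sse(params, dataset):
    """Sum of squared errors of the model against one dataset of (t, x) pairs."""
    k_syn, k_deg = params
    return sum((model(k_syn, k_deg, t) - x) ** 2 for t, x in dataset)


def near_optimal(datasets, grid, tol=1e-3):
    """Grid points whose cost, summed over all datasets, is within tol of the best."""
    costs = {p: sum(sse(p, d) for d in datasets) for p in grid}
    best = min(costs.values())
    return [p for p, c in costs.items() if c <= best + tol]


# Assumed true parameters and two synthetic, noise-free datasets.
true = (2.0, 0.5)
late = [(t, model(*true, t)) for t in (40.0, 50.0, 60.0)]      # near steady state
early = [(t, model(*true, t)) for t in (0.5, 1.0, 2.0, 4.0)]   # transient phase

# Candidate (k_syn, k_deg) pairs on a regular grid that contains the true values.
grid = [(0.25 * i, 0.0625 * j) for i in range(1, 33) for j in range(1, 33)]

fits_late_only = near_optimal([late], grid)
fits_joint = near_optimal([late, early], grid)
print(len(fits_late_only), "equally good fits from late data alone")
print(len(fits_joint), "fit(s) when both datasets are combined:", fits_joint)
```

A grid scan is only feasible here because there are two parameters; it stands in for whatever search method would be used on a realistic model.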

To test and refine the algorithms developed, they will be applied to the gene network that generates circadian oscillations (the circadian clock) in the key plant species Arabidopsis thaliana, for which high-quality experimental data recorded in a range of genetic and environmental backgrounds is available, together with a suite of mathematical models of varying complexity. As part of this work, biochemically detailed models of the clock will be directly fitted to multiple experimental datasets for the first time, yielding models with greater predictive power. Many processes critical for plant growth and reproduction are regulated by the clock (e.g. photosynthesis and flowering time). In the long term, the ability to optimise plant models of increasing complexity with the class of methods we will develop here may thus help predict how the viability of economically important crop species will be affected by future temperature shifts resulting from climate change.

Planned Impact

Economic Impact

Systems biology is one of the UK government's core research themes, with significant potential in economically critical areas such as medicine, biofuels and crop breeding. The Royal Academy of Engineering and the Academy of Medical Sciences have described systems biology as a vehicle for "advancing knowledge and building the nation's wealth", highlighting the construction of predictive mathematical models of complex dynamical networks as a fundamental objective. For the field to be successful in exploiting mathematical modelling approaches to understand and design biological systems, it is critical to develop model-fitting techniques that can: (i) cope with highly parametrised systems (i.e. large numbers of biochemical species); and (ii) combine the information provided by multiple experimental datasets. This project addresses this fundamental need, and although the empirical evaluation of the project outputs focuses on circadian clock networks, the algorithms developed and software released can be broadly applied across the systems biology domain. The workshops delivered as part of the project will explicitly encourage the uptake of the developed tools by industrial and academic researchers from diverse application areas.

The outcomes of the proposed work will therefore impact on the ability of systems biology researchers in academia and industry to produce technologies that deliver a more sustainable and healthy future. As a specific example of this potential impact, recent results show that plant breeders have unwittingly modified several clock genes in major crops, including wheat and barley, over hundreds to thousands of years. This process progressively adapted flowering and harvest times to more northerly latitudes, allowing the cereal crops to spread across Europe from the Near East, and by extension facilitated the population expansion in these areas. The clock genes may well be needed again, as climate change brings new combinations of temperature and photoperiod, affecting phenological timing in both agricultural and natural ecosystems. It is anticipated that the optimisation methods developed during the project will be subsequently employed to construct large-scale temperature-dependent models of the plant clock that will be able to simulate such genetic and environmental changes. These models may thus help predict how future climate shifts will affect the ability of crops to survive, grow and reproduce, and the corresponding optimal genetic modifications for sustaining viable yields.

There are therefore many potential economic beneficiaries of this research. Companies (and universities) working in the medical, energy and agricultural sectors can exploit the outputs of this project to fit and interrogate their models, as part of their product development. Governments and the public will benefit in areas such as food security and energy security (through e.g. improvements in crop yield/resistance and biofuels). Consumers will benefit in a range of areas, from cheaper food and fuel to advances in health.

Social Impact

To engage groups outside academia and industry with the research, public engagement activities will be organised to highlight the societal and economic implications of the project outputs, and, more broadly, the key role played by mathematics and computer science in addressing 21st century challenges. In addition, project results will be incorporated into taught undergraduate and postgraduate courses at Exeter and Edinburgh, highlighting the extent to which computational techniques are crucial to cutting-edge research in systems biology. Students on these modules will also benefit from being exposed to the various job opportunities that exist in the biomedical and biotechnology sectors for graduates from the physical sciences. Outreach activities will also be organised for local schoolchildren in Exeter, showcasing how mathematics is used to model natural phenomena.

Publications

Title: Local Optima Network Generator (Java package)
Description: Software package provides core tool and interfaces for generating Local Optima Networks for fitness landscape analysis, using either exact or approximate methods. (Generic) implementation in the Java language.
Type Of Technology: Software
Year Produced: 2018
Open Source License: Yes
Impact: I have been made aware that a team from the Department of Decision Sciences, University of South Africa (led by Dr Malan) is using this library for investigating feature selection problems, and also to pipeline with visualisation code from the University of Stirling (Prof. Ochoa).
URL: https://github.com/fieldsend/local_optima_networks
 
Description: Organised the Workshops on Evolutionary Algorithms for Problems with Uncertainty (at GECCO 2018 in Kyoto, and GECCO 2019 in Prague)
Form Of Engagement Activity: Participation in an activity, workshop or similar
Part Of Official Scheme: No
Geographic Reach: International
Primary Audience: Professional Practitioners
Results and Impact: Project team (plus Prof Juergen Branke from the University of Warwick) organised the first Workshop on Evolutionary Algorithms for Problems with Uncertainty, disseminating new work on coping with uncertainty and noise in optimisation. Workshop was a success with roughly 40 academics and practitioners participating, and will be running again in 2019.
Year(s) Of Engagement Activity: 2018, 2019
URL: http://eapwu.ex.ac.uk/