Bridging systems biology and advanced computing, to realise multi-scale biological modelling.

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Biological Sciences

Abstract

The life sciences are in the midst of an unparalleled expansion, so much so that the 21st century has been termed "the century of biology". This interchange Fellowship will recruit experience from a world-leading centre for computational science, EPCC, personified by Alastair Hume, one of its senior software architects, to build a long-term partnership that maintains the U.K.'s advantage at the forefront of Biology. Such expertise is urgently required, because mathematical and computational modelling is the next driver of progress for a broad and growing swathe of life science research.

The publication of a ground-breaking "whole-cell" model (Karr et al., 2012) has galvanised this research in the field of Systems Biology, by representing the function of every gene in a (very simple) cell. This landmark in multi-scale modelling bridges from the genome to cell function. Physiological models already exist that span from cellular to organ and organism level, not only in the pioneering mammalian heart model but also in the plant-level models of crop science. Thus the "whole-cell" model is a crucial step towards a link from genotype to phenotype or from genome sequence to clinical traits. The plant models link even further, to the field scale and to larger agricultural and ecological models that routinely contribute to crop and climate forecasting.

The whole-cell model's authors are among many to stress that biologists must collaborate with computer scientists in data curation, model integration, accelerated computation and data analysis (Macklin et al.). Standards and software must constantly evolve to keep pace. Working, exemplar models are essential to specify the next-generation solutions. Researchers at SynthSys are using the "whole-cell" exemplar model and developing models for more complex cells. SynthSys was established by BBSRC and EPSRC as one of six UK Systems Biology centres, to focus on modelling cell and molecular biology. Andrew Millar's group has developed a whole-plant Framework Model of Arabidopsis growth, building on this major BBSRC investment. It is not only one of the very few exemplars that can be used to specify future computational infrastructure, but is also supporting a growing link from fundamental Systems Biology to Crop Science.

We are privileged to work with Alastair Hume, who ideally meets the urgent needs of our current research. Hume has over 20 years' experience in software development in industry and academia. While working at EPCC over the last 14 years, he has led the design and implementation of scalable data integration, processing and analysis systems for a variety of research fields ranging from astronomy to environmental modelling and prediction. In the recent EU BonFIRE project, for example, he led a team of up to 20 software developers working on future cloud computing.

Hume has already demonstrated his potential in joint pilot projects with SynthSys and he is now poised to make a long-term contribution. The Fellowship is essential to realise that interdisciplinary conversion, allowing him to train in systems biology (including an internship at the EBI), to engage with the broader systems biology modelling community, and to address demonstration projects in three areas. Our multiscale models explicitly aim to link researchers from different scientific backgrounds, who must work together to provide the data, the modelling components and validation experiments. The Fellowship will provide time for this crucial networking, in the UK and USA, and with our partners at Rothamsted Research and Simulistics Ltd.

This Fellowship will allow SynthSys and EPCC jointly to link our work to international leaders in both computational resources and in biological modelling, creating a commanding position for the U.K.'s research and Alastair Hume in particular, at the nexus of three fast-moving areas: plant systems biology, cyber-infrastructure and multi-scale crop modelling.

Technical Summary

This proposal will engage and adapt the skills of a software architect with 20 years' experience, Alastair Hume, to the modelling of complex biological systems. Multi-scale, whole-organism models have transformative potential and have just become achievable in Systems Biology. SynthSys has developed the leading example for Arabidopsis. These models emphasise computational challenges, requiring expert informaticians who understand biological modelling. Hume has already contributed to several informatics projects with biological researchers, attracting him to this emerging area. His broad skills and experience could clearly meet this community's growing computational needs. However, the short-term joint projects allow no time for the immersion, training and community engagement that are required for him to lead the provision of advanced tools for biological modelling.

Therefore, firstly, Hume will develop a broader understanding of systems biology modelling, through formal training and an internship at EBI, including multi-scale, whole-plant modelling. Secondly, Hume will engage with the relevant research communities to understand the scientific questions being asked, the methods being adopted, the computational infrastructure required, and to build a collaborative network with current providers. Finally, by the development of demonstration systems Hume will contribute tools and services that will support ongoing research in multi-scale modelling, with a view to generally-applicable results fostered through interaction with crop and ecosystem modelling, via Rothamsted Research, Simulistics Ltd and Millar's international network. The overall aim is that by combining the acquired skills and knowledge with his existing skill set, Hume will play a key role between EPCC and SynthSys, realising the advanced computing infrastructure required to support "digital organism" research for many years to come.

Planned Impact

The Fellowship will develop the skills of Alastair Hume, allowing him to become a leading contributor to the emerging generation of complex biological models, and in particular their informatics infrastructure. This research is at the "bleeding edge" of current capabilities, where new technical challenges in the research methodology require an effort comparable to the underlying biological research. This Fellowship will help to bring this area to the leading edge, where it can inform progress broadly across biology and biotechnology. Thus the prime beneficiaries will be the resource providers who can re-use our results and the modellers whose work is facilitated, and these are inevitably in the academic communities.

In addition to the academic work, contract modelling offers commercial income, both in SynthSys and to our partners in Simulistics Ltd, with corresponding benefits to our clients in biotechnology and ecosystem management, respectively. Millar's work with Mendel Biotechnology Inc was published (Pokhilko et al., 2011) and contributed to his current models, for example. The new tools will allow us to model more complex systems, more efficiently, adding realisable commercial value to the dual-expertise of our staff and for client firms. Incorporating these features into future versions of Simulistics' Simile software will likewise increase its value.

The broadest benefits are expected from the applications of future, improved models both to the understanding of natural systems and their engineering in synthetic biology. Model simulations might become key educational tools, facilitating interactive training by collating the knowledge and understanding across multiple biological specialisms, without students needing to understand how the models were generated or simulated. This would be particularly easy to implement for focussed areas in a commercial setting, for example specialised training by a biofuel company on just the biological systems of interest for their production system. More broadly, 'whole-cell' models are widely expected to allow a new generation of microbial re-engineering, for bioremediation, biofuel or fine chemical production, and future applications in stem cell research. In this case, the model is crucial to allow a rational 'design' stage of the Synthetic Biology production cycle. SynthSys, the Edinburgh Genome Foundry and their commercial partners are well placed to contribute, for example through our proposed BBSRC Synthetic Biology Research Centre.

In the context of ag-biotech, there is growing interest in precision agriculture (albeit from a small base of current users), especially following recent work in 'prescriptive planting' (noted even in The Economist, 2014), including acquisitions by Monsanto and other major players. The heart of these precision systems is a model that integrates soil, yield, and agronomic data into crop growth predictions. The models are proprietary, and presumably based on statistical descriptions. Our work has the potential, in future, to make a mechanistic link from the genome sequences of crop varieties to the field traits relevant in these models. This could create a new commercial niche in technology provision, helping to enhance UK economic competitiveness.
 
Description Alastair Hume's Fellowship has transformed his understanding of biology, especially molecular and cellular biology, through taking the proposed MSc modules and curating mathematical models into the SBML format.

His experience of advanced computation brought several new perspectives to our work. Most tellingly, one pilot project proposed was to extend a community-standard modelling format, SBML, to make it compatible with a model that it cannot currently encode, named the Arabidopsis Framework Model. This required SBML to
1) Add external data
2) Create dynamic sub-models
3) Execute functions over collections of sub-models
He concluded that this is technically possible, as we envisioned in the application. For the first item, he wrote a software tool to add external data into SBML models, which is now freely available on GitHub (https://github.com/allyhume/SBMLDataTools).
These tools have now been re-used by other researchers in SynthSys working in a completely different field, the real-time manipulation of single yeast cells in microfluidics.
However, he also concluded that the proposed approach is premature, for social reasons: in brief, standards are less relevant at the leading edge of a field. The SBML community does not extend the standard unless the features in question are already supported in at least two modelling software packages, and there is concrete demand for exchange of relevant models between the packages. In our case, the Framework Model is a new type of model that very few labs use, so neither the tools nor the demand exist.

This contribution is now published in Millar et al., J. Exp. Bot. (2019).
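
To illustrate the first requirement above (adding external data to a model), the following is a minimal sketch of one way a sampled time series could be turned into a formula-style input, assuming the series is encoded as a piecewise-linear expression of simulation time. It is not the SBMLDataTools implementation: the function names, the `piecewise(...)`/`lt(...)` infix notation and the example temperature values are all hypothetical, chosen only to show the idea.

```python
from bisect import bisect_right


def interpolate(times, values, t):
    """Linearly interpolate the series at time t, clamping outside the data range."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t) - 1  # index of the sample at or before t
    t0, t1 = times[i], times[i + 1]
    v0, v1 = values[i], values[i + 1]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def piecewise_formula(times, values, symbol="time"):
    """Build an infix piecewise-linear expression covering the whole series.

    The output format (value, condition, ..., otherwise) is an assumed,
    illustrative notation, not a guaranteed match for any particular tool.
    """
    # Before the first sample, hold the first value.
    pieces = [f"{values[0]!r}, lt({symbol}, {times[0]!r})"]
    samples = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        slope = (v1 - v0) / (t1 - t0)
        pieces.append(
            f"{v0!r} + {slope!r} * ({symbol} - {t0!r}), lt({symbol}, {t1!r})"
        )
    # After the last sample, hold the last value.
    pieces.append(repr(values[-1]))
    return "piecewise(" + ", ".join(pieces) + ")"


# Hypothetical weather input: temperature (degrees C) sampled every 6 hours.
hours = [0.0, 6.0, 12.0]
temperature_c = [10.0, 14.0, 20.0]

print(piecewise_formula(hours, temperature_c))
print(interpolate(hours, temperature_c, 3.0))
```

A single expression of this kind lets a simulator read the external series at any timestep without changing the model format itself, which is the sense in which the first item was found to be technically possible.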

These findings were also applied in the processing of meteorological data as input to a derivative of the Framework Model, which allowed us to simulate Arabidopsis growth in various locations around Europe over several decades. These results indicated how well adapted the simulated Arabidopsis varieties were to the conditions at different locations. This contribution is also now published in Zardilis et al., J. Exp. Bot. (2019).

Mr. Hume's other contributions in this context included the SBML curation of published models from other labs, implementation of another modelling package to help others in the lab (SloppyCell), encoding a model to extend the Framework Model (now pursued by other researchers in the Millar group) and coding an online simulation interface for the Framework Model v2 (see separate entry).

The second pilot project emerged from Mr. Hume's discussion with other researchers in the Systems Biology centre, SynthSys. With Dr. Filippo Menolascina, he has performed in-silico investigations to assist in the design of a closed-loop approach to identify optimal experimental manipulations, in synthetic biology studies aiming to efficiently characterise promoters. These computational investigations gave us a much better understanding of the best configuration for forthcoming in-vitro experimentation, where real-time data processing will be used to control the experimental equipment.
Exploitation Route Extended simulation of the Framework Model has illustrated how a model with molecular mechanisms could link to ecological data, going beyond our initial goals. That vision is now key to a community initiative, Crops in silico, described in several recent publications including Millar et al. 2019.

Promoter characterisation is a widespread requirement in Synthetic Biology, and this work is relevant to our current commercial collaborators, SynPromics.
Sectors Agriculture, Food and Drink; Environment; Pharmaceuticals and Medical Biotechnology

 
Description This discovery science furthered the training of the senior software developer Alastair Hume, bringing him new understanding in molecular and cell biology, biological modelling, and in the software and modelling practices of systems biology. These biological skills will allow his institution, EPCC, to undertake further projects in the rapidly-expanding fields of systems and synthetic biology. This was most immediately expressed in his follow-on EPSRC project collaborating with the University of Exeter, on solving the parameter optimisation problem in systems biology modelling. Training highly-skilled personnel is a key impact of our inter-disciplinary research.
Sector Agriculture, Food and Drink; Digital/Communication/Information Technologies (including Software)
 
Description Millar and Halliday labs collaboration with Takato Imaizumi 
Organisation University of Washington
Country United States 
Sector Academic/University 
PI Contribution Models, modelling methods, research questions
Collaborator Contribution Data, experimental protocols, research questions
Impact Publications: Song, Smith et al., Science 2012; Seaton et al., Mol Syst Biol 2015. Grant applications: BBSRC-NSF 2014 and 2015, not funded; BBSRC ISIS travel award 2015, funded. Interdisciplinary mix of experimental and modelling methods.
Start Year 2011
 
Description Millar lab collaboration with BioModels team at EBI 
Organisation EMBL European Bioinformatics Institute (EMBL - EBI)
Country United Kingdom 
Sector Academic/University 
PI Contribution Curation of mathematical models in SBML for the BioModels database, by Ally Hume, FLIP Fellow
Collaborator Contribution Training in curation of mathematical models in SBML for the BioModels database. Understanding of social norms and practices in the SBML community.
Impact Curated models. Ally Hume's training in Systems Biology. Multi-disciplinary between Biology and Computing Science.
Start Year 2015
 
Title Online simulator for the Arabidopsis Framework Model v2 
Description Online simulator for the Arabidopsis Framework Model v2, as described in Chew et al. bioRxiv 2017, http://doi.org/10.1101/105437 
Type Of Technology Webtool/Application 
Year Produced 2016 
Impact Supported presentations at multiple, international conferences. 
URL http://turnip.bio.ed.ac.uk/fm/
 
Title SBMLDataTools 
Description The software extends a community-standard modelling format, SBML, to provide a time series of input data that can be used during simulation of the model. This was required to extend SBML for a model that it could not currently encode, named the Arabidopsis Framework Model, which requires input of updated weather data at each simulation timestep. Ally Hume (EPCC, University of Edinburgh) wrote this prototype software tool to add external data into SBML models, which is now freely available on GitHub (https://github.com/allyhume/SBMLDataTools). The tools have been re-used by other researchers in SynthSys working in a completely different field, the real-time manipulation of single yeast cells in microfluidics.
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact Re-use for further research. The tools are broadly applicable to SBML representation of models in many research fields.