The Modelling Apprentice: A tool to aid the formation of cell signalling models

Lead Research Organisation: Aberystwyth University
Department Name: Computer Science

Abstract

The impact of computer science technology in microbiology has led to the creation of online databases that now contain complete genome sequences for several hundred organisms, as well as detailed information for a wide variety of cell processes. Computers can also act as simulators to model the dynamic behaviour of these processes and the interactions between them. Simulation can provide guidance to scientists in the selection of useful experiments and can also provide predictions where experimentation is costly and difficult to perform. Systems biology is a rapidly advancing science that aims to capture knowledge of these processes and interactions, and the creation of simulation models is a central activity. A medium-term goal is the construction of a model of the whole cell, where the interactions of systems that are normally studied separately can be analysed. Computational Scientific Discovery is another emerging discipline where techniques from Artificial Intelligence (AI) are used to automate or greatly ease the difficult process of translating experimental results and data into scientific knowledge. This is especially important as the quantity of data far exceeds the capacity of unaided human interpretation. In systems biology, scientific discovery often involves the construction and validation of computer models that provide explanations of experimental results. It is important that the resulting model accurately explains the results and is also biologically valid, i.e. the knowledge makes sense to a human expert. Machine Learning, a branch of AI, has seen the development of computer programs that can generate explanations from data. The last decade or more has seen increasing use of machine learning techniques for the acquisition of biological knowledge.
However, a major drawback, preventing even wider acceptance of computational scientific discovery by the more general biology community, is the learning curve necessary for efficient use of the techniques and technology. Many systems biology scientists find it necessary to become experts in the mathematics of machine learning and model simulation as well as being experts in cell biology. The Modelling Apprentice seeks to overcome these obstacles by providing an easy-to-use, understandable tool to aid the construction, validation and improvement of biological models by removing the need for the scientist to understand or even interact with the underlying mathematical knowledge representation and machine learning. This is achieved by: 1) an intuitive graphical user interface where molecular and chemical interactions are displayed explicitly, and 2) separation of the scientific knowledge from the machine learning techniques that reason with the knowledge. The second of these also allows the Modelling Apprentice to be easily adapted to investigate other scientific applications by constructing a library that acts as a plug-in. The Modelling Apprentice will seek to improve the newly developed program Justaid, which already incorporates these features. As a test case, a model of the MAPK cell signalling network of yeast will be built using knowledge from expert biologists in Cambridge and Aberdeen. Cell signalling is the process by which cells respond to external and environmental stimuli, and the study of these networks is crucial to the understanding of human diseases such as cancer, diabetes, and immune and degenerative disorders. Modelling of cell signalling has also not progressed as fast as the modelling of other biological processes such as metabolism. Suitability of the Modelling Apprentice and the new MAPK model library will then be assessed by expert biologists who will use it to evaluate their latest experimental results.
Insights gained from this testing will be used to further improve the Modelling Apprentice.

Technical Summary

This proposal aims to exploit and enhance the newly developed software system 'Justaid' to create the 'Modelling Apprentice', a tool to assist biological researchers interested in developing accurate and understandable mathematical models of cell signalling networks. Justaid is a general-purpose programming assistant for scientific discovery that uses techniques from Qualitative Reasoning (QR), Machine Learning and User Interface design to simplify the task of model construction, testing and updating. QR is chosen as the knowledge representation because domain knowledge can be acquired easily and quickly from experimental observations that lack precision. Machine Learning is used to determine whether a given model can fully explain a set of experimental observations, and to suggest model updates that extend the explanatory power of the model to hitherto unexplained observations. Justaid has been designed to remove the need for the scientist to understand the underlying knowledge representation and machine learning, so that he/she can concentrate on the scientific domain rather than on mathematical concepts. This is achieved by an intuitive graphical user interface and a modular architecture whereby construction of a model for a new scientific domain simply involves creating a new Justaid library. The first task involves the construction of a Justaid library for cell signalling, using the yeast MAPK signalling pathway as a test case. This will involve expert biologists from the Centre for Systems Biology, University of Cambridge and the SABR centre at the University of Aberdeen. The Modelling Apprentice and the new MAPK model will then be used by the biologists, in conjunction with the results of wet lab experiments, to test the utility and appropriateness of Justaid as a scientific discovery tool. The insight from these tests will be used to further improve the user interface design and model updating techniques of the Modelling Apprentice.
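
The idea of testing a qualitative model against imprecise observations, and flagging what it cannot explain, can be illustrated with a minimal sketch. This is not Justaid's knowledge representation or algorithm, which this summary does not specify. It assumes a much simpler stand-in: a signed influence graph in which each edge either activates (+1) or inhibits (-1) its target, and observations that record only the direction of change. The function names `predict` and `unexplained`, the edge list, and the readout `GeneX` are all hypothetical and chosen for illustration only.

```python
from collections import deque

# Illustrative signed influence graph: (source, target, sign).
# sign is +1 for "activates" and -1 for "inhibits".
# The cascade is a simplified sketch, not a validated MAPK model.
EDGES = [
    ("pheromone", "Ste11", +1),
    ("Ste11", "Ste7", +1),
    ("Ste7", "Fus3", +1),
]


def predict(edges, source, direction):
    """Propagate a qualitative change (+1 or -1) from `source`.

    Returns a dict mapping each reachable node to the set of
    directions it could change in. A set with both +1 and -1
    means the model's prediction for that node is ambiguous.
    """
    reach = {source: {direction}}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for src, dst, sign in edges:
            if src != node:
                continue
            new = {s * sign for s in reach[node]}
            known = reach.setdefault(dst, set())
            if not new <= known:
                # Each set can grow at most twice, so this terminates
                # even if the graph contains feedback loops.
                known |= new
                queue.append(dst)
    return reach


def unexplained(edges, source, direction, observations):
    """Return the observed nodes whose direction the model cannot produce."""
    reach = predict(edges, source, direction)
    failures = []
    for node, observed in observations.items():
        # Nodes the perturbation cannot reach are predicted unchanged (0).
        possible = reach.get(node, {0})
        if observed not in possible:
            failures.append(node)
    return failures


# Hypothetical qualitative observations after adding pheromone:
# both readouts were seen to increase.
OBSERVATIONS = {"Fus3": +1, "GeneX": +1}

if __name__ == "__main__":
    print(unexplained(EDGES, "pheromone", +1, OBSERVATIONS))
    # A candidate model update: add an edge that accounts for GeneX.
    updated = EDGES + [("Fus3", "GeneX", +1)]
    print(unexplained(updated, "pheromone", +1, OBSERVATIONS))
```

In this sketch the original graph leaves `GeneX` unexplained, and adding the candidate edge removes the discrepancy. A learning system of the kind described above would search over such candidate updates automatically rather than have them supplied by hand.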

Publications

Sparkes A (2010) An Integrated Laboratory Robotic System for Autonomous Discovery of Gene Function in Journal of the Association for Laboratory Automation

 
Description We developed a novel way to identify/learn systems biology models.
Exploitation Route Learning systems biology models is a key part of modern biology.
Sectors Digital/Communication/Information Technologies (including Software), Healthcare, Pharmaceuticals and Medical Biotechnology

 
Description I believe that it has been used by Tata Consultancy.
First Year Of Impact 2012
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic