Software infrastructure to support the standard of model curation and annotation MIRIAM

Lead Research Organisation: European Bioinformatics Institute
Department Name: Computational Neurobiology Group

Abstract

Systems Biology has recently emerged as one of the main fields in the life sciences. A crucial component of this discipline is the development of quantitative models to simulate biological processes. Scientists need to store and exchange those models to build on each other's expertise and output. A prerequisite is the standardisation of the models' syntax and content. In collaboration with the community of modellers and developers, we developed a standard, MIRIAM, that specifies the mandatory information authors have to add to a quantitative model in order to facilitate its understanding by other scientists. MIRIAM is made up of two parts: 1) a set of rules controlling the correspondence between the model and its description in the scientific literature; 2) a set of annotations describing the generation of the model and documenting each of its elements. For MIRIAM to be effectively adopted, we need to provide the computing resources necessary for its implementation. We will build a database of officially endorsed data resources. This database will provide, for each data resource, all the details necessary for the implementation of the standard. We will develop a simple interface to facilitate automatic access to the database. Software libraries will be provided to implement access to the interface and to facilitate the development of third-party software. In collaboration with the California Institute of Technology and the Bioengineering Institute of Auckland, we will work to expand the support of the MIRIAM standard by the modelling languages SBML and CellML. This will allow a significant fraction of modelling tools to produce MIRIAM-compliant models. The endorsement of public standards by scientists, developers and publishers will be a crucial step toward the effective storage, exchange and reuse of quantitative models in biology, one of the cornerstones of Systems Biology.
The wide acceptance of MIRIAM will only be achieved through the availability of easy-to-use supporting software and resources such as those presented in this project.

Technical Summary

To improve the exchange and interpretation of quantitative models, we developed the Minimal Information Requested In the Annotation of Models (MIRIAM), a standard that summarises the mandatory information authors have to add to a model. For MIRIAM to be effectively adopted, we need to provide the computing resources necessary for its implementation. 1) We will build a relational database (using MySQL) of officially endorsed data resources, with all the information needed to access them and retrieve data from them. This database will provide, for each agreed-upon data resource, all the details necessary for the implementation of MIRIAM: standard URIs, corresponding physical URLs, names of the resource, a description of the syntax used to build identifiers and hyperlinks, etc. 2) In order to facilitate programmatic access to the database, we will develop a simple Application Programming Interface (API). This API will be available through Web Services. The server side will be developed using Java Servlets and the Axis engine of the Apache project. We will provide libraries implementing this access to facilitate the development of MIRIAM clients. 3) In collaboration with the SBML team at the California Institute of Technology and the CellML team at the Bioengineering Institute of Auckland, we will work to expand the support of MIRIAM annotation in SBML and CellML. In particular, we will expand the software library libSBML to support MIRIAM annotation. This will allow all software using libSBML (a significant fraction of SBML-supporting tools) to produce MIRIAM-compliant models.
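As an illustration of point 1), the following minimal Python sketch shows how a record holding a standard URI, an identifier syntax and a physical URL could be used to turn an annotation into a hyperlink. It is not the project's implementation (the summary names MySQL and Java for that). The class `DataCollection`, its field names, the `$id` placeholder and the example record are all hypothetical and chosen only for this sketch.

```python
import re
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DataCollection:
    """One record of the database; the field names are hypothetical."""
    name: str          # name of the data resource
    uri: str           # standard, location-independent URI of the resource
    id_pattern: str    # regular expression describing valid identifiers
    url_template: str  # physical URL, with "$id" marking the identifier slot


# Illustrative record only: these values are placeholders, not real Registry content.
EXAMPLE = DataCollection(
    name="Example Collection",
    uri="urn:miriam:example",
    id_pattern=r"[A-Z]\d{5}",
    url_template="http://www.example.org/entry/$id",
)


def resolve(annotation_uri: str, collections: Iterable[DataCollection]) -> str:
    """Map a standard URI such as 'urn:miriam:example:P12345' to a physical URL."""
    for collection in collections:
        prefix = collection.uri + ":"
        if annotation_uri.startswith(prefix):
            identifier = annotation_uri[len(prefix):]
            # Reject identifiers that do not follow the resource's declared syntax.
            if not re.fullmatch(collection.id_pattern, identifier):
                raise ValueError(
                    f"{identifier!r} is not a valid {collection.name} identifier"
                )
            return collection.url_template.replace("$id", identifier)
    raise LookupError(f"no data resource registered for {annotation_uri!r}")


if __name__ == "__main__":
    print(resolve("urn:miriam:example:P12345", [EXAMPLE]))
```

Keeping the identifier syntax alongside the URL template means that a malformed annotation can be rejected before any hyperlink is generated.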

Publications

 
Description The Registry provides the information necessary for the generation and resolution of unique and perennial identifiers for life science data. Those identifiers take the form of URIs and make use of Identifiers.org to provide direct access to the identified data records on the Web. The infrastructure developed with this funding is now an important building block of the ELIXIR infrastructure.
Exploitation Route The MIRIAM Registry is completely open and can be accessed via web browsing, accessed programmatically through web services, or downloaded for local use.
Sectors Agriculture, Food and Drink, Healthcare, Pharmaceuticals and Medical Biotechnology

URL http://identifiers.org
 
Description
* Consistent annotations of all computational models encoded in SBML, CellML, SED-ML, SBGN-ML
* Methods to compare, cluster and retrieve computational models based on cross-references
* Use of consistent URIs through many semantic-web resources in life sciences
* Consistent cross references in the EMBL-EBI RDF pilot project
First Year Of Impact 2006
Sector Agriculture, Food and Drink, Healthcare, Pharmaceuticals and Medical Biotechnology
Impact Types Economic

 
Title Identifiers.org 
Description Identifiers.org is a system providing resolvable, persistent URIs used to identify data for the scientific community, with a current focus on the Life Sciences domain. The provision of resolvable identifiers (URLs) fits well with the Semantic Web vision and the Linked Data initiative.
Type Of Material Improvements to research infrastructure 
Year Produced 2011 
Provided To Others? Yes  
Impact
* Consistent annotations of all computational models encoded in SBML, CellML, SED-ML, SBGN-ML
* Methods to compare, cluster and retrieve computational models based on cross-references
* Use of consistent URIs through many semantic-web resources in life sciences
* Consistent cross references in the EMBL-EBI RDF pilot project
URL http://identifiers.org
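
The relation between a location-independent identifier and its resolvable counterpart can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the service's own code: it assumes MIRIAM URNs of the form urn:miriam:<namespace>:<identifier> with a percent-encoded identifier, and resolvable URLs of the form http://identifiers.org/<namespace>/<identifier>. The function name `urn_to_url` is hypothetical.

```python
from urllib.parse import unquote


def urn_to_url(urn: str, base: str = "http://identifiers.org") -> str:
    """Rewrite 'urn:miriam:<namespace>:<identifier>' as a resolvable URL.

    The URN layout and the URL layout are assumptions made for this sketch.
    """
    # Split into at most four parts so colons inside the identifier survive.
    parts = urn.split(":", 3)
    if len(parts) != 4 or parts[:2] != ["urn", "miriam"] or not all(parts[2:]):
        raise ValueError(f"not a MIRIAM URN: {urn!r}")
    namespace, identifier = parts[2], unquote(parts[3])
    return f"{base}/{namespace}/{identifier}"


if __name__ == "__main__":
    print(urn_to_url("urn:miriam:obo.go:GO%3A0006915"))
```

The identifier is percent-decoded because a URN cannot carry a bare colon inside the identifier part, whereas the URL form can.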
 
Title MIRIAM Registry 
Description The MIRIAM Registry provides a set of online services for the generation of unique and perennial identifiers, in the form of URIs. It provides the core data used by Identifiers.org. The core of the Registry is a catalogue of data collections (corresponding to controlled vocabularies, databases, ...), their URIs and the corresponding physical URLs (or resources). These resources are monitored daily to ensure that the data remain accessible and that the resolution mechanism remains valid.
Type Of Material Data handling & control 
Year Produced 2006 
Provided To Others? Yes  
Impact
* Consistent annotations of all computational models encoded in SBML, CellML, SED-ML, SBGN-ML
* Methods to compare, cluster and retrieve computational models based on cross-references
* Use of consistent URIs through many semantic-web resources in life sciences
* Consistent cross references in the EMBL-EBI RDF pilot project
URL http://www.ebi.ac.uk/miriam