Software infrastructure to support the standard of model curation and annotation MIRIAM

Lead Research Organisation: European Bioinformatics Institute

Department Name: Computational Neurobiology Group

Abstract

Systems Biology has recently emerged as one of the main fields in the life sciences. A crucial component of this discipline is the development of quantitative models to simulate biological processes. Scientists need to store and exchange those models to build on each other's expertise and output. A prerequisite is the standardisation of model syntax and content. In collaboration with the community of modellers and developers, we developed a standard, MIRIAM, that specifies the mandatory information authors have to add to a quantitative model in order to facilitate its understanding by other scientists. MIRIAM is made up of two parts: 1) a set of rules governing the correspondence between the model and its description in the scientific literature; 2) a set of annotations describing the generation of the model and documenting each of its elements. For MIRIAM to be effectively adopted, we need to provide the computing resources necessary for its implementation. We will build a database of officially endorsed data resources. This database will provide, for each data resource, all the details necessary for the implementation of the standard. We will develop a simple interface to facilitate automatic access to the database. Software libraries will be provided to implement access to the interface and to facilitate the development of third-party software. In collaboration with the California Institute of Technology and the Bioengineering Institute of Auckland, we will work to expand support for the MIRIAM standard in the modelling languages SBML and CellML. This will allow a significant fraction of modelling tools to produce MIRIAM-compliant models. The endorsement of public standards by scientists, developers and publishers will be a crucial step toward the effective storage, exchange and reuse of quantitative models in biology, one of the cornerstones of Systems Biology.
The wide acceptance of MIRIAM will only be achieved through the availability of easy-to-use supporting software and resources such as those presented in this project.

Technical Summary

To improve the exchange and interpretation of quantitative models, we developed the Minimal Information Requested In the Annotation of Models (MIRIAM), a standard that summarises the mandatory information authors have to add to a model. For MIRIAM to be effectively adopted, we need to provide the computing resources necessary for its implementation. 1) We will build a relational database (using MySQL) of officially endorsed data resources, with all the information needed to access them and retrieve data from them. This database will provide, for each agreed-upon data resource, all the details necessary for the implementation of MIRIAM: standard URIs, corresponding physical URLs, resource names, a description of the syntax used to build identifiers and hyperlinks, etc. 2) To facilitate programmatic access to the database, we will develop a simple Application Programming Interface. This API will be available through Web Services. The server side will be developed using Java Servlets and the Axis engine of the Apache project. We will provide libraries implementing this access to facilitate the development of MIRIAM clients. 3) In collaboration with the SBML team at the California Institute of Technology and the CellML team at the Bioengineering Institute of Auckland, we will work to expand the support of MIRIAM annotation in SBML and CellML. In particular, we will extend the software library libSBML to support MIRIAM annotation. This will allow all software using libSBML (a significant fraction of SBML-supporting tools) to produce MIRIAM-compliant models.
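The kind of record described in point 1, and the lookup an API client would perform in point 2, can be sketched as follows. This is only an illustration under stated assumptions: the field names, the `urn:example:exampledb` root, the `example.org` URL and the identifier pattern are all hypothetical and do not reproduce the actual database schema or any endorsed resource.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DataResource:
    """One registry entry. Field names are illustrative, not the real schema."""
    name: str          # human-readable resource name
    uri: str           # standard, location-independent URI root
    url_template: str  # physical URL; "{id}" marks where the identifier goes
    id_pattern: str    # regular expression describing valid identifiers


# Hypothetical in-memory stand-in for the relational database, keyed by URI root.
REGISTRY = {
    "urn:example:exampledb": DataResource(
        name="Example Database",
        uri="urn:example:exampledb",
        url_template="https://example.org/entry/{id}",
        id_pattern=r"[A-Z]\d{5}",
    ),
}


def resolve(annotation_uri: str) -> str:
    """Map '<uri root>:<identifier>' to a physical URL, validating the identifier.

    Assumes the identifier itself contains no colon; a real implementation
    would need a more careful split.
    """
    root, _, identifier = annotation_uri.rpartition(":")
    resource = REGISTRY.get(root)
    if resource is None:
        raise KeyError(f"unknown data resource: {root!r}")
    if not re.fullmatch(resource.id_pattern, identifier):
        raise ValueError(f"{identifier!r} is not a valid {resource.name} identifier")
    return resource.url_template.format(id=identifier)
```

Calling `resolve("urn:example:exampledb:P12345")` returns the physical URL built from the template, while an unknown root or a malformed identifier raises an error. Keeping the stable URI separate from the physical URL means a resource can move without invalidating annotations already stored in models.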

Funded Value:

£82,764

Funded Period:

Sep 06 - Sep 07

Funder:

BBSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

BB/E006248/1

Principal Investigator:

Nicolas Le Novere

Research Topic:

Unclassified

Organisations

European Bioinformatics Institute (Lead Research Organisation)

People	ORCID iD
Nicolas Le Novere (Principal Investigator)

Publications

Henkel R (2010) Ranked retrieval of Computational Biology models. in BMC bioinformatics

Herrgård MJ (2008) A consensus yeast metabolic network reconstruction obtained from a community approach to systems biology. in Nature biotechnology

Jupp S (2014) The EBI RDF platform: linked open data for the life sciences. in Bioinformatics (Oxford, England)

Juty N (2013) Towards the Collaborative Curation of the Registry underlying identifiers.org in Database

Juty N (2012) Identifiers.org and MIRIAM Registry: community resources to provide persistent identification. in Nucleic acids research

Laibe C (2007) MIRIAM Resources: tools to generate and resolve robust cross-references in Systems Biology. in BMC systems biology

Li C (2010) BioModels.net Web Services, a free and integrated toolkit for computational modelling software. in Briefings in bioinformatics

McMurry JA (2017) Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data. in PLoS biology

Schulz M (2011) Retrieval, alignment, and clustering of computational models based on semantic annotations. in Molecular systems biology

Stanford NJ (2015) The evolution of standards and data management practices in systems biology. in Molecular systems biology

Key Findings
Impact Summary
Research Databases and Models
Research Tools and Methods


Description	The Registry provides the information necessary for the generation and resolution of unique and perennial identifiers for life science data. These identifiers take the form of URIs and make use of Identifiers.org to provide direct access to the identified data records on the Web. The infrastructure developed with this funding is now an important building block of the ELIXIR infrastructure.
Exploitation Route	The MIRIAM Registry is completely open, and can be accessed via web browsing, programmatically through web services, or downloaded locally.
Sectors	Agriculture, Food and Drink; Healthcare; Pharmaceuticals and Medical Biotechnology
URL	http://identifiers.org
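As a rough illustration of the URI form mentioned above, the sketch below composes and decomposes identifiers rooted at Identifiers.org. The helper names `make_uri` and `split_uri` are hypothetical, and the namespace and accession values used with them are illustrative only; they are not checked against the Registry.

```python
from typing import Tuple
from urllib.parse import quote, unquote, urlsplit

RESOLVER_HOST = "identifiers.org"  # host given in the URL field of this record


def make_uri(namespace: str, accession: str) -> str:
    """Compose a URI of the form http://identifiers.org/<namespace>/<accession>."""
    return (
        f"http://{RESOLVER_HOST}/"
        f"{quote(namespace, safe='')}/{quote(accession, safe=':')}"
    )


def split_uri(uri: str) -> Tuple[str, str]:
    """Split such a URI back into (namespace, accession)."""
    parts = urlsplit(uri)
    if parts.hostname != RESOLVER_HOST:
        raise ValueError(f"not an {RESOLVER_HOST} URI: {uri!r}")
    namespace, _, accession = parts.path.lstrip("/").partition("/")
    if not namespace or not accession:
        raise ValueError(f"missing namespace or accession: {uri!r}")
    return unquote(namespace), unquote(accession)
```

For example, `make_uri("uniprot", "P12345")` yields `http://identifiers.org/uniprot/P12345`, and `split_uri` recovers the two parts from it. Whether a given namespace actually resolves depends on the Registry's content, which this sketch does not consult.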


Description
* Consistent annotations of all computational models encoded in SBML, CellML, SED-ML, SBGN-ML
* Methods to compare, cluster and retrieve computational models based on cross-references
* Use of consistent URIs through many semantic-web resources in life sciences
* Consistent cross references in the EMBL-EBI RDF pilot project
First Year Of Impact	2006
Sector	Agriculture, Food and Drink; Healthcare; Pharmaceuticals and Medical Biotechnology
Impact Types	Economic


Title	Identifiers.org
Description	Identifiers.org is a system providing resolvable persistent URIs used to identify data for the scientific community, with a current focus on the Life Sciences domain. The provision of resolvable identifiers (URLs) fits well with the Semantic Web vision and the Linked Data initiative.
Type Of Material	Improvements to research infrastructure
Year Produced	2011
Provided To Others?	Yes
Impact
* Consistent annotations of all computational models encoded in SBML, CellML, SED-ML, SBGN-ML
* Methods to compare, cluster and retrieve computational models based on cross-references
* Use of consistent URIs through many semantic-web resources in life sciences
* Consistent cross references in the EMBL-EBI RDF pilot project
URL	http://identifiers.org


Title	MIRIAM Registry
Description	The MIRIAM Registry provides a set of online services for the generation of unique and perennial identifiers, in the form of URIs. It provides the core data which is used by Identifiers.org. The core of the Registry is a catalogue of data collections (corresponding to controlled vocabularies, databases, ...), their URIs and the corresponding physical URLs (or resources). These resources are monitored daily to ensure data accessibility and the validity of the resolution mechanism.
Type Of Material	Data handling & control
Year Produced	2006
Provided To Others?	Yes
Impact
* Consistent annotations of all computational models encoded in SBML, CellML, SED-ML, SBGN-ML
* Methods to compare, cluster and retrieve computational models based on cross-references
* Use of consistent URIs through many semantic-web resources in life sciences
* Consistent cross references in the EMBL-EBI RDF pilot project
URL	http://www.ebi.ac.uk/miriam
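The daily monitoring of resources mentioned in the description could, in its simplest form, look something like the sketch below. This is not the Registry's actual monitoring code: the function names, the use of HTTP HEAD requests and the rule that any status from 200 to 399 counts as accessible are all assumptions made for illustration.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url, or 0 if no response was obtained."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except OSError:
        return 0  # DNS failure, refused connection, timeout, ...


def check_resources(
    probe_urls: Dict[str, str],
    fetch: Callable[[str], int] = http_status,
) -> Dict[str, bool]:
    """Map each resource name to True if its probe URL looks accessible."""
    return {name: 200 <= fetch(url) < 400 for name, url in probe_urls.items()}
```

The `fetch` parameter is injectable so the check can be exercised without network access, for instance `check_resources({"a": "https://example.org/entry/X"}, fetch=lambda url: 200)`. A production monitor would probably also verify that the returned page really describes the requested record, which a status code alone cannot show.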
