Building a global metagenomics portal ('MGportal') to handle next-generation sequencing data and associated metadata

Lead Research Organisation: University of Oxford
Department Name: Oxford e-Research Centre

Abstract

While genomes represent the full genetic (DNA) complement of a single organism, metagenomes represent the DNA of an entire community of organisms. These organisms might be free-living in the environment, or be found on the skin or in the gut of a human being or other species. Microbial organisms play a major role in our everyday health and well-being, which is not surprising when you consider that the number of microbial cells in or on an average human body actually exceeds the number of human cells! Microbes play a similarly important role in the environment; different types of organisms live under different conditions (including extreme habitats, such as the run-off from acid mines or the depths of the oceans). Understanding how these organisms have adapted to their various living conditions will lead to a better understanding of how changes in the environment will affect biodiversity in the future. It may also lead to the discovery of entirely new species or novel proteins which could have utility as antibiotics or other drugs. Combined with other types of 'omic data, metagenomes hold the promise of unparalleled insights into fundamental questions across a range of fields including evolution, ecology, environmental biology, health and medicine. To fully exploit the promise of these data we need both scientific innovation and community agreement on how to provide appropriate stewardship of these resources for the benefit of all. Significant numbers of metagenomics projects have been awarded grants by international funding bodies. Whilst all of these projects have specific, scientifically interesting aims, they mostly exist in isolation, with little or no cross-referencing to other metagenomic or genomic datasets. Our intention is to leverage existing infrastructure to deliver a world-class metagenomics resource with unique utility for UK-based metagenomics researchers. 
This resource, MGportal, will utilise user-friendly interfaces, state-of-the-art algorithms and the EBI's unique position as a hub of biological information to measurably enhance the value of these researchers' data. It will be built in close collaboration with the Genomic Standards Consortium (GSC). MGportal will consist of software tools to enable metagenomics researchers to upload their data to the raw nucleotide sequence archives, data analysis pipelines to predict what potential genes are present in the data and what their function is, plus a web interface which will display these data and results in a way that is easy to browse and query. We will hold training courses and a workshop to gain input from the scientific community about the portal. It is hoped that MGportal will eventually allow researchers to understand the results of their metagenomics experiments, as well as seeing how those results compare with the outcomes of other studies.

Technical Summary

While genomes represent the full genetic (DNA) complement of a single organism, metagenomes represent the DNA of an entire community of organisms. Interest in improved sampling of diverse environments (e.g. hosts/gut, plants, soil, etc.) combined with advances in the development and application of ultra-high-throughput sequencing methodologies is set to vastly accelerate the pace at which new metagenomes are generated. Combined with other types of 'omic data, metagenomes hold the promise of unparalleled insights into fundamental questions across a range of fields including evolution, ecology, environmental biology, health and medicine. To fully exploit the promise of these data we need both scientific innovation and community agreement on how to provide appropriate stewardship of these resources for the benefit of all. In this three-year collaborative project we aim to build an international data resource and portal for metagenomic data at the European Bioinformatics Institute. This portal will manage the submission, storage, dissemination and mining of metagenomic data from data providers across the world. The portal will focus on the capture of rich contextual information (metadata), working in close collaboration with the Genomic Standards Consortium (GSC), an international working body creating and implementing standards to describe genomes, metagenomes and marker gene sequences. Further, the collaborative use of the ISA Infrastructure software suite for metadata capture will enable the capture and sharing of standards-compliant data and integration with a range of other data types. The resulting MGPortal will be a major new resource at the EBI. The combined MGPortal Team will engage in a range of community-building activities, including hosting workshops and training activities that both educate data submitters and users and ensure the portal develops in line with community needs.

Planned Impact

The full impact of this work is described in the impact statement of the lead institute, the EBI. Here we elaborate on the specific impact of the work to be completed in this project under the auspices of the Genomic Standards Consortium and the ISA Infrastructure project. The primary impact of the proposed tight collaboration between these groups and the EBI is the increased level of community involvement in the creation of resources that serve community needs. This is a pioneering aspect of this proposal. Community-level consensus: This project will help to continue funding these key grass-roots activities, thus strengthening them and their ability to give a voice to the wider scientific community on issues of data stewardship, standardization and sharing. Specifically, this project will directly fund core activities with the GSC (i.e. through Peter Sterk's role as Secretary of the GSC) and, most importantly, provide funds to implement GSC-recommended standards at the international level. This is a key step on the path towards international adoption of standards that will underpin future data sharing. It will also ensure the usage of a premier example of standards-compliant tools in the creation of this portal. The ISA Infrastructure, already funded by the BBSRC in the past BBR round, is a complete suite of tools for capturing and disseminating standards-compliant metadata. Its use in this project paves the way for universal sharing of metadata about samples and data types, as this work will increase the chances that other projects will adopt this shared approach. Data Sharing. The adoption of these community-defined approaches is also in direct support of the strong BBSRC data sharing policy. Putting this standards-compliant infrastructure into place will ensure compliance with the policy of making data freely available in re-usable form. Policy makers. 
The production of more richly annotated bioinvestigations will improve the evidence base for policy makers by providing greater interpretability of experimental context, simplifying the job of data integration and study comparison. More detail for those forming policy on biological and biomedical issues should produce better decisions. Journals. The current trend shows that, like funders, journals increasingly require that, firstly, researchers make more of their data public, for example by submitting it to public repositories, and that, secondly, they begin to comply with community-defined standards. However, 'non-compliance' may be difficult to overcome: experimental metadata are still normally sparse in publications and the supplementary data that sometimes accompany them, limiting data accessibility and utility. This is because of the lack of (i) reviewer time and expertise - they are not trained to check compliance, (ii) awareness of the existence of appropriate reporting standards, (iii) access to freely available tools implementing standards, and (iv) adequate data management resources at the local and community levels. Greater automation of the reporting processes is required. The only feasible solution is better annotation and education at source (i.e., by providing data producers with a straightforward way in which to use community annotation standards), assisted by some form of automated content validation. Through this collaboration we will disseminate this best practice by building compliance with standards into the MGPortal. Outreach. The high-profile nature of this project (a major new database/portal at the EBI) will help to spread the word about the importance of standards in the community. Finally, the planned workshops and interactions with the existing GSC and ISA communities will succeed in engaging a larger proportion of bench scientists in efforts to provide the best possible stewardship of our collective data assets.

Publications

Ho Sui S (2013) The Stem Cell Commons: an exemplar for data integration in the biomedical domain driven by the ISA framework. in AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science

Brandizi M (2012) graph2tab, a library to convert experimental workflow graphs into tabular formats. in Bioinformatics (Oxford, England)

Maguire E (2013) OntoMaton: a bioportal powered ontology widget for Google Spreadsheets. in Bioinformatics (Oxford, England)

Emami Khoonsari P (2019) Interoperable and scalable data analysis with microservices: applications in metabolomics. in Bioinformatics (Oxford, England)

González-Beltrán A (2014) linkedISA: semantic representation of ISA-Tab experimental metadata. in BMC bioinformatics

Salek RM (2013) The MetaboLights repository: curation challenges in metabolomics. in Database : the journal of biological databases and curation

McQuilton P (2016) BioSharing: curated and crowd-sourced metadata standards, databases and data policies in the life sciences. in Database : the journal of biological databases and curation

Gaudet P (2011) Towards BioDBcore: a community-defined information specification for biological databases. in Database : the journal of biological databases and curation

 
Description We have contributed to the development of a public repository for metagenomics data at the EBI; specifically, we have refined a set of tools to help researchers to collect, annotate and submit their datasets to this repository.
Exploitation Route This is a public data deposition service and the tools are freely available to researchers for their continued use in managing and sharing their metagenomics datasets.
Sectors Agriculture, Food and Drink,Digital/Communication/Information Technologies (including Software),Education,Pharmaceuticals and Medical Biotechnology

URL https://www.ebi.ac.uk/metagenomics
 
Description The portal is maturing and currently serves as a key community portal for this data type. Its use will continue to increase the effectiveness of data sharing and reuse.
First Year Of Impact 2013
Sector Agriculture, Food and Drink,Digital/Communication/Information Technologies (including Software),Pharmaceuticals and Medical Biotechnology
Impact Types Cultural

 
Description COpenPlantOmics (COPO): a Collaborative Bioinformatics Plant Science Platform
Amount £1,000,000 (GBP)
Funding ID BB/L024101/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 01/2015 
End 12/2018
 
Description EC H2020 - INFRADEV-3-2015 - ELIXIR EXCELERATE
Amount € 240,000 (EUR)
Organisation European Commission 
Department Horizon 2020
Sector Public
Country European Union (EU)
Start 09/2015 
End 08/2019
 
Description ISA-InterMine: accelerating and rewarding data sharing
Amount £1,174,660 (GBP)
Funding ID 208381/A/17/Z 
Organisation Wellcome Trust 
Sector Charity/Non Profit
Country United Kingdom
Start 08/2018 
End 07/2021
 
Title BioSharing 
Description Registry of standards and databases linked to data policies by funders and journals. 
Type Of Material Improvements to research infrastructure 
Year Produced 2011 
Provided To Others? Yes  
Impact Launched in 2011, the BioSharing portal (https://biosharing.org) of interrelated standards, databases, and policies has 53,741 users and is a resource of the ELIXIR UK Node and the ELIXIR Interoperability Platform. It is endorsed by a community of 68 organizations, including publishers (it is embedded in the data policies of 600 Springer Nature journals, and also those of PLoS, EMBO Press, BMJ, F1000Research, BioMed Central, Oxford University Press and Wellcome Trust Open Research), standardization groups, and research data management support initiatives and libraries (such as those at JISC and at Stanford, Cambridge and Oxford universities). 
URL http://biosharing.org/
 
Title Continued improvements to the ISA toolkit 
Description Started in 2003 and first released in 2007, the ISA tools have been developed over time by the Oxford team and collaborators or directly contributed by partnering contributors, via the ISA Commons collaborative community. Short description of the developments and achievements of the resource over the last year: • Awarded Wellcome Trust funds (2018-2021), as a collaborative project with the University of Cambridge's InterMine team, to link the two resources and reward researchers for annotating and publishing FAIR data; also, ISA is embedded in two ELIXIR Implementation Studies, on plant-focused data validation and on metabolomics. • With the uptake of ISA-Galaxy tools (https://github.com/ISA-tools/isatools-galaxy) and integration with the Galaxy Framework, ISA has reached a major milestone by showcasing how prospective data management can be done, demonstrating a full deposition workflow to MetaboLights and creating training material (10.7490/f1000research.1115757.1). • Jupyter notebooks (https://github.com/ISA-tools/dtp-isa-exercises) have been developed as teaching material to showcase the use of the ISA-API in various contexts in undergraduate and postgraduate courses on data readiness. 
Type Of Material Improvements to research infrastructure 
Year Produced 2018 
Provided To Others? Yes  
Impact Community use and impact are tracked via the ISA Commons, which currently has over 40 international groups, projects, and organizations that use and contribute to the development of components of the ISA metadata tracking framework. Therefore, we can say that the ISA user base ranges from hundreds to thousands of researchers from increasingly diverse domains (including -omics, cell-based research, biomedical nanotechnology, plant phenotyping, toxicology, biodiversity, metagenomics, stem cell research, systems biology, neuroscience, microbial science and immunology), and goes beyond researchers, curators, other resource developers and service providers, to also include journals. For example, ISA is used by Oxford University Press's GigaScience and underpins Springer Nature's Scientific Data data journal, supporting intelligent data sharing and credit; ISA is used to describe the experiment and to provide browse and search functionality for Scientific Data's content (http://scientificdata.isa-explorer.org). The ISA framework is currently embedded in a number of UK-, EC-, NIH- and pharma-funded infrastructure and research projects; here are exemplars from the ELIXIR UK Node and other Nodes: o EMBL-EBI MetaboLights' new web-based submission relies on the ISA-JSON format to build web components and on the ISA-API to validate and convert experiments represented as ISA objects. o The BBSRC-funded COPO infrastructure relies on the ISA API, ISA-JSON serialization and on the ISA configurations to support plant-based molecular profiling experiments; it also uses the ISAconverter to deposit to the ENA database. o ELIXIR-UK Node partners, the University of Birmingham and Imperial College London, use ISA Galaxy Tools, the ISA-API and the ISA validator - as part of their work in the UK Phenome Centre - to collect data prospectively and also to organise public deposition to repositories. 
o The ELIXIR Plant Community's MIAPPE standards and BrAPI rely on the availability of ISA parsers and validation tools in the context of data validation programs. 
URL http://isa-tools.org
 
Title Continued improvements to the ISA toolkit and the new Datascriptor component 
Description Started in 2003 and first released in 2007, the ISA tools (http://isa-tools.org) have been developed over time by the Oxford team and collaborators or directly contributed by partnering contributors, via the ISA Commons collaborative community (https://www.isacommons.org). Key work over the last year is the development of a new component, the Datascriptor (https://datascriptor.org), as part of the Wellcome Trust award (2018-2021), a collaborative project with the University of Cambridge's InterMine team. Leveraging our experience and links with the communities, we are designing an open-source web-based tool - part of an ecosystem of existing annotation and authoring systems - to help researchers to use community standards to describe their (meta)data at the source, and capitalize on their effort to accelerate the creation of a data article. In addition, major advances have been made to the ISA API, also working with the ELIXIR Plant and Metabolomics communities. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? Yes  
Impact Community use and impact are tracked via the ISA Commons, which currently has over 40 international groups, projects, and organizations that use and contribute to the development of components of the ISA metadata tracking framework. Therefore, we can say that the ISA user base ranges from hundreds to thousands of researchers from increasingly diverse domains (including -omics, cell-based research, biomedical nanotechnology, plant phenotyping, toxicology, biodiversity, metagenomics, stem cell research, systems biology, neuroscience, microbial science and immunology), and goes beyond researchers, curators, other resource developers and service providers, to also include journals. For example, ISA is used by Oxford University Press's GigaScience and underpins Springer Nature's Scientific Data data journal, supporting intelligent data sharing and credit; ISA is used to describe the experiment and to provide browse and search functionality for Scientific Data's content (http://scientificdata.isa-explorer.org). The ISA framework is currently embedded in a number of UK-, EC-, NIH- and pharma-funded infrastructure and research projects; here are exemplars from the ELIXIR UK Node and other Nodes: (i) EMBL-EBI MetaboLights' new web-based submission relies on the ISA-JSON format to build web components and on the ISA-API to validate and convert experiments represented as ISA objects. (ii) The BBSRC-funded COPO infrastructure relies on the ISA API, ISA-JSON serialization and on the ISA configurations to support plant-based molecular profiling experiments; it also uses the ISAconverter to deposit to the ENA database. (iii) ELIXIR-UK Node partners, the University of Birmingham and Imperial College London, use ISA Galaxy Tools, the ISA-API and the ISA validator - as part of their work in the UK Phenome Centre - to collect data prospectively and also to organise public deposition to repositories. 
(iv) The ELIXIR Plant Community's MIAPPE standards and BrAPI rely on the availability of ISA parsers and validation tools in the context of data validation programs. 
URL https://datascriptor.org
 
Title ISA tools 
Description Tools to collect, annotate, store, share and publish datasets 
Type Of Material Improvements to research infrastructure 
Year Produced 2010 
Provided To Others? Yes  
Impact Running since 2007, the open source metadata reporting ISA software suite has a user base ranging from hundreds to thousands of users from diverse domains (http://isa-tools.org), and is a resource of the ELIXIR UK Node. Currently it is embedded in 27 public resources (institute-based, project/consortium-based or global repositories, including some based at EBI, in USA, Japan, China and Australia), supports two data-driven journals (Springer Nature Scientific Data, Oxford University Press GigaScience), and complements 9 internal data platforms (also at the FDA National Center for Toxicological Research and Janssen R&D); see http://www.isacommons.org. The extension of the ISA metadata representation format for nanotechnology applications became a formal ASTM standard in 2013. 
URL http://www.isa-tools.org
 
Description ELIXIR Interoperability Platform and ISA 
Organisation ELIXIR
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution ISA is part of the ELIXIR Recommended Interoperability Resources (RIRs) to facilitate interoperability and reusability of life science data and support the principles of FAIR data management.
Collaborator Contribution The ELIXIR Recommended Interoperability Resources have been selected by an external panel of reviewers, based on the selection criteria published in the Call for RIR applications, which measure how they facilitate scientific research and how they improve the FAIRness of life science data.
Impact ISA is and will continue to be used by and further developed with ELIXIR communities, especially with Plant and Metabolomics use cases.
Start Year 2018
 
Description ELIXIR UK Node 
Organisation Earlham Institute
Country United Kingdom 
Sector Academic/University 
PI Contribution Help create the ELIXIR UK Node
Collaborator Contribution Contribute to the creation of the ELIXIR UK Node
Impact Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year 2012
 
Description ELIXIR UK Node 
Organisation Heriot-Watt University
Country United Kingdom 
Sector Academic/University 
PI Contribution Help create the ELIXIR UK Node
Collaborator Contribution Contribute to the creation of the ELIXIR UK Node
Impact Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year 2012
 
Description ELIXIR UK Node 
Organisation Imperial College London
Country United Kingdom 
Sector Academic/University 
PI Contribution Help create the ELIXIR UK Node
Collaborator Contribution Contribute to the creation of the ELIXIR UK Node
Impact Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year 2012
 
Description ELIXIR UK Node 
Organisation Newcastle University
Country United Kingdom 
Sector Academic/University 
PI Contribution Help create the ELIXIR UK Node
Collaborator Contribution Contribute to the creation of the ELIXIR UK Node
Impact Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year 2012
 
Description ELIXIR UK Node 
Organisation Rothamsted Research
Country United Kingdom 
Sector Academic/University 
PI Contribution Help create the ELIXIR UK Node
Collaborator Contribution Contribute to the creation of the ELIXIR UK Node
Impact Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year 2012
 
Description ELIXIR UK Node 
Organisation University College London
Country United Kingdom 
Sector Academic/University 
PI Contribution Help create the ELIXIR UK Node
Collaborator Contribution Contribute to the creation of the ELIXIR UK Node
Impact Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year 2012
 
Description ELIXIR UK Node 
Organisation University of Birmingham
Country United Kingdom 
Sector Academic/University 
PI Contribution Help create the ELIXIR UK Node
Collaborator Contribution Contribute to the creation of the ELIXIR UK Node
Impact Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year 2012
 
Description ELIXIR UK Node 
Organisation University of Cambridge
Country United Kingdom 
Sector Academic/University 
PI Contribution Help create the ELIXIR UK Node
Collaborator Contribution Contribute to the creation of the ELIXIR UK Node
Impact Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year 2012
 
Description ELIXIR UK Node 
Organisation University of Dundee
Country United Kingdom 
Sector Academic/University 
PI Contribution Help create the ELIXIR UK Node
Collaborator Contribution Contribute to the creation of the ELIXIR UK Node
Impact Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year 2012
 
Description ELIXIR UK Node 
Organisation University of Edinburgh
Department Edinburgh Genomics
Country United Kingdom 
Sector Academic/University 
PI Contribution Help create the ELIXIR UK Node
Collaborator Contribution Contribute to the creation of the ELIXIR UK Node
Impact Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year 2012
 
Description ELIXIR UK Node 
Organisation University of Edinburgh
Country United Kingdom 
Sector Academic/University 
PI Contribution Help create the ELIXIR UK Node
Collaborator Contribution Contribute to the creation of the ELIXIR UK Node
Impact Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year 2012
 
Description ELIXIR UK Node 
Organisation University of Liverpool
Country United Kingdom 
Sector Academic/University 
PI Contribution Help create the ELIXIR UK Node
Collaborator Contribution Contribute to the creation of the ELIXIR UK Node
Impact Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year 2012
 
Description ELIXIR UK Node 
Organisation University of Manchester
Country United Kingdom 
Sector Academic/University 
PI Contribution Help create the ELIXIR UK Node
Collaborator Contribution Contribute to the creation of the ELIXIR UK Node
Impact Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year 2012
 
Description ELIXIR UK Node 
Organisation University of Oxford
Country United Kingdom 
Sector Academic/University 
PI Contribution Help create the ELIXIR UK Node
Collaborator Contribution Contribute to the creation of the ELIXIR UK Node
Impact Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year 2012
 
Description ISA Commons 
Organisation ISA Commons
Sector Charity/Non Profit 
PI Contribution We have helped many users, service providers and other developers to implement one or more components of the ISA software suite at their site to fit their data needs.
Collaborator Contribution They have helped us to refine the ISA software suite, filling gaps and tuning it for certain data types.
Impact The ISA Commons is a growing ecosystem of institute-based (e.g. USA NASA GeneLab Data Repository) and global repositories (e.g. EMBL-EBI MetaboLights), as well as data-driven journals (e.g. Springer Nature Scientific Data) that use the ISA formats, and/or are powered by one or more components of the ISA software suite. It also includes grass-roots standards groups that build on the ISA data model and formats. The sustainability and maintenance of the ISA data model, formats, and tools are guided by the ISA Working Group.
Start Year 2010
 
Title Datascriptor 
Description From structured dataset to data article. Leveraging our experience and links with the communities, we are now designing an open-source web-based tool - part of an ecosystem of existing annotation and authoring systems - to help researchers to use community standards to describe their (meta)data at the source, and capitalize on their effort to accelerate the creation of a data article. The user will be guided to provide (semi)structured descriptions of the experimental design, and of the post-processed data, to generate, respectively, the Methods and a set of statements to populate the Results section of a manuscript. Datascriptor will work: (i) as a stand-alone tool - for anyone to use - implementing generic metadata models, such as W3C Data Catalog vocabulary; and (ii) as a component of the ISA Tools - for its user communities - implementing the ISA metadata model. To output short sentences from the (semi)structured input, we will evaluate a mixed data-to-text approach using template-based and neural-based (i.e. machine learning) methods. To further enrich the content of the manuscript, Datascriptor will connect to existing authoring systems, including Substance, Texture, Stenci.la and Manuscripts, and export the result in JATS format. Our plans also include an export as a DAR file and in LaTeX format. 
Type Of Technology Webtool/Application 
Year Produced 2019 
Open Source License? Yes  
Impact Work has just started, but to ensure continued impact in the stakeholder community, the Datascriptor User Advisory Board includes a core group of existing collaborators: Thomas Lemberger (EMBO Press), Scott Edmunds (GigaScience), Holly Murray (F1000), Varsha Khodiyar (Springer Nature). 
 
Title ISA Model and Serialization 
Description The original ISA-Tab specification was published as a Release Candidate document in 2008, documenting the initial work that forms the ISA framework, with a further update in 2009. Since then, we have done work on a new serialization in JSON, ISA-JSON, and abstracted out the data model from both the tabular and JSON formats. 
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact Serialisations implemented by several ISA components; the documentation also helps other users to implement ISA formats. 
URL http://isa-tools.org/2016/10/release-of-the-isa-specs/
 
Title ISA software suite (built iteratively, component by component) 
Description The open source ISA framework and tools help to manage an increasingly diverse set of life science, environmental and biomedical experiments that employ one or a combination of technologies. Built around the 'Investigation' (the project context), 'Study' (a unit of research) and 'Assay' (analytical measurement) data model and serializations (tabular, JSON and RDF), the ISA framework helps you to provide a rich description of the experimental metadata (i.e. sample characteristics, technology and measurement types, sample-to-data relationships) so that the resulting data and discoveries are reproducible and reusable. 
Type Of Technology Software 
Year Produced 2010 
Open Source License? Yes  
Impact A growing number of users, as listed at http://isacommons.org, and of co-developers who have contributed, and continue to contribute, to the collaborative enhancements. 
URL http://isa-tools.org/
 
Title ISA tooling for the metabolomics community 
Description A new set of ISA software tools has been developed out of the EU H2020 PhenoMeNal: Large-Scale Computing for Medical Metabolomics project (http://phenomenal-h2020.eu/home). The ISA team has been contributing to the project since 2015, and has been collaborating on the development of user-facing, cloud-based data management and processing infrastructure in the project. The PhenoMeNal software includes a new set of ISA-related Galaxy workflow tools, as well as native support for the ISA-Tab format in Galaxy. 
Type Of Technology Software 
Year Produced 2018 
Open Source License? Yes  
Impact The tools work with the EBI MetaboLights database as well as with ISA-Tab studies uploaded directly into the Galaxy platform, and build on the Python ISA-API. MetaboLights also uses the ISA-API in its own Python-based REST service: https://github.com/EBI-Metabolights/MtblsWS-Py 
URL http://isa-tools.org/2018/03/isa-galaxy-developed-for-metabolomics/
 
Title ISA-API Python library 
Description Project name: ISA-API; Project home page: http://github.com/ISA-tools/isa-api; Operating system(s): Platform independent; Programming language: Python 3; Other requirements: None; License: CPAL-1.0. ISA-API is a Python library that supports the creation, editing, parsing, and validation of both ISA-Tab and ISA-JSON formats, using a common data model implemented as native Python objects. 
Type Of Technology Software 
Year Produced 2018 
Open Source License? Yes  
Impact This provides users with a common interface and interoperable medium between the two ISA formats, as well as conversion to a set of other formats required for depositing data in public databases. 
 
Description Biohackathon; ELIXIR, Paris 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The team participated in several tracks, working in particular on ISA for the plant and metabolomics communities, on its use in Galaxy, and on the bioschema work. The work carried out continues to embed ISA and FAIRsharing into ELIXIR-driven infrastructure and activities.
Year(s) Of Engagement Activity 2018
URL https://www.elixir-europe.org/events/biohackathon-2018-paris
 
Description Datascriptor hackathon - eLife Innovation Sprint 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Hackathon on the Datascriptor prototype, part of the ISA toolkit. Datascriptor aims to take the pain out of beginning to write papers by making it easy to generate automatically the parts of a paper that can be scaffolded, and to incentivise reproducible papers by ensuring the scaffolds include well-structured data and metadata. During the online event the prototype was fleshed out through user testing with hands-on use cases.
Year(s) Of Engagement Activity 2020
URL https://sprint.elifesciences.org/data-paper-skeleton-tools-for-life-sciences/
 
Description Poster presentation: ISAcreate and Galaxy; Galaxy conference, Portland 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The ISA-Tab format is now used by Galaxy tools; the discussion helped to ensure that uptake continues.
Year(s) Of Engagement Activity 2018
URL https://gccbosc2018.sched.com/event/FEWs/g26-isacreate-a-galaxy-tool-for-prospective-data-management...