myGrid: A Platform for e-Biology Renewal

Lead Research Organisation: University of Manchester
Department Name: Computer Science

Abstract

Summary

The myGrid Consortium is a multi-institutional, multi-disciplinary, internationally leading research group focusing on the challenges of e-Science: the use of computational resources that allow scientists around the world to collaborate to produce and analyse the vast amounts of complex data in disciplines as diverse as biology, chemistry, astronomy, physics, music and social science. This platform grant enables the consortium to sustain an internationally leading team of researchers working on the foundations of e-Science. The consortium delivers e-Laboratory environments in which scientists perform virtual or in silico experiments. The consortium's flagship tools include Taverna, myExperiment and UTOPIA. Taverna is used to develop the scientific workflows that scientists use to gather and analyse data; these represent the experiments on, for example, the genes and proteins involved in diseases. The myExperiment Virtual Research Environment is social website software for the social curation and sharing of scientific research objects, including workflows and in silico experiments. UTOPIA is a suite of scientific visualisation and analysis tools that brings together disparate data sources in an easy-to-use unified interface. Together these allow scientific investigations to be undertaken in a way that lets the scientist concentrate on the science, a feat that requires basic research in computer science.

These e-Science tools are world leaders with 1000, 900 and 2000 users respectively; Taverna is used in some 350 organisations. Producing these tools necessitates foundational e-Science research in four main areas: the management of the knowledge in such environments; the production and management of the metadata, or descriptions, of the experiments and experimental holdings; the design, use and reuse of in silico experiments; and the exploitation of social networks to enhance e-Science.
Explicitly engaging with users supports adoption, and it drives challenging, user-relevant research and development based on observed experience and real need. The platform grant enables the consortium to retain key staff who help sustain this world-leading effort in e-Science and Open Science: they are experts in scientific workflow management, semantic technologies, intelligent middleware and social computing. Crucially it also supports our participation on the international stage, and it allows us to pump-prime the novel and innovative research projects that are the hallmark of the consortium.

Publications

Alper P (2017) Static analysis of Taverna workflows to predict provenance patterns in Future Generation Computer Systems

Bechhofer S (2013) Why linked data is not enough for scientists in Future Generation Computer Systems

Cao B (2009) Semantically annotated provenance in the Life Science Grid in CEUR Workshop Proceedings

Chen J (2014) DistillFlow

Ciccarese P (2013) PAV ontology: provenance, authoring and versioning in Journal of biomedical semantics

De Roure D (2010) Towards open science: the myExperiment approach in Concurrency and Computation: Practice and Experience

Frey JG (2011) Web-based services for drug design and discovery in Expert opinion on drug discovery

Frey JG (2013) Cheminformatics and the Semantic Web: adding value with linked data and enhanced provenance in Wiley interdisciplinary reviews. Computational molecular science

Garijo D (2014) Common motifs in scientific workflows: An empirical analysis in Future Generation Computer Systems

Gibson A (2009) The data playground: An intuitive workflow specification environment in Future Generation Computer Systems

Goderis A (2009) Benchmarking workflow discovery: a case study from bioinformatics in Concurrency and Computation: Practice and Experience

Goderis A (2009) Heterogeneous composition of models of computation in Future Generation Computer Systems

Kanza S (2023) Digital research environments: a requirements analysis in Digital Discovery

Missier P (2011) Search Computing

Missier P (2011) Workflows to open provenance graphs, round-trip in Future Generation Computer Systems

Moreau L (2011) The Open Provenance Model core specification (v1.1) in Future Generation Computer Systems

Möller S (2010) Community-driven computational biology with Debian Linux in BMC bioinformatics

Sroka J (2010) A formal semantics for the Taverna 2 workflow model in Journal of Computer and System Sciences

 
Description The myGrid Consortium (http://www.mygrid.org.uk/), established by an RCUK e-Science Programme Pilot project, is a multi-institutional, multi-disciplinary, internationally leading research group focusing on the challenges of e-Science. We specialise in data- and knowledge-intensive e-Laboratories. An e-Laboratory is a set of components (workflows, resources, data, algorithms, texts, queries, people) that are used together to form a distributed and collaborative space for e-Science and e-Scientists (such as bioinformaticians), enabling the planning, execution, analysis and publication of in silico experiments and their results. The consortium became one of the most successful of the RCUK e-Science Programme pilots in basic and applied research, research dissemination and user adoption, focusing within its membership on the Life Sciences and e-Health. The e-Science Platform grant, myGrid: A Platform for eBiology, was awarded to provide us with continuity and stability in order to exploit the success of the consortium (see EP/C536444/1).

The myGrid: A Platform for eBiology renewal had the objective of sustaining our position as a world-leading consortium, by (i) ensuring the continuity and stability of the team, retaining key researchers and sustaining a platform for research; (ii) building research capacity and developing the careers of the researchers; (iii) developing, incubating and pump-priming foundational and applied investigations, and identifying and stimulating new and long-term research and innovation opportunities; (iv) building national and international technical and scientific collaborations; (v) fostering the exploitation of our research in practice, and leveraging research in other projects and initiatives; and (vi) winning further funds.

As a platform grant this award is inherently partnered with other awards, and this is reflected in the outcomes.

The Platform findings:
- continuity and stability to the consortium: 28 researchers were supported. The platform bridged or pump-primed between 14 different funding streams/projects. 10 were still in the consortium at the end of the award and on new projects partially arising from the platform and affiliated projects.

- capacity for national and international research: 2 of the researchers supported directly by the platform went on to university faculty positions and continue to collaborate: Belhajjame (Paris 6), Missier (Newcastle). This award effectively allowed them to pump-prime their careers and establish themselves as well-connected and independent researchers.

3 went on to further research in other institutes (Eales, Lister, Cruickshank), 4 went into industry (Li, Rodda, Withers, Newman) and 1 went on to further study followed by industry (Aleksejvs).

- widened the awareness, adoption and influence of our software (Apache Taverna Workflow Suite, UTOPIA, myExperiment, BioCatalogue, SEEK4Science, LabTrove etc) co-ordinating a coherent programme of work into an eLaboratory toolkit, and conducting a virtuous circle between research and production, whereby research investigations are inspired from real problems and outcomes are returned into the products.

- continuity and innovations in core open source and widely used software toolkits and applications produced by the group for the benefit of the international research community. The group operates a virtuous circle between research and production, whereby research investigations are inspired from real problems and outcomes are returned into the products. The platform grant:

* retained key people and bridged them across funding streams: the Taverna Workflow system (Williams, Fellows, Haines, Withers), myExperiment (Cruickshank, Bacall), BioCatalogue (Beard), SEEK4Science (Owen) and LabTrove (Borkum). It also enabled the research innovations outlined below to be transferred into these systems: workflow provenance, research objects, micropublications, semantic annotations and software ontologies. All the software listed above has high impact (see separate entries in the Software Outcomes), has garnered additional funds, and underpins European projects such as Wf4ever, BioVeL, SCAPE, VPH and HELIO.

* enabled us to undertake fundamental developments in the Taverna workflow system that could not be funded elsewhere but were necessary for its long-term adoption and sustainability, notably the complete migration of the platform to OSGi-based plugins by Withers (which means that it can now become an open development project) and the prototyping of OAuth security by Borkum.

* bridged the whole UTOPIA team for a short but crucial time, enabling the knowledge and skills to be retained. UTOPIA Documents is now an established open source system with over 5000 users and commercial uptake amongst a small group of publishers. It is also a popular interface to the IMI Open PHACTS linked data warehouse for drug discovery.

* enabled investigations into specific customisations of e-Labs and their components, and the needs of different scientific communities, notably chemistry, systems biology, biodiversity and public health. This led to follow-on projects to sustain the group. Notably Li investigated the customisation and usability of workflows in chemistry and drug discovery: he is now lead at BGI GigaScience Data Journal in China, responsible for the reproducibility of workflows and research objects in genomics.

- developed, incubated and pump-primed foundational and applied investigations, and identified and stimulated new and long-term research and innovation opportunities. Novel investigations afforded by the platform are concerned with the computational technologies, semantic techniques and socio-technical processes required to understand and enable the reproducibility of scientific research and the accelerated exchange and exploitation of scientific research. Our platforms and projects with chemists, biologists, astronomers, biodiversity scientists, librarians etc gave us a splendid opportunity for observation and experimentation.
(ii) Knowledge management of scientific assets or artefacts and metadata management in e-Laboratories
(iii) Foundations of workflow systems: the semantics of workflow execution, the collection, representation and use of workflow provenance, and workflow preservation.
(iv) New models of scholarly communication and reproducibility, drawing together all three of the themes above, and exploring the social forces at work in reproducible research. This has led to numerous invited talks and keynotes by members of the consortium at scientific, digital library and publisher conferences, and contributions to UK and EU policy documents. We co-founded the Force11 Foundation (co-authoring the manifesto).

Highlights include:
- The Software Ontology (SWO) is a description of software used to store, manage and analyze data. Recently, the SWO has incorporated EDAM, a vocabulary for describing data and related concepts in bioinformatics adopted by the ELIXIR and BioMedBridges EU RIs.

- The Research Object (RO) Framework (http://www.researchobject.org) is a novel way of representing and managing the multi-variant and compound/interlinked nature of research artifacts. The RO Framework is a metadata model, with a collection of conventions encoded in standards, protocols and policies, with realizations using off-the-shelf and specialist software. Research objects are any digital resource that aims to go beyond the PDF for scholarly publishing, and prescribed ways of combining those resources. Several example realisations have been made for systems biology, public health, scientific workflows and publications, and physics experiments. The framework is the foundation of several products (SEEK4Science) and EU projects (Wf4ever, SCAPE, BioVeL, VPH-SHARE), and the basis of future work in the EU-wide infrastructures FAIRPORT, ELIXIR and FAIRDOM (DMMCore), the MRC Farr Institute and the USA NIH Commons. ROs have generated a significant buzz in the publishing and digital library arenas. The RO work has been undertaken in collaboration with the VU Amsterdam, U of Oxford, U of Southampton, U of Lancaster, UPM Madrid, Instituto de Astrofisica de Andalucia and ISOCO.

ROs are now in common currency in European and USA scholarly communications: part of the NIH Data Commons is based on the notion of Research Objects (subcontract from Elsevier), and publishing houses and scholarly communications groups such as Force11 refer to the concepts in ROs. ROs form a core component of the European Open Science Cloud Life (EOSC Life) workflow collaboratory for life science in Europe.

- Semantic annotation of research objects using novel stand-off semantic middleware and linked data. This work is collaborative with Harvard Mass General and led to the development of the W3C Open Annotation Model and an ongoing collaboration combining Harvard's annotation platform Domeo with our UTOPIA tool. Earlier work with Pirrò (Università della Calabria) explored the role of semantics in computational middleware.

- Fundamentals of workflow provenance: the log trace of computational executions and the lineage of data products arising from computations. We made significant inroads into the representation of provenance and led contributions to the W3C PROV standard suite, established the ProvBench benchmark suite, proposed a novel mechanism for tracking data credits through workflow execution traces, and developed mechanisms for annotating provenance with domain semantics (with Sheth et al, Wright University).

- Fundamental work on workflows, including the specification of the formal semantics of our workflow engine (with Hidders et al, TU Delft), comparisons with other workflow systems (with Foster et al, Argonne Labs), automated refactoring of workflows (with Cohen-Boulakia, Paris-Sud), and workflow interoperability (with collaborators in USC SDSC, UC Davis, Amsterdam and Gonzaga).

- established and strengthened links with collaborators including: Harvard Medical Centre, Argonne Labs, Paris Sud, VU Amsterdam, Wright University USA, USC, UC Davis, Amsterdam and Gonzaga, UPM, British Library, NYC University, Notre Dame, GigaScience, and many more in multi-partner European projects (see follow-on funds) and USA projects such as dataONE (Missier and Belhajjame are now partners) and iPlant Collaborative.

- won significant further funds for scientific applications, new research, and tool and infrastructure awards, notably within European Research Infrastructures and RCUK capital and infrastructure programmes such as the Farr Institutes.

- co-founded research community Force11.org.
Exploitation Route Research insights into semantic annotation, provenance and workflow design have been put into practice in our various software packages, and can be adopted in other systems.

All the software and ontologies we have produced are free, open source, and openly available on the web and in the appropriate community public archives (GitHub, BioPortal).

Taverna is now in Apache (in the Open Development Apache Incubator) - Apache is the most respected and strongest of the open source foundations.

Research Objects are proposed as a Working Group of FORCE11 (http://www.force11.org), a not-for-profit foundation of scholars, librarians, archivists, publishers and research funders whose mission is to facilitate the change toward improved knowledge creation and sharing. We have also established a Research Object forum: http://www.researchobject.org

Research Objects are proposed as a platform for the NIH BD2K Commons and the Farr Institute Commons. ROs are now in common currency in European and USA Scholarly Communications. We give 3-4 major keynotes a year on the topic. ROs are considered the gold level of reproducibility.

The Software Ontology is incorporated into EDAM and will form part of the EU ELIXIR RI tool registry bio.tools, and possibly the NIH Software Discovery Index.
Sectors Agriculture, Food and Drink; Digital/Communication/Information Technologies (including Software); Education; Healthcare; Culture, Heritage, Museums and Collections; Pharmaceuticals and Medical Biotechnology

URL http://www.mygrid.org.uk
 
Description Our findings have been used extensively in a variety of ways:
- our software and technical products (see separate entries), which are used by thousands of users and in particular have been adopted as core resources by some major EU projects, some of which we have been invited to participate in.
- standards: the World Wide Web Consortium PROV model for provenance and the W3C Open Annotation Data Model. We are now active in the Research Data Alliance as another route to impact through standards.
- international initiatives and infrastructures such as Force11, FAIRPORT, ELIXIR, ISBE, BioMedBridges, FAIRDOM, NIH Data Commons, Farr Institute, IBISBA and DiSSCo. These are variously exploiting the work on workflows, Research Objects, semantic annotation and the software ontology. Notably the international Common Workflow Language (http://www.commonwl.org) is based on work developed in this programme.
Sector Aerospace, Defence and Marine; Agriculture, Food and Drink; Chemicals; Digital/Communication/Information Technologies (including Software); Education; Environment; Healthcare; Culture, Heritage, Museums and Collections; Pharmaceuticals and Medical Biotechnology
Impact Types Economic

 
Description Annotopia
Amount £50,000 (GBP)
Organisation Massachusetts General Hospital 
Sector Hospitals
Country United States
Start 04/2015 
End 07/2015
 
Description BBSRC Crowdsourcing: The Lazarus Project: Resurrecting data and knowledge from life science articles by crowd-sourcing
Amount £481,203 (GBP)
Funding ID BB/L005298/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 06/2014 
End 07/2017
 
Description BioExcel - Centre of Excellence for Biomolecular Research
Amount € 286,000 (EUR)
Funding ID H2020-EINFRA-2015-1, 675728 
Organisation European Commission 
Department Horizon 2020
Sector Public
Country European Union (EU)
Start 11/2015 
End 10/2018
 
Description BioExcel-2 Centre of Excellence for Computational Biomolecular Research
Amount € 8,000,000 (EUR)
Funding ID H2020-EU.1.4.1.3 823830 
Organisation European Commission H2020 
Sector Public
Country Belgium
Start 01/2019 
End 12/2021
 
Description ELIXIR-CONVERGE
Amount € 5,000,000 (EUR)
Funding ID 871075 
Organisation European Commission H2020 
Sector Public
Country Belgium
Start 02/2020 
End 01/2023
 
Description EOSC-Life, Providing an open collaborative space for digital biology in Europe
Amount € 23,745,996 (EUR)
Funding ID INFRAEOSC-04-2018, H2020-EU.1.4.1.1, 824087 
Organisation European Commission H2020 
Sector Public
Country Belgium
Start 03/2019 
End 02/2023
 
Description EU FP7 IP SCAPE, Scalable Preservation Environments
Amount € 794,000 (EUR)
Funding ID 97458 
Organisation European Commission 
Department Seventh Framework Programme (FP7)
Sector Public
Country European Union (EU)
Start 12/2010 
End 11/2014
 
Description EU FP7 STREP Wf4Ever Advanced Workflow Preservation Technologies for Enhanced Science
Amount € 500,000 (EUR)
Funding ID 270192 
Organisation European Commission 
Department Seventh Framework Programme (FP7)
Sector Public
Country European Union (EU)
Start 12/2010 
End 11/2013
 
Description EU Innovative Medicines Initiative (Open PHACTS)
Amount € 1,500,000 (EUR)
Funding ID 115191 
Organisation European Commission 
Department Innovative Medicines Initiative (IMI)
Sector Public
Country Belgium
Start 03/2011 
End 01/2016
 
Description FAIR-CURES-RO
Amount $77,149 (USD)
Funding ID subcontract from FAIR4CURES NIH Data Commons 
Organisation Elsevier 
Sector Private
Country Netherlands
Start 09/2018 
End 08/2020
 
Description FP7 Infrastructures BioVel: Biodiversity Virtual eLaboratory
Amount € 1,120,000 (EUR)
Funding ID 283359 
Organisation European Commission 
Department Seventh Framework Programme (FP7)
Sector Public
Country European Union (EU)
Start 08/2011 
End 12/2014
 
Description INFRADEV-3-2015, 676559, ELIXIR-EXCELERATE
Amount € 19,000,000 (EUR)
Funding ID INFRADEV-3-2015, 676559 
Organisation European Commission H2020 
Sector Public
Country Belgium
Start 09/2015 
End 08/2019
 
Description JISC SageCite: Citing network models of disease and associated data
Amount £50,000 (GBP)
Organisation Jisc 
Sector Public
Country United Kingdom
Start 08/2010 
End 07/2011
 
Description MICA: Health e-Research Centre
Amount £4,719,858 (GBP)
Funding ID MR/K006665/1 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 03/2013 
End 02/2018
 
Description Open PHACTS Foundation
Amount € 100,000 (EUR)
Organisation Open PHACTS Foundation 
Sector Charity/Non Profit
Country United Kingdom
Start 10/2014 
End 09/2015
 
Description SYNTHESYS PLUS Synthesis of systematic resources
Amount € 11,325,200 (EUR)
Funding ID INFRAIA-01-2018-2019, H2020-EU.1.4.1.2, 823827 
Organisation European Commission H2020 
Sector Public
Country Belgium
Start 02/2019 
End 01/2023
 
Title Common Workflow Language 
Description The Common Workflow Language (CWL) is a specification for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry. CWL is developed by a multi-vendor working group consisting of organizations and individuals aiming to enable scientists to share data analysis workflows. The CWL project is maintained on GitHub and follows the Open-Stand.org principles for collaborative open standards development. Legally CWL is a member project of Software Freedom Conservancy and is formally managed by the elected CWL leadership team; however, everyday project decisions are made by the CWL community, which is open for participation by anyone. We are founding members of CWL: we co-developed the specification, develop tools such as the CWL Viewer (View.commonwl.org) and promote the adoption of CWL internationally, particularly through our EU projects in the Life Sciences. CWL is based on work we undertook in the Taverna Workflow system, developed in the OMII project and developed since in EU projects.
Type Of Material Improvements to research infrastructure 
Year Produced 2016 
Provided To Others? Yes  
Impact The Common Workflow Language has been adopted extensively by the commercial and open source sectors across disciplines, and in particular by the Life Sciences. It will be the basis of the European Open Science Cloud Life Science Collaboratory (EOSCLife) and several other ESFRIs.
URL http://commonwl.org
 
Title myExperiment 
Description First and arguably only public sharing platform for computational workflows, supporting over 20 workflow management systems. This resource was further developed in this award.
Type Of Material Database/Collection of data 
Year Produced 2008 
Provided To Others? Yes  
Impact First and arguably only public sharing platform for any workflow system. Over 500 citations (combined, Google Scholar) of the 3 main myExperiment papers. On 13/03/2017 myExperiment had 10472 registered members, 392 groups, 3811 workflows, 1223 files and 470 packs. Used by several EU projects (e.g. BioVeL, SCAPE, HELIO, VPH), US projects (e.g. FLOSS) and companies (e.g. RapidMiner) as their workflow repository. Over 22 workflow systems are represented in the repository. In 30 days in Oct 2014 there were 2391 unique users (logged in and anonymous), a figure from which we can extrapolate.
URL http://www.myexperiment.org
 
Description Leiden University 
Organisation Leiden University
Country Netherlands 
Sector Academic/University 
PI Contribution Leiden are our co-development partners of the metadata platform for the SEEK4Science Data and Model Management platform, associated software and curation, and co-partners in the delivery of the FAIRDOM data and model stewardship programme.
Collaborator Contribution Leiden are our co-development partners of the metadata platform for the SEEK4Science Data and Model Management platform, associated software and curation, and co-partners in the delivery of the FAIRDOM data and model stewardship programme in the Netherlands. Representatives of the DTL (Dutch TechCentre for Lifesciences).
Impact See outcomes of SysMO-DB2 and BB/M013189/1 DMMCore: Data and Model Management Core for ERASysAPP; Europe
Start Year 2013
 
Description Oxford eResearch Centre 
Organisation University of Oxford
Department Oxford E-Research Centre
Country United Kingdom 
Sector Academic/University 
PI Contribution - For the FAIRDOM project (BBSRC SysMO-DB1, SysMO-DB2, DMMCore) we use the OeRC ISA Structure as the backbone of the SEEK4Science software we use for the FAIRDOMHub Commons. - For ELIXIR-UK, OeRC are partners in metadata interoperability services, notably Biosharing.org
Collaborator Contribution - For the FAIRDOM project (BBSRC SysMO-DB1, SysMO-DB2, DMMCore) we use the OeRC ISA Structure as the backbone of the SEEK4Science software we use for the FAIRDOMHub Commons. Co-developing ISA2 - For ELIXIR-UK, OeRC are partners in metadata interoperability services, notably Biosharing.org
Impact see outcomes of: SysMO-DB2, DMMCore: Data and Model Management Core for ERASysAPP & Europe, and Delivering ELIXIR-UK
Start Year 2008
 
Description Research Object 
Organisation researchobject.org
Sector Charity/Non Profit 
PI Contribution researchobject.org is a grass roots community to develop and disseminate Research Objects, their concept, adoption, and other latest developments. It was established by Prof Goble's e-Science group and is now a global community with academic and commercial members.
Collaborator Contribution The community have developed specifications, implementations and run a series of international workshops.
Impact Specifications of Research Objects, including RO-Crate (http://www.researchobject.org/2019-11-15-ro-crate-1-0/); funding from Elsevier; incorporation into data repositories (DataVerse, Mendeley Data) and the NIH Data Commons. Core to the development of the workflow collaboratory for the EOSCLife project (European Open Science Cloud Life). A component of the EU EOSC FAIR Digital Object Framework. Multidisciplinary - chiefly the life sciences, biodiversity and computer science
Start Year 2013
 
Title Apache Taverna Workflow Management System 
Description Scientific Workflow Management System, with: Workbench, Engine, embedded Player, Platform, Plug-ins, Servers, Interaction services, Provenance management. In 2014 Taverna entered the Apache Incubator programme and became Apache Taverna. Apache is the most well known and respected of the Open Source Software Foundations. 
Type Of Technology Software 
Year Produced 2014 
Open Source License? Yes  
Impact Apache is the most well known and respected of the Open Source Software Foundations. Taverna has been extensively used in bioinformatics, biodiversity, chemistry, digital humanities, astrophysics, astronomy and health informatics. Taverna is one of the founders of the current revolution of data driven science using computational workflows. It has influenced mainstream platforms such as Galaxy and KNIME. The Common Workflow Language (commonwl.org) is based on its language. CWL is the standardised description for workflow systems interoperability in the European Open Science Cloud and in the ELIXIR Research Infrastructure. Taverna in its various forms has been cited over 3000 times in its key associated publications. 
URL https://taverna.incubator.apache.org/
 
Title LabTrove 
Description The LabTrove application enables researchers to share their experimental plans, thoughts, observations and achievements with the wider online community in a secure, semantically rich and extensible manner. The SRF encompasses this software and other associated repository systems for LIMS and environmental (e.g. sensor) data and can also be coupled with pervasive and non-invasive approaches to capturing this data. The SRF enables scientists to follow a method much closer to the original concept of experimental science in terms of declaring a hypothesis, recording data and analysing results. Furthermore, scientists will no longer have to print out data results to insert into conventional lab books; instead, results will be logically associated with the experiment and therefore accessible as required. Thus it is possible to pivot the material and view the data in a chronological diary form, for example, or data-centrically in terms of the scientific argument.
Type Of Technology Software 
Year Produced 2011 
Impact A few hundred users; most use LabTrove as software as a service on systems hosted by Southampton, the Royal Society of Chemistry and a couple of other places (such as the University of Sydney and UNSW, and now starting up at NTU). Other installations have been set up with help, along with a handful of stand-alone installations.
URL http://www.labtrove.org
 
Title PROV 
Description Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness. The W3C PROV standard defines a model, corresponding serializations and other supporting definitions to enable the inter-operable interchange of provenance information in heterogeneous environments such as the Web. This document provides an overview of this family of documents. 
Type Of Technology New/Improved Technique/Technology 
Year Produced 2013 
Impact The World Wide Web Consortium standard (called a recommendation in W3C jargon) for representing, recording and interchanging provenance information. 
URL http://www.w3.org/TR/prov-overview/
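
As an illustration of the description above, the following is a minimal Python sketch of provenance expressed as entities, activities and agents, and of how such a record can be queried for the lineage of a data product. It is not the W3C PROV serialisation or any Taverna API: the class names, the `lineage` helper and the file and run identifiers are assumptions made for this example, and only the relation names (used, wasGeneratedBy, wasAssociatedWith) echo the PROV vocabulary.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A piece of data or thing whose provenance is recorded."""
    id: str


@dataclass(frozen=True)
class Activity:
    """Something that happens over time and consumes or produces entities."""
    id: str


@dataclass(frozen=True)
class Agent:
    """A person or piece of software responsible for an activity."""
    id: str


@dataclass
class ProvenanceRecord:
    """Hypothetical in-memory store of three PROV-style relations."""
    used: list = field(default_factory=list)                 # (activity, entity)
    was_generated_by: list = field(default_factory=list)     # (entity, activity)
    was_associated_with: list = field(default_factory=list)  # (activity, agent)

    def lineage(self, entity):
        """Return every entity that `entity` transitively depends on."""
        seen = set()
        frontier = [entity]
        while frontier:
            current = frontier.pop()
            for produced, activity in self.was_generated_by:
                if produced != current:
                    continue
                # Follow the generating activity back to its inputs.
                for consumer, source in self.used:
                    if consumer == activity and source not in seen:
                        seen.add(source)
                        frontier.append(source)
        return seen


# Illustrative two-step workflow run (all identifiers are made up).
raw, clean, plot = Entity("raw.csv"), Entity("clean.csv"), Entity("plot.png")
clean_run, plot_run = Activity("clean-run"), Activity("plot-run")
scientist = Agent("alice")

record = ProvenanceRecord(
    used=[(clean_run, raw), (plot_run, clean)],
    was_generated_by=[(clean, clean_run), (plot, plot_run)],
    was_associated_with=[(clean_run, scientist), (plot_run, scientist)],
)

print(sorted(e.id for e in record.lineage(plot)))
```

Keeping the relations as plain pairs is a simplification chosen for readability; a real deployment would use one of the standard PROV serialisations so that records can be exchanged between systems.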
 
Title Research Objects 
Description Supporting the publication of more than just PDFs, making data, code, and other resources first class citizens of scholarship. Recognizing that sometimes there is a need to publish collections of these resources together as one shareable, cite-able resource. Enriching these resources and collections with any and all additional information required to make research reusable and reproducible. The Research Object framework is a way of representing and managing the multi-variant and compound/interlinked nature of research artifacts. It is a metadata model, with a collection of conventions encoded in standards, protocols and policies, with realizations using off-the-shelf and specialist software. Research objects are any digital resource that aims to go beyond the PDF for scholarly publishing, and prescribed ways of combining those resources. Several example implementations exist for systems biology, public health, scientific workflows and publications.
Type Of Technology New/Improved Technique/Technology 
Year Produced 2010 
Impact The notion of a Research Object (RO) has had a significant impact in the publishing and scholarly communications world:
- the NIH BD2K Data Commons Big Data Bags use ROs (DOI: 10.1109/BigData.2016.7840618, doi: https://doi.org/10.1101/268755)
- ROs formed the basis of the workflow reproducibility and preservation in EU project Workflow4ever (http://www.wf4ever-project.org) and EU EVER-EST ROHub
- ROs have influenced the publisher community: NPG, F1000, GigaScience, Mozilla Science, Elsevier all have RO activities
- ROs won the Vision award in the 2nd Beyond the PDF International Conference
- RO principles are incorporated in the EU RI ELIXIR interoperability workplan for computational workflows
- ROs are used by the USA's FDA (Food and Drug Administration) BioCompute Objects (https://osf.io/h59uh/)
See ResearchObject.org for more impact stories
URL http://researchobject.org
 
Title SEEK4Science 
Description The SEEK platform is a web-based resource for sharing heterogeneous scientific research datasets, models or simulations, processes and research outcomes. It preserves associations between them, along with information about the people and organisations involved. Underpinning SEEK is the ISA infrastructure, a standard format for describing how individual experiments are aggregated into wider studies and investigations. Within SEEK, ISA has been extended and is configurable to allow the structure to be used outside of Biology. SEEK is incorporating semantic technology allowing sophisticated queries over the data, yet without getting in the way of your users. 
Type Of Technology Software 
Year Produced 2009 
Open Source License? Yes  
Impact The SEEK4Science platform was adopted by all of the ERANet SysMO I and II projects it was designed for and has gone on to be widely adopted in other programmes, notably the German Virtual Liver Network and its follow-on Liver Systems Medicine project, ERANet's ERASysBio+ projects, and ERASysAPP. The Platform is now developed under the FAIRDOM Initiative (funded by the DMMCore project partners, including the BBSRC) http://www.fair-dom.org, where it has been rebadged as FAIRDOM-SEEK. The software platform has been independently adopted by 50+ groups in Europe, Russia, South Africa, USA, and the UK. EU projects that adopt the platform include EmPowerPutida and Mycosynvac; national projects include the German Systems Medicine for Liver project, the de.NBI German Bioinformatics Network, and Norway's Digital Life programme. The platform has been adopted by the Environmental Molecular Sciences Laboratory, a large national scientific user facility at Pacific Northwest National Laboratory (PNNL) in Washington State, USA. Combined with the openBIS system, it is the data management platform for two UK Synthetic Biology Centres (SynBioChem and SynthSys). 90+ projects are currently registered on FAIRDOMHub.org Commons, a centralised public community instance of the SEEK4Science Platform. Work on the SysMO-DB project and the SEEK directly led to participation in the ESFRI Research Infrastructure ISBE Light - Infrastructure for Systems Biology Europe, and we have led the Data and model management work package setting out Europe's plans for this area. The SEEK4Science software and associated software, FAIRDOMHub Commons and its support services form a core pillar of the ISBE Light Interim phase. Commercially, SEEK was the prototype component of Eagle Genomics Ltd's eaglecore platform, was adopted by GeneXplain, and is currently being reviewed by 3 commercial organisations. 
The first report of the practical evaluation of SEEK and openBIS for biological data management in SynthSys (https://www.era.lib.ed.ac.uk/handle/1842/12236) recommended the platform. The SysMO-DB project also directly led to the DMMCore award (renamed the FAIRDOM project), a consortium of 4 EU funding councils to: establish a sustainable European Infrastructure to extend the network services to the wider European systems biology community; develop the necessary toolset and set up a data and model management platform for systems biology projects, building on SEEK and openBIS (SystemsX); and document and disseminate the outcomes and activities to funding agencies, projects and centres with the goal of establishing a sustainable business model for this infrastructure. FAIRDOM is funded by the UK through the BBSRC BB/M013189/1 DMMCore: Data and Model Management Core for ERASysAPP and Europe project. Wruck et al., Data management strategies for multinational large-scale systems biology projects, Briefings in Bioinformatics 2012, stated that, out of the box, it provides the most useful features for large scale biology projects. 
URL http://www.seek4science.org
 
Title Taverna Workflow Management System 2.x 
Description Scientific Workflow Management System and Toolsuite including: enactment engine, workbench, plugins and plugin framework, server, commandline tool, player, interaction service. 
Type Of Technology Software 
Year Produced 2008 
Open Source License? Yes  
Impact Widespread, global use throughout research labs and universities, and some commercial adoption. A daily audit reveals that over 1000 different users a day across the globe are using the Taverna Workbench to make workflows, and this does not include workflows executed through applications or portals on a server. In 2014 Taverna entered the Apache Foundation as Apache Taverna. Taverna 2.5.0 was the last non-Apache release from taverna.org.uk. Taverna 2.x has: more than 40000 downloads in total; more than 5000 downloads of Taverna 2.5 Workbench; nearly 3000 downloads of Taverna 2.5 Command Line Tool; more than 300 downloads of Taverna 2.5.4 Server. 
URL http://www.taverna.org.uk
 
Title The Software Ontology 
Description The Software Ontology (SWO) is a description of software used to store, manage and analyze data. Input to the SWO has come from beyond the life sciences, but its main focus is the life sciences. We used agile techniques to gather input for the SWO and keep engagement with our users. The result is an ontology that meets the needs of a broad range of users by describing software, its information processing tasks, data inputs and outputs, data formats, versions and so on. Recently, the SWO has incorporated EDAM, a vocabulary for describing data and related concepts in bioinformatics. 
Type Of Technology Software 
Year Produced 2013 
Open Source License? Yes  
Impact The SWO is currently being used to describe software used in multiple biomedical applications. 
URL http://theswo.sourceforge.net
 
Title Utopia Documents 
Description Utopia Documents v2.4 is a free PDF reader that connects the static content of scientific articles to the dynamic world of online content. It is available for Windows, Mac and Linux. Features include: direct link-outs from highlighted text to various data sources, scientific information, and search tools; article impact metrics (e.g. altmetrics), included when available to allow readers to view article data; a comments feature to allow researchers to make private comments or publicly discuss an article; export of tables into spreadsheets and a 'toggle' converting numerical tables into scatter plots; optimization for the life science, biomedical and biochemical disciplines. It relies on external services, accessed via plug-ins whose appearance in the interface is mediated by a 'semantic core' for processing and analyzing data. 
Type Of Technology Software 
Year Produced 2009 
Open Source License? Yes  
Impact - In excess of 50k downloads of software since 2009; estimated over 20k current users - Creation of spinout company (Lost Island Labs) to commercially exploit Utopia Documents. Software licensed to three major biotech/pharma companies. - Acquisition of rights to one of Utopia's core algorithms by CrossRef to convert publishers' legacy PDFs into XML - Inclusion of software in Debian Linux stable release (from 2014 onwards, beginning with 'jessie' release) 
URL http://utopiadocs.com/
 
Title myExperiment 
Description Public repository for retaining and sharing scientific workflows. Social sharing platform. myExperiment makes it easy to find, use and share scientific workflows and other Research Objects, and to build communities. 
Type Of Technology Webtool/Application 
Year Produced 2008 
Open Source License? Yes  
Impact First and arguably only public sharing platform for any workflow system. Over 1000 citations (combined, Google Scholar) of myExperiment papers. As of 12/03/2017 myExperiment had: 10633 registered members, 393 groups, 3894 workflows, 1238 files, 479 packs. Used by many EU projects (e.g. BioVeL, SCAPE, HELIO, VPH, IBISBA), US projects (e.g. FLOSS), companies (e.g. RapidMiner, KNIME) and open platforms (Galaxy) as their workflow repository. Over 22 workflow systems are represented in the repository. In 2018 it was proposed by the EU Research Infrastructure ELIXIR as the registry for the European Open Science Cloud. In 2019 four EU projects - BioExcel2, EOSC Life, Synthesys+ and IBISBA1.0 - pooled together to redevelop myExperiment into myExperiment 2.0 based on the FAIRDOM-SEEK (SEEK4Science) platform developed by the DMMCore and SysMO BBSRC awards. 
URL http://myexperiment.org