COpenPlantOmics (COPO): a Collaborative Bioinformatics Plant Science Platform

Lead Research Organisation: University of Oxford

Department Name: Engineering Science

Technical Summary

Accessibility to biological data has been hindered by lack of standards, lack of awareness of the benefits and pathways to releasing data that is described by those standards, and lack of services whereby data can be analysed, published and retrieved easily. Recently, there has been a large commitment by the BBSRC to push for open access data and publishing to further bioscience research in the UK. However, barriers still exist that prevent scientists from openly depositing their data and metadata, which comprise a lack of interoperability between metadata annotation services, data repositories, data analysis platforms and data publishing platforms. As such, plant scientists might not: be aware that the services exist; have the expertise to use them; see the value in properly describing their data.
This project aims to build COPO, the software infrastructure required to reach the level of interoperability that plant researchers need to describe their data using community-recognised ontologies, exchange data seamlessly in both directions with relevant repositories, and then publish these data for open access. COPO will manage the hardware infrastructure at TGAC to deliver a consistent, robust staging area and database that will support uniquely accessioned artefacts representing the corpus of data and metadata a user wants to expose. The resulting marked-up datasets, processed and published using COPO, will offer greater potential for integrative analysis using existing tools such as iPlant and Galaxy.
New Application Programming Interfaces (APIs) will interconnect existing tools and services, and by developing new RESTful user interfaces that wrap up these APIs, COPO will be a single point-of-entry for plant researchers to disseminate their data all the way from generation to publication. By federating the TGAC iRODS data grid system with others, e.g. Texas Advanced Computing Center's iPlant installation, access to worldwide analytical infrastructure and data will be facilitated.

Planned Impact

Academic, Economic and Commercial Impacts
With the renewed interest and push from all areas of bioscience to promote publicly available research, the COPO project will be a pioneering national and international effort to facilitate sharing of all aspects of plant research with the public. COPO aims to be the vehicle to bring together the tools required to harmonise open plant omics research. This sector has obvious ties with industry. Public domain omics-based bioscience is a relevant and important input into industry's internal research and discovery activities. To make such bioscience data truly reusable and ensure scientific robustness, it must be uniformly annotated. Uniform annotation allows integration through equivalence of terminology, increases efficiency in data production and re-use, and allows correct interpretation by means of the context provided by the metadata. A collaborative platform for frictionless bioinformatics built with and for the academic and industrial community is long overdue. Alongside data processing, industry also works on finding solutions for integration and management of large 'omics data sets, e.g. efforts like the Pistoia Alliance. Together with COPO industry partners (Eagle Genomics) we will develop use-cases for the platform in industry, propose acceptance criteria required for commercial use, supply technical advice/support on meeting acceptance criteria, evaluate the platform on 3rd party infrastructure, and maximise knowledge exchange and commercialisation.

COPO and the standards community
Expertise and knowledge gained throughout the lifetime of the project and beyond will be disseminated through a variety of channels. The presence of a direct link with the plant science community (through GARNet, UK Plant Sciences Federation (UKPSF)) is key to the success and adoption of the platform and associated standards. The project will have a continuous dialogue, through face-to-face events as well as online tools and social media, between those working on the platform and the plant bioscience community. The several letters of support show a clear interest in working together, using and adopting a platform that implicitly confers standards compliance. COPO will provide a solution to overcome the challenges in standards fragmentation by (i) fostering development, acceptance and implementation of reporting standards that are immediately suitable for plant research, and (ii) limiting the range and variability of standards. This will have a direct impact on the development and maintenance costs for commercial and academic software developers of standards-compliant products.

Societal impacts
Historically there has been reluctance to adopt some of the standards and open-data principles in the plant bioscience community, especially in the field of food sustainability and security, so openness and transparency in these areas are vital to continue improving public perception. The presentation of the research data will play a key role in opening the dialogue with the general public and will contribute to the development of stronger links with sectors in society (such as school teachers) that are less familiar with the scientific activities in plant research and the beneficial impact this has on their lives. It is widely recognised that the shortage of expertise and skill in biomathematics and informatics across the world is a major risk for the future development of key areas in life sciences. The objectives of this proposal will help to attract talented staff to work with the COPO partners, and offer alternative career paths.

Funded Value:

£424,036

Funded Period:

Jan 15 - Dec 18

Funder:

BBSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

BB/L024101/1

Principal Investigator:

Susanna Sansone

Research Subject:

Tools, technologies & methods (99%)

Research Topic:

eScience (99%)

Organisations

People	ORCID iD
Susanna Sansone (Principal Investigator)	http://orcid.org/0000-0001-5306-5690
Philippe Rocca-Serra (Researcher Co-Investigator)
Alejandra Noemi Gonzalez-Beltran (Researcher Co-Investigator)

Publications

Amann RI (2019) Toward unrestricted use of public genomic data. in Science (New York, N.Y.)

Anthony Etuk (2016) COPO - bridging the gap from data to publication in plant science in F1000Research

Arnaud E (2020) The Ontologies Community of Practice: A CGIAR Initiative for Big Data in Agrifood Systems. in Patterns (New York, N.Y.)

Ashrafian H (2021) Metabolomics: The Stethoscope for the Twenty-First Century. in Medical principles and practice : international journal of the Kuwait University, Health Science Centre

Bandrowski A (2016) The Ontology for Biomedical Investigations. in PloS one

Batista D (2022) Machine actionable metadata models in Scientific Data

Charbonneau A (2022) Making Common Fund data more findable: catalyzing a data ecosystem in GigaScience

Chen X (2018) DataMed - an open source discovery index for finding biomedical datasets. in Journal of the American Medical Informatics Association : JAMIA

Cwiek-Kupczynska H (2020) Semantic concept schema of the linear mixed model of experimental observations. in Scientific data

Cwiek-Kupczynska H (2016) Measures for interoperability of phenotypic data: minimum information requirements and formatting. in Plant methods

Key Findings
Impact Summary
Policy Influence
Further Funding
Research Databases and Models
Research Tools and Methods
Collaboration
Software and Technical Products
Engagement Activities


Description	COPO is a portal for plant scientists to describe, store and retrieve data more easily, using community standards and public repositories that enable the open sharing of results. COPO is now in production, helping users through the data brokering process, as well as gathering feedback regarding improvements and bugs.
Exploitation Route	The ISA software suite, partly used by COPO, is open source and reusable in other domains outside plant science. A list of user communities is maintained here: http://www.isacommons.org/
Sectors	Agriculture, Food and Drink,Digital/Communication/Information Technologies (including Software),Education
URL	http://copo-project.org/


Description	The COPO infrastructure will have an impact on, and continue to increase the effectiveness of, data sharing and reuse.
Sector	Agriculture, Food and Drink,Digital/Communication/Information Technologies (including Software)
Impact Types	Policy & public services


Description	Advised Springer Nature on the data policy
Geographic Reach	Multiple continents/international
Policy Influence Type	Membership of a guideline committee
URL	http://www.springernature.com/gp/group/data-policy/


Description	Co-authored a review commissioned by the Wellcome Trust focusing on interoperability standards for digital research outputs
Geographic Reach	Multiple continents/international
Policy Influence Type	Membership of a guideline committee
URL	https://figshare.com/articles/Review_Interoperability_standards/4055496


Description	FAIRsharing is one of the elements mentioned in the "Framework for Discipline-specific Research Data Management" report by Science Europe.
Geographic Reach	Europe
Policy Influence Type	Influenced training of practitioners or researchers
URL	https://www.scienceeurope.org/wp-content/uploads/2018/01/SE_Guidance_Document_RDMPs.pdf


Description	FAIRsharing is one of the resources recommended by the EU EOSC "Turning FAIR into Reality" report.
Geographic Reach	Europe
Policy Influence Type	Influenced training of practitioners or researchers


Description	FAIRsharing is one of the resources recommended by the UK Jisc "FAIR in Practice" report.
Geographic Reach	National
Policy Influence Type	Influenced training of practitioners or researchers


Description	FAIRsharing, FAIR Cookbook and ISA resources are core to ELIXIR data management services
Geographic Reach	Europe
Policy Influence Type	Influenced training of practitioners or researchers
URL	https://elixir-europe.org/sites/default/files/documents/annual-report-2020.pdf


Description	AgroServ
Amount	€ 15,000,000 (EUR)
Funding ID	101058020
Organisation	European Commission
Sector	Public
Country	European Union (EU)
Start	09/2022
End	08/2027


Description	EC - PHC-32-2014 - MultiMot
Amount	€ 100,000 (EUR)
Funding ID	H2020-EU.3.1, 634107
Organisation	European Commission
Department	Horizon 2020
Sector	Public
Country	European Union (EU)
Start	08/2015
End	07/2018


Description	EC H2020 - INFRADEV-3-2015 - ELIXIR EXCELERATE
Amount	€ 240,000 (EUR)
Organisation	European Commission
Department	Horizon 2020
Sector	Public
Country	European Union (EU)
Start	09/2015
End	08/2019


Description	EINFRA-2015-1 - PhenoMeNal
Amount	€ 600,000 (EUR)
Funding ID	H2020-EU.1.4.1.3, 654241
Organisation	European Commission
Department	Horizon 2020
Sector	Public
Country	European Union (EU)
Start	09/2015
End	08/2018


Description	FAIRplus
Amount	£3,996,150 (GBP)
Funding ID	802750
Organisation	European Commission
Department	Innovative Medicines Initiative (IMI)
Sector	Public
Country	Belgium
Start	01/2019
End	01/2022


Description	ISA-InterMine: accelerating and rewarding data sharing
Amount	£1,174,660 (GBP)
Funding ID	208381/A/17/Z
Organisation	Wellcome Trust
Sector	Charity/Non Profit
Country	United Kingdom
Start	08/2018
End	07/2021


Title	Continued improvements to the ISA toolkit
Description	Started in 2003 and first released in 2007, the ISA tools have been developed over time by the Oxford team and collaborators or directly contributed by partnering contributors, via the ISA Commons collaborative community. Short description of the developments and achievements of the resource over the last year: • Awarded Wellcome Trust funds (2018-2021), as a collaborative project with the University of Cambridge's InterMine team to link the two resources and reward researchers for annotating and publishing FAIR data; also, ISA is embedded in two ELIXIR Implementation Studies, one on plant-focused data validation and one on metabolomics. • With the uptake of ISA-Galaxy tools (https://github.com/ISA-tools/isatools-galaxy) and integration with the Galaxy Framework, ISA has reached a major milestone by showcasing how prospective data management can be done, demonstrating a full deposition workflow to MetaboLights and creating training material (10.7490/f1000research.1115757.1). • Jupyter notebooks (https://github.com/ISA-tools/dtp-isa-exercises) have been developed as teaching material to showcase the use of the ISA-API in various contexts for undergraduate and postgraduate courses on data readiness.
Type Of Material	Improvements to research infrastructure
Year Produced	2018
Provided To Others?	Yes
Impact	Community use and impact are tracked via the ISA Commons, which currently has over 40 international groups, projects, and organizations that use and contribute to the development of components of the ISA metadata tracking framework. The ISA user base therefore ranges from hundreds to thousands of researchers from increasingly diverse domains (including -omics, cell-based research, biomedical nanotechnology, plant phenotyping, toxicology, biodiversity, metagenomics, stem cell research, systems biology, neuroscience, microbial science and immunology), and extends beyond researchers, curators, other resource developers and service providers to include journals. For example, ISA is used by Oxford University Press's GigaScience and underpins Springer Nature's Scientific Data data journal, supporting intelligent data sharing and credit; ISA is used to describe the experiments and to provide browse and search functionality for Scientific Data's content (http://scientificdata.isa-explorer.org). The ISA framework is currently embedded in a number of UK-, EC-, NIH- and pharma-funded infrastructure and research projects; here are exemplars from the ELIXIR UK Node and other Nodes: (i) EMBL-EBI MetaboLights' new web-based submission system relies on the ISA-JSON format to build web components and on the ISA-API to validate and convert experiments represented as ISA objects. (ii) The BBSRC-funded COPO infrastructure relies on the ISA API, ISA-JSON serialization and the ISA configurations to support plant-based molecular profiling experiments; it also used the ISAconverter to deposit to the ENA database. (iii) ELIXIR-UK Node partners, the University of Birmingham and Imperial College London, use the ISA Galaxy Tools, ISA-API and ISA validator, as part of their work in the UK Phenome Centre, to collect data prospectively and to organise public deposition to repositories. (iv) The ELIXIR Plant Community's MIAPPE standards and BrAPI rely on the availability of ISA parsers and validation tools in the context of data validation programs.
URL	http://isa-tools.org


Title	Continued improvements to the ISA toolkit and the new Datascriptor component
Description	Started in 2003 and first released in 2007, the ISA tools (http://isa-tools.org) have been developed over time by the Oxford team and collaborators or directly contributed by partnering contributors, via the ISA Commons collaborative community (https://www.isacommons.org). Key work over the last year is the development of a new component, the Datascriptor (https://datascriptor.org), as part of the Wellcome Trust award (2018-2021), a collaborative project with the University of Cambridge's InterMine team. Leveraging our experience and links with the communities, we are designing an open-source web-based tool, part of an ecosystem of existing annotation and authoring systems, to help researchers use community standards to describe their (meta)data at the source and capitalize on that effort to accelerate the creation of a data article. In addition, major advances have been made to the ISA API, also working with the ELIXIR Plant and Metabolomics communities.
Type Of Material	Improvements to research infrastructure
Year Produced	2019
Provided To Others?	Yes
Impact	Community use and impact are tracked via the ISA Commons, which currently has over 40 international groups, projects, and organizations that use and contribute to the development of components of the ISA metadata tracking framework. The ISA user base therefore ranges from hundreds to thousands of researchers from increasingly diverse domains (including -omics, cell-based research, biomedical nanotechnology, plant phenotyping, toxicology, biodiversity, metagenomics, stem cell research, systems biology, neuroscience, microbial science and immunology), and extends beyond researchers, curators, other resource developers and service providers to include journals. For example, ISA is used by Oxford University Press's GigaScience and underpins Springer Nature's Scientific Data data journal, supporting intelligent data sharing and credit; ISA is used to describe the experiments and to provide browse and search functionality for Scientific Data's content (http://scientificdata.isa-explorer.org). The ISA framework is currently embedded in a number of UK-, EC-, NIH- and pharma-funded infrastructure and research projects; here are exemplars from the ELIXIR UK Node and other Nodes: (i) EMBL-EBI MetaboLights' new web-based submission system relies on the ISA-JSON format to build web components and on the ISA-API to validate and convert experiments represented as ISA objects. (ii) The BBSRC-funded COPO infrastructure relies on the ISA API, ISA-JSON serialization and the ISA configurations to support plant-based molecular profiling experiments; it also used the ISAconverter to deposit to the ENA database. (iii) ELIXIR-UK Node partners, the University of Birmingham and Imperial College London, use the ISA Galaxy Tools, ISA-API and ISA validator, as part of their work in the UK Phenome Centre, to collect data prospectively and to organise public deposition to repositories. (iv) The ELIXIR Plant Community's MIAPPE standards and BrAPI rely on the availability of ISA parsers and validation tools in the context of data validation programs.
URL	https://datascriptor.org


Title	Continued improvements to the ISA toolkit: the new GraphQL interface and RDF representation of ISA
Description	The open source ISA framework and tools help to manage an increasingly diverse set of life science, environmental and biomedical experiments that employ one or a combination of technologies. Started in 2003 and first released in 2007, the ISA tools (http://isa-tools.org) have been developed over time by the Oxford team and collaborators or directly contributed by partnering contributors, via the ISA Commons collaborative community (https://www.isacommons.org). Key work over the last year is the development of two new components, a GraphQL interface to query ISA documents and an RDF representation of ISA in obo, sdo or wikidata, as part of the Wellcome Trust award (2018-2021), a collaborative project with the University of Cambridge's InterMine team.
Type Of Material	Improvements to research infrastructure
Year Produced	2019
Provided To Others?	Yes
Impact	Community use and impact are tracked via the ISA Commons, which currently has over 50 international groups, projects, and organizations that use and contribute to the development of components of the ISA metadata tracking framework. The ISA user base therefore ranges from hundreds to thousands of researchers from increasingly diverse domains (including -omics, cell-based research, biomedical nanotechnology, plant phenotyping, toxicology, biodiversity, metagenomics, stem cell research, systems biology, neuroscience, microbial science and immunology), and extends beyond researchers, curators, other resource developers and service providers to include journals. The ISA framework is currently embedded in a number of UK-, EC-, NIH- and pharma-funded infrastructure and research projects; here are exemplars from the ELIXIR UK Node and other Nodes: (i) EMBL-EBI MetaboLights' new web-based submission system relies on the ISA-JSON format to build web components and on the ISA-API to validate and convert experiments represented as ISA objects. (ii) The BBSRC-funded COPO infrastructure relies on the ISA API, ISA-JSON serialization and the ISA configurations to support plant-based molecular profiling experiments; it also used the ISAconverter to deposit to the ENA database. (iii) ELIXIR-UK Node partners, the University of Birmingham and Imperial College London, use the ISA Galaxy Tools, ISA-API and ISA validator, as part of their work in the UK Phenome Centre, to collect data prospectively and to organise public deposition to repositories. (iv) The ELIXIR Plant Community's MIAPPE standards and BrAPI rely on the availability of ISA parsers and validation tools in the context of data validation programs.
URL	https://github.com/ISA-tools/isa-api


Title	ISA Toolkit new API
Description	ISA-API v0.14.2 has been released, with new features and fixes: GraphQL, JSON-LD/RDF, SQL, I/O optimization
Type Of Material	Improvements to research infrastructure
Year Produced	2023
Provided To Others?	Yes
Impact	Better use of the ISA tools by other developers.
URL	https://github.com/ISA-tools/isa-api/releases/tag/v0.14.2


Title	MIAPPE specification and tools
Description	Minimum Information About a Plant Phenotyping Experiment is an open, community driven project to harmonize data from plant phenotyping experiments. MIAPPE specification comprises both a conceptual checklist of metadata required to adequately describe a plant phenotyping experiment, and software to validate, store and disseminate MIAPPE-compliant data.
Type Of Material	Improvements to research infrastructure
Year Produced	2017
Provided To Others?	Yes
Impact	MIAPPE is a logical standard, but there are specific implementations of tools designed to support its use and application, for example in the ISA-tools framework. We are working with the developers of the Plant Breeding API (BrAPI) to ensure the compliance of BrAPI with the MIAPPE standard, and to coordinate future developments.
URL	http://www.miappe.org/
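
The MIAPPE entry above describes a checklist of required metadata together with software to validate compliant data. As a rough illustration of checklist-style validation only, the sketch below reports which required fields are missing or blank in a metadata record. The field names in REQUIRED_FIELDS and the sample records are hypothetical placeholders chosen for this example; they are not the actual MIAPPE checklist, and the function is not part of any MIAPPE or ISA tool.

```python
from typing import Dict, List, Optional

# Hypothetical required fields for illustration only.
# These are NOT the real MIAPPE checklist items.
REQUIRED_FIELDS = (
    "investigation_title",
    "study_start_date",
    "biological_material_id",
    "observed_variable",
)


def missing_fields(record: Dict[str, Optional[str]]) -> List[str]:
    """Return the required field names that are absent, None, or blank."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# A made-up record that satisfies every hypothetical requirement.
complete = {
    "investigation_title": "Drought response in wheat",
    "study_start_date": "2017-04-01",
    "biological_material_id": "ACC-0001",
    "observed_variable": "plant height",
}

# A made-up record with one blank value and two absent fields.
incomplete = {
    "investigation_title": "Drought response in wheat",
    "study_start_date": "   ",
}

print(missing_fields(complete))
print(missing_fields(incomplete))
```

Returning the list of missing names, instead of a single pass/fail flag, lets a submission tool tell the user exactly what still needs to be filled in.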


Title	Supporting data for "ISA API: An open platform for interoperable life science experimental metadata"
Description	The Investigation/Study/Assay (ISA) Metadata Framework is an established and widely used set of open-source community specifications and software tools for enabling discovery, exchange and publication of metadata from experiments in the life sciences. The original ISA software suite provided a set of user-facing Java tools for creating and manipulating the information structured in ISA-Tab - a now widely used tabular format. To make the ISA framework more accessible to machines and enable programmatic manipulation of experiment metadata, a JSON serialization ISA-JSON was developed. In this work, we present the ISA API, a Python library for the creation, editing, parsing, and validating of ISA-Tab and ISA-JSON formats by using a common data model engineered as Python object classes. We describe the ISA API feature set, early adopters and its growing user community. The ISA API provides users with rich programmatic metadata handling functionality to support automation, a common interface and an interoperable medium between the two ISA formats, as well as with other life science data formats required for depositing data in public databases.
Type Of Material	Database/Collection of data
Year Produced	2021
Provided To Others?	Yes
Impact	Community use and impact are tracked via the ISA Commons, which currently has over 50 international groups, projects, and organizations that use and contribute to the development of components of the ISA metadata tracking framework. The ISA user base therefore ranges from hundreds to thousands of researchers from increasingly diverse domains (including -omics, cell-based research, biomedical nanotechnology, plant phenotyping, toxicology, biodiversity, metagenomics, stem cell research, systems biology, neuroscience, microbial science and immunology), and extends beyond researchers, curators, other resource developers and service providers to include journals. The ISA framework is currently embedded in a number of UK-, EC-, NIH- and pharma-funded infrastructure and research projects; here are exemplars from the ELIXIR UK Node and other Nodes: (i) EMBL-EBI MetaboLights' new web-based submission system relies on the ISA-JSON format to build web components and on the ISA-API to validate and convert experiments represented as ISA objects. (ii) The BBSRC-funded COPO infrastructure relies on the ISA API, ISA-JSON serialization and the ISA configurations to support plant-based molecular profiling experiments; it also used the ISAconverter to deposit to the ENA database. (iii) ELIXIR-UK Node partners, the University of Birmingham and Imperial College London, use the ISA Galaxy Tools, ISA-API and ISA validator, as part of their work in the UK Phenome Centre, to collect data prospectively and to organise public deposition to repositories. (iv) The ELIXIR Plant Community's MIAPPE standards and BrAPI rely on the availability of ISA parsers and validation tools in the context of data validation programs.
URL	http://gigadb.org/dataset/100907
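
The ISA API entry above describes a common data model built as Python object classes, with JSON as one serialization. The sketch below illustrates that general idea using only the Python standard library: a nested Investigation/Study/Assay object model that round-trips through JSON. It is a simplified, hypothetical model written for this example. The class fields, the helper functions to_json and from_json, and the sample values are assumptions; they do not reproduce the real ISA API classes or the ISA-JSON schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Assay:
    # Hypothetical, simplified fields; not the real ISA-JSON schema.
    filename: str
    measurement_type: str
    technology_type: str


@dataclass
class Study:
    identifier: str
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    identifier: str
    title: str
    studies: List[Study] = field(default_factory=list)


def to_json(investigation: Investigation) -> str:
    """Serialize the nested object model to a JSON string."""
    return json.dumps(asdict(investigation), indent=2, sort_keys=True)


def from_json(text: str) -> Investigation:
    """Rebuild the object model from JSON produced by to_json."""
    raw = json.loads(text)
    studies = [
        Study(
            identifier=s["identifier"],
            title=s["title"],
            assays=[Assay(**a) for a in s["assays"]],
        )
        for s in raw["studies"]
    ]
    return Investigation(raw["identifier"], raw["title"], studies)


# Made-up example values for illustration.
example = Investigation(
    identifier="INV-1",
    title="Hypothetical wheat drought investigation",
    studies=[
        Study(
            identifier="STU-1",
            title="Leaf transcriptome under drought",
            assays=[
                Assay("a_rnaseq.txt", "transcription profiling",
                      "nucleotide sequencing")
            ],
        )
    ],
)

round_tripped = from_json(to_json(example))
print(round_tripped == example)
```

Keeping one in-memory model and treating each file format as a serialization of it is what allows a single set of objects to be written out in more than one format.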


Description	ELIXIR Interoperability Platform and FAIRsharing
Organisation	ELIXIR
Country	United Kingdom
Sector	Charity/Non Profit
PI Contribution	Run by Prof. Sansone's group, FAIRsharing (https://fairsharing.org) is a resource on standards, repositories, and data policies endorsed by a growing number of stakeholder communities, including major publishers, funders, libraries and FAIR-supporting organizations. FAIRsharing is part of the ELIXIR Recommended Interoperability Resources (RIRs) to facilitate interoperability and reusability of life science data and support the principles of FAIR data management.
Collaborator Contribution	The ELIXIR Recommended Interoperability Resources have been selected by an external panel of reviewers, based on the selection criteria published in the Call for RIR application, which measure how they facilitate scientific research and how they improve FAIRness of life science data.
Impact	FAIRsharing is and will continue to be used by and further linked to other ELIXIR registries and services.
Start Year	2018


Description	ELIXIR Interoperability Platform and ISA
Organisation	ELIXIR
Country	United Kingdom
Sector	Charity/Non Profit
PI Contribution	ISA is part of the ELIXIR Recommended Interoperability Resources (RIRs) to facilitate interoperability and reusability of life science data and support the principles of FAIR data management.
Collaborator Contribution	The ELIXIR Recommended Interoperability Resources have been selected by an external panel of reviewers, based on the selection criteria published in the Call for RIR application, which measure how they facilitate scientific research and how they improve FAIRness of life science data.
Impact	ISA is and will continue to be used by and further developed with ELIXIR communities, especially with Plant and Metabolomics use cases.
Start Year	2018


Description	ELIXIR Metabolomics Community
Organisation	ELIXIR
Department	ELIXIR UK
Country	United Kingdom
Sector	Charity/Non Profit
PI Contribution	My team has contributed ISA-related work to the ELIXIR Metabolomics use case, activities and reports.
Collaborator Contribution	We have gained more visibility for the ISA work and now ISA-Tab is a formal format used by the Galaxy analysis toolkit for metabolomics applications.
Impact	The ISA framework is the basis for the metadata standards used by this ELIXIR Metabolomics Community, and the tools are embedded in the EBI MetaboLights databases, as well as in other international metabolomics resources.
Start Year	2017


Description	ELIXIR Metabolomics Community
Organisation	ELIXIR
Country	United Kingdom
Sector	Charity/Non Profit
PI Contribution	My team has contributed ISA-related work to the ELIXIR Metabolomics use case, activities and reports.
Collaborator Contribution	We have gained more visibility for the ISA work and now ISA-Tab is a formal format used by the Galaxy analysis toolkit for metabolomics applications.
Impact	The ISA framework is the basis for the metadata standards used by this ELIXIR Metabolomics Community, and the tools are embedded in the EBI MetaboLights databases, as well as in other international metabolomics resources.
Start Year	2017


Description	ELIXIR Plant Use Case
Organisation	ELIXIR
Department	ELIXIR UK
Country	United Kingdom
Sector	Charity/Non Profit
PI Contribution	My team has contributed ISA-related work to the ELIXIR Plant Science use case, work and report.
Collaborator Contribution	We have gained more visibility for the ISA work and COPO activities.
Impact	ISA is used by BrAPI and there is an ISA implementation of the MIAPPE specification.
Start Year	2016


Description	ELIXIR Plant Use Case
Organisation	ELIXIR
Country	United Kingdom
Sector	Charity/Non Profit
PI Contribution	My team has contributed ISA-related work to the ELIXIR Plant Science use case, work and report.
Collaborator Contribution	We have gained more visibility for the ISA work and COPO activities.
Impact	ISA is used by BrAPI and there is an ISA implementation of the MIAPPE specification.
Start Year	2016


Description	ELIXIR UK Node
Organisation	Earlham Institute
Country	United Kingdom
Sector	Academic/University
PI Contribution	Help create the ELIXIR UK Node
Collaborator Contribution	Contribute to the creation of the ELIXIR UK Node
Impact	Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year	2012


Description	ELIXIR UK Node
Organisation	Heriot-Watt University
Country	United Kingdom
Sector	Academic/University
PI Contribution	Help create the ELIXIR UK Node
Collaborator Contribution	Contribute to the creation of the ELIXIR UK Node
Impact	Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year	2012


Description	ELIXIR UK Node
Organisation	Imperial College London
Country	United Kingdom
Sector	Academic/University
PI Contribution	Help create the ELIXIR UK Node
Collaborator Contribution	Contribute to the creation of the ELIXIR UK Node
Impact	Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year	2012


Description	ELIXIR UK Node
Organisation	Newcastle University
Country	United Kingdom
Sector	Academic/University
PI Contribution	Help create the ELIXIR UK Node
Collaborator Contribution	Contribute to the creation of the ELIXIR UK Node
Impact	Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year	2012


Description	ELIXIR UK Node
Organisation	Rothamsted Research
Country	United Kingdom
Sector	Academic/University
PI Contribution	Help create the ELIXIR UK Node
Collaborator Contribution	Contribute to the creation of the ELIXIR UK Node
Impact	Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year	2012


Description	ELIXIR UK Node
Organisation	University College London
Country	United Kingdom
Sector	Academic/University
PI Contribution	Help create the ELIXIR UK Node
Collaborator Contribution	Contribute to the creation of the ELIXIR UK Node
Impact	Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year	2012


Description	ELIXIR UK Node
Organisation	University of Birmingham
Country	United Kingdom
Sector	Academic/University
PI Contribution	Help create the ELIXIR UK Node
Collaborator Contribution	Contribute to the creation of the ELIXIR UK Node
Impact	Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year	2012


Description	ELIXIR UK Node
Organisation	University of Cambridge
Country	United Kingdom
Sector	Academic/University
PI Contribution	Help create the ELIXIR UK Node
Collaborator Contribution	Contribute to the creation of the ELIXIR UK Node
Impact	Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year	2012


Description	ELIXIR UK Node
Organisation	University of Dundee
Country	United Kingdom
Sector	Academic/University
PI Contribution	Help create the ELIXIR UK Node
Collaborator Contribution	Contribute to the creation of the ELIXIR UK Node
Impact	Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year	2012


Description	ELIXIR UK Node
Organisation	University of Edinburgh
Department	Edinburgh Genomics
Country	United Kingdom
Sector	Academic/University
PI Contribution	Help create the ELIXIR UK Node
Collaborator Contribution	Contribute to the creation of the ELIXIR UK Node
Impact	Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year	2012


Description	ELIXIR UK Node
Organisation	University of Edinburgh
Country	United Kingdom
Sector	Academic/University
PI Contribution	Help create the ELIXIR UK Node
Collaborator Contribution	Contribute to the creation of the ELIXIR UK Node
Impact	Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year	2012


Description	ELIXIR UK Node
Organisation	University of Liverpool
Country	United Kingdom
Sector	Academic/University
PI Contribution	Help create the ELIXIR UK Node
Collaborator Contribution	Contribute to the creation of the ELIXIR UK Node
Impact	Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year	2012


Description	ELIXIR UK Node
Organisation	University of Manchester
Country	United Kingdom
Sector	Academic/University
PI Contribution	Help create the ELIXIR UK Node
Collaborator Contribution	Contribute to the creation of the ELIXIR UK Node
Impact	Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year	2012


Description	ELIXIR UK Node
Organisation	University of Oxford
Country	United Kingdom
Sector	Academic/University
PI Contribution	Help create the ELIXIR UK Node
Collaborator Contribution	Contribute to the creation of the ELIXIR UK Node
Impact	Creation of a virtual entity that represents UK strengths in bioinformatics and provides a route for UK bioinformatics resources to participate in, and benefit from, ELIXIR. The Node is currently being formalized.
Start Year	2012


Description	Hanna Cwiek - 2 month visit : MIAPPE and ISA
Organisation	Polish Academy of Sciences
Department	Institute of Plant Genetics
Country	Poland
Sector	Academic/University
PI Contribution	Members of my team, namely Philippe Rocca-Serra and Alejandra Gonzalez-Beltran, have assisted Hanna in her ISA-related work.
Collaborator Contribution	Dr Hanna Cwiek from the Poznan Institute of Genetic Research in Poland (in Pawel Krajewski's team) visited my team to work on ISA and MIAPPE, helping to refine ISA tools relevant to plant science and COPO activities.
Impact	Possible paper on the work done
Start Year	2017


Description	ISA Commons
Organisation	ISA Commons
Sector	Charity/Non Profit
PI Contribution	We have helped many users, service providers and other developers to implement one or more components of the ISA software suite at their site to fit their data needs.
Collaborator Contribution	They have helped us to refine the ISA software suite, filling gaps and tuning it for certain data types.
Impact	The ISA Commons is a growing ecosystem of institute-based (e.g. USA NASA GeneLab Data Repository) and global repositories (e.g. EMBL-EBI MetaboLights), as well as data-driven journals (e.g. Springer Nature Scientific Data), that use the ISA formats and/or are powered by one or more components of the ISA software suite. It also includes grass-roots standards groups that build on the ISA data model and formats. The sustainability and maintenance of the ISA data model, formats, and tools are guided by the ISA Working Group.
Start Year	2010


Description	Integration of COPO and CGCore Schemas and Associated Repositories
Organisation	CGIAR
Country	France
Sector	Charity/Non Profit
PI Contribution	We have developed a proof-of-concept platform to streamline metadata attribution and dataset deposition into CGIAR repositories using the BBSRC-funded COPO software. Drs Etuk and Shaw, two Research Software Engineers in the Davey group at Earlham Institute and the original core developers, have implemented various new features in COPO to allow CGIAR Data Managers to harmonise and streamline the submission of CG-relevant metadata and data into the CG digital data repositories. All software and infrastructure are hosted within the CyVerse UK cloud. We have: - Implemented support for CG Core v.2.0 (http://repo.mel.cgiar.org/handle/20.500.11766/4764) metadata annotation of various data types, including publications, produced at the CGIAR institutes via the existing COPO wizard system. - Implemented support for submission of annotated objects to institutional instances of the following repositories: dSpace (https://www.duraspace.org/dspace/), CKAN (https://ckan.org/) and Dataverse (https://dataverse.org/). - Designed and implemented a mechanism within COPO that controls which users can submit to which repositories. - Implemented support for the annotation of variables within data sets (i.e. column headings, experiment condition descriptors etc.) with terms and URIs from ontologies or controlled vocabularies/trait dictionaries (AGROVOC and GACS).
Collaborator Contribution	CGIAR have provided coordination contributions with key members in the CG Centres to gather feedback on developed elements, as well as provided funds to allow a core CGCore metadata schema developer to travel to EI and work with Drs Etuk and Shaw to improve the CGCore schema.
Impact	This collaboration has seen rapid development of key functionality in the COPO platform to support CG centre Data Managers. This has required technical skills to develop the software, biocuration expertise provided by CGIAR to improve and refine the CGCore metadata schema, ontology expertise from the Bioversity team in Montpellier, and coordination expertise from Dr Davey (EI) and Medha Devare (CGIAR). Software and Technical Products (Webtool/Application - Collaborative Open Plant Omics (COPO) (2017)): All software code developed is open source and can be found within the COPO Github repository: https://github.com/collaborative-open-plant-omics/COPO
Start Year	2018
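
The annotation of dataset variables described in this record (column headings mapped to terms and URIs from a controlled vocabulary) can be illustrated with a minimal Python sketch. It is not COPO code: the `OntologyTerm` class, the `annotate_columns` function, the `VOCABULARY` table and the `example.org` URIs are all hypothetical placeholders, and a real deployment would look terms up in AGROVOC or GACS rather than in a hard-coded dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class OntologyTerm:
    """A vocabulary term: human-readable label, URI and source vocabulary."""
    label: str
    uri: str
    source: str


# Hypothetical lookup table with placeholder URIs (assumption, not real AGROVOC/GACS terms).
VOCABULARY: Dict[str, OntologyTerm] = {
    "plant height": OntologyTerm("plant height", "http://example.org/terms/0001", "EXAMPLE"),
    "grain yield": OntologyTerm("grain yield", "http://example.org/terms/0002", "EXAMPLE"),
}


def annotate_columns(headings: List[str]) -> Dict[str, Optional[OntologyTerm]]:
    """Map each column heading to a vocabulary term, or None if no term matches."""
    annotations: Dict[str, Optional[OntologyTerm]] = {}
    for heading in headings:
        # Normalise the heading so "Plant_Height" and "plant height" match the same term.
        key = heading.strip().lower().replace("_", " ")
        annotations[heading] = VOCABULARY.get(key)
    return annotations


annotations = annotate_columns(["Plant_Height", "grain yield", "plot_id"])
```

Headings without a matching term are kept with a value of `None`, so a curator could review them rather than have them silently dropped.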


Title	Collaborative Open Plant Omics (COPO)
Description	COPO streamlines the process of data deposition to public repositories by hiding much of the complexity of metadata capture and data management from the end-user. The ISA infrastructure (www.isa-tools.org) is leveraged to provide the interoperability between metadata formats required for seamless deposition to repositories. COPO facilitates the links to data analysis platforms such as CyVerse UK and Galaxy. Logical groupings of artefacts (e.g. PDFs, raw data, contextual supplementary information) relating to a body of work are stored in COPO collections and represented by common standards, which are publicly searchable. Bundles of multiple data objects themselves can then be deposited directly into public repositories through COPO interfaces. This improvement output represents the beta release of the COPO platform in 2017.
Type Of Technology	Webtool/Application
Year Produced	2017
Open Source License?	Yes
Impact	COPO has been added to the ELIXIR-UK roadmap for ELIXIR core data services, and is currently being used by EI and JIC researchers to deposit real, large scale sequencing datasets into the European Nucleotide Archive. COPO is also being investigated as a potential data entry tool for the CGIAR Big Data project, and this will be explored in a joint EAGER submission with CIMMYT. COPO has also been selected to act as one of the data ingestion pipelines for data arising from the Designing Future Wheat programme, depositing open data into the Grassroots repository. COPO is also being included in grant submissions to assist vertebrate and wheat communities in effective metadata management. COPO runs within the CyVerse UK National Capability infrastructure.
URL	https://copo-project.org


Title	Datascriptor
Description	From structured dataset to data article. Leveraging our experience and links with the communities, we are now designing an open-source web-based tool - part of an ecosystem of existing annotation and authoring systems - to help researchers to use community standards to describe their (meta)data at the source, and capitalize on their effort to accelerate the creation of a data article. The user will be guided to provide (semi)structured descriptions of the experimental design, and of the post-processed data, to generate, respectively, the Methods and a set of statements to populate the Results section of a manuscript. Datascriptor will work: (i) as a stand-alone tool - for anyone to use - implementing generic metadata models, such as W3C Data Catalog vocabulary; and (ii) as a component of the ISA Tools - for its user communities - implementing the ISA metadata model. To output short sentences from the (semi)structured input, we will evaluate a mixed data-to-text approach using template-based and neural-based (i.e. machine learning) methods. To further enrich the content of the manuscript, Datascriptor will connect to existing authoring systems, including Substance, Texture, Stenci.la and Manuscripts, and export the result in JATS format. Our plans also include an export as a DAR file and in LaTeX format.
Type Of Technology	Webtool/Application
Year Produced	2019
Open Source License?	Yes
Impact	Work has just started, but to ensure continued impact in the stakeholder community, the Datascriptor User Advisory Board includes a core group of existing collaborators: Thomas Lemberger (EMBO Press), Scott Edmunds (GigaScience), Holly Murray (F1000), Varsha Khodiyar (Springer Nature).


Title	ISA API
Description	Released under the Common Public Attribution License Version 1.0 (CPAL), the Investigation Study Assay (ISA) API aims to provide developers with a set of tools to enable the programmatic construction of ISA objects, validation of objects, and conversion between serialisations of ISA-formatted datasets and other formats/schemas (e.g. data deposition schemas). To facilitate the use of the ISA model (see the ISA-Tab specification - http://www.isa-tools.org/format/specification/) in modern web applications, the model (version 1.0) is represented as a set of JSON schemas, which describe the information the ISA model maintains for each of its objects. JSON is a widely used interchange format that powers much of the web today, and is used by a range of programming languages and platforms. As such, the objective of designing and developing JSON schemas is to support a new serialisation of the ISA-Tab model in JSON format, in addition to existing serialisations in Tabular format and RDF format. The new JSON models can be found here: https://github.com/ISA-tools/isa-api/tree/master/isatools/schemas/isa_model_version_1_0_schemas/core
Type Of Technology	Software
Year Produced	2015
Open Source License?	Yes
Impact	The ISA API is used in a number of projects arising in collaboration with the Oxford eResearch Centre (OERC), notably the COPO project, and is under continued development.
URL	https://github.com/ISA-tools/isa-api
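
The idea of building ISA objects programmatically and serialising them to JSON, as described in this record, can be sketched in a few lines of Python. This is a simplified illustration using only the standard library, not the ISA API itself: the `Investigation`, `Study` and `Assay` classes and their fields are hypothetical stand-ins, and the JSON layout produced here is an assumption that does not follow the published ISA JSON schemas.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Assay:
    """Hypothetical, simplified stand-in for an ISA assay."""
    filename: str
    measurement_type: str
    technology_type: str


@dataclass
class Study:
    """Hypothetical, simplified stand-in for an ISA study."""
    identifier: str
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """Hypothetical, simplified stand-in for an ISA investigation."""
    identifier: str
    title: str
    studies: List[Study] = field(default_factory=list)


def to_json(investigation: Investigation) -> str:
    """Serialise the nested objects to a JSON string."""
    return json.dumps(asdict(investigation), indent=2, sort_keys=True)


def from_json(text: str) -> Investigation:
    """Rebuild the object hierarchy from a string produced by to_json."""
    raw = json.loads(text)
    studies = [
        Study(
            identifier=s["identifier"],
            title=s["title"],
            assays=[Assay(**a) for a in s["assays"]],
        )
        for s in raw["studies"]
    ]
    return Investigation(raw["identifier"], raw["title"], studies)


investigation = Investigation(
    identifier="inv-1",
    title="Example plant omics investigation",
    studies=[
        Study(
            identifier="study-1",
            title="Example study",
            assays=[Assay("a_assay.txt", "transcription profiling", "nucleotide sequencing")],
        )
    ],
)

# Serialise to JSON and parse back; the round trip should preserve the objects.
round_tripped = from_json(to_json(investigation))
```

Keeping the in-memory objects separate from any one serialisation is what would allow the same model to be written out in tabular, JSON or RDF form.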


Title	ISA Model and Serialization
Description	The original ISA-Tab specification was published as a Release Candidate document in 2008, documenting the initial work that forms the ISA framework, with a further update in 2009. Since then, we have done work on a new serialization in JSON, ISA-JSON, and abstracted out the data model from both the tabular and JSON formats.
Type Of Technology	Software
Year Produced	2016
Open Source License?	Yes
Impact	Serialisations implemented by several ISA components; the documentation also helps other users to implement ISA formats.
URL	http://isa-tools.org/2016/10/release-of-the-isa-specs/


Title	ISA Python API
Description	The ISA API aims to provide software developers with a set of tools to help them easily and quickly build their own ISA objects, validate them, and convert between serializations of ISA-formatted datasets and other formats/schemas (e.g. SRA schemas). The ISA API is published on PyPI as the isatools package.
Type Of Technology	Software
Year Produced	2017
Open Source License?	Yes
Impact	The vision for the ISA API is to provide a programming library that will become the core for all software tooling that supports the ISA framework. It enables the import of various data formats into an implementation of the ISA Abstract Model as Python objects, and export of ISA content from Python objects back to different serialization formats.
URL	http://isa-tools.org/2017/01/isa-api-milestone/


Title	ISA tooling for the metabolomics community
Description	A new set of ISA software tools has been developed out of the EU H2020 PhenoMeNal: Large-Scale Computing for Medical Metabolomics project (http://phenomenal-h2020.eu/home). The ISA team has been contributing to the project since 2015, and has been collaborating on the development of user-facing, cloud-based data management and processing infrastructure in the project. The PhenoMeNal software includes a new set of ISA-related Galaxy workflow tools, as well as native support for the ISA-Tab format in Galaxy.
Type Of Technology	Software
Year Produced	2018
Open Source License?	Yes
Impact	The tools work with the EBI MetaboLights database as well as with ISA-Tab studies uploaded directly into the Galaxy platform, and build on the Python ISA-API. MetaboLights itself uses the ISA-API in a Python-based REST service: https://github.com/EBI-Metabolights/MtblsWS-Py
URL	http://isa-tools.org/2018/03/isa-galaxy-developed-for-metabolomics/


Title	ISA-API Python library
Description	Project name: ISA-API; Project home page: http://github.com/ISA-tools/isa-api ; Operating system(s): Platform independent; Programming language: Python 3; Other requirements: None; License: CPAL-1.0. ISA-API is a Python library that supports the creation, editing, parsing, and validation of both ISA-Tab and ISA-JSON formats, using a common data model implemented as native Python objects.
Type Of Technology	Software
Year Produced	2018
Open Source License?	Yes
Impact	This provides users with a common interface and interoperable medium between the two ISA formats, as well as conversion to a set of other formats required for depositing data in public databases.


Description	1st COPO user workshop
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Study participants or study members
Results and Impact	The Collaborative Open Plant Omics (COPO) consortium workshop brought together a focus group, comprising a small number of experts for 2 days, with an active interest in collecting and managing plant data. During the workshop, we discussed approaches to the description, collection, annotation, standardisation and management of (large) datasets, including requirements for submission to public repositories, current user needs and stumbling blocks. The workshop enabled us to better understand the needs of end users and to generate an overview of how, and what types of datasets, plant biologists are currently generating. This information has helped to guide the COPO consortium as it develops its community platform for data publication and citation.
Year(s) Of Engagement Activity	2015
URL	http://blog.garnetcommunity.org.uk/copo-2015-meeting/


Description	2nd COPO User Workshop
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	The Collaborative Open Plant Omics (COPO) consortium workshop brought together a focus group, comprising a small number of experts for 2 days, with an active interest in collecting and managing plant data. During the workshop, we demonstrated the new COPO portal and metadata collection layers of the software, discussed approaches to the description, collection, annotation, standardisation and management of (large) datasets, including requirements for submission to public repositories, current user needs and stumbling blocks. The workshop enabled us to better understand the needs of end users and to deliver feedback to the COPO partners about gaps and recommended software features. This information has helped to guide the COPO consortium as it develops its community platform for data publication and citation.
Year(s) Of Engagement Activity	2016
URL	http://copo-project.org/agenda_workshop2.html


Description	Biohackathon; ELIXIR, Paris
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	The team participated in several tracks, working in particular on ISA for the plant and metabolomics communities, on its use in Galaxy, and on the bioschema work. The work carried out continues to embed ISA and FAIRsharing into ELIXIR-driven infrastructure and activities.
Year(s) Of Engagement Activity	2018
URL	https://www.elixir-europe.org/events/biohackathon-2018-paris


Description	COPO 3rd User Workshop
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	The EI COPO team organised and ran the 3rd COPO User Workshop as a dedicated training event, hosted as a satellite event to the Plant and Animal Genome (PAG) conference in January 2018. We hired conference facilities at the nearby Marriott hotel and ran a successful workshop to show recent developments to the platform to 15 international participants and to gather their feedback about potential improvements.
Year(s) Of Engagement Activity	2018


Description	CUDDEL closing workshop/hackathon, EBI
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Closing workshop of the CUDDEL grant, following up on issues outstanding from the 2017 Hong Kong workshop; discussion to explore the feasibility of making a follow-up BBSRC Partnering application in the future.
Year(s) Of Engagement Activity	2018
URL	https://github.com/ISA-tools/cuddel-mzml2isa-enhance


Description	Datascriptor hackathon - eLife Innovation Sprint
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Hackathon on the Datascriptor prototype, part of the ISA toolkit. Datascriptor aims to take the pain out of beginning to write papers, making it easy to automatically generate the parts of a paper that can be easily scaffolded, and incentivising reproducible papers by ensuring the scaffolds include well-structured data and metadata. During the online event the prototype was fleshed out by user testing with hands-on use cases.
Year(s) Of Engagement Activity	2020
URL	https://sprint.elifesciences.org/data-paper-skeleton-tools-for-life-sciences/


Description	ELIXIR-UK ALL-HANDS MEETING 2017
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	The ELIXIR-UK All Hands Meeting provided updates on recent activities from the ELIXIR UK Node and ELIXIR Hub, alongside discussions of future resources, events and roadmapping breakouts. Dr Davey presented the COPO project and CyVerse UK infrastructure as UK-specific resources that were being developed as national infrastructure for UK researchers. There was much interest from the participants in both projects, and conversations at this event led to the submission of a BBSRC TRDF with Gos Micklem (Cambridge), Dr Davey and Dr Shaw (EI).
Year(s) Of Engagement Activity	2017
URL	https://www.elixir-europe.org/events/elixir-uk-all-hands-meeting-2017


Description	ELIXIR-UK All Hands meeting
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Showcasing latest work on FAIRsharing and a presentation by Dr. Rocca-Serra of the FAIR Cookbook, as well as discussing how to best connect with other UK resources and those from other Nodes.
Year(s) Of Engagement Activity	2021
URL	https://elixir-europe.org/events/elixir-all-hands-2021


Description	ELIXIR-UK AllHands meeting, Birmingham
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	Showcasing latest work on FAIRsharing and ISA, as well as discussing how to best connect with other UK resources and those from other Nodes.
Year(s) Of Engagement Activity	2018
URL	https://elixiruknode.org/event/elixir-uk-all-hands-2018/


Description	ISA presentation to GARnet workshop
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	David Johnson - in my team - gave a presentation on "Data Infrastructures to Foster Data Reuse" at a workshop on Integrating Large Data into Plant Science: From Big Data to Discovery hosted by GARnet (the UK network for Arabidopsis researchers) and Egenis (the Exeter Centre for the Study of the Life Sciences). The workshop was held at Dartington Hall in Devon, South West England, and was well attended by researchers from the plant and biological science community worldwide as well as representatives from industry from organisations such as Syngenta.
Year(s) Of Engagement Activity	2016
URL	http://isa-tools.org/2016/07/plant-science-takes-a-focus-on-isa/


Description	NERC DataTree
Form Of Engagement Activity	A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Undergraduate students
Results and Impact	Video to introduce the basic concepts of the FAIR principles, FAIR data management and FAIRsharing. The target audience for Data Tree is NERC funded PhD students and early career researchers, however, Data Tree will be an openly available resource.
Year(s) Of Engagement Activity	2017
URL	https://datatree.org.uk/


Description	Poster presentation: ISAcreate and Galaxy; Galaxy conference, Portland
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	The ISA-Tab format is now used by Galaxy tools; the discussion helped to ensure that uptake continues.
Year(s) Of Engagement Activity	2018
URL	https://gccbosc2018.sched.com/event/FEWs/g26-isacreate-a-galaxy-tool-for-prospective-data-management...


Description	The ELIXIR Plant Use Case - BRAPI meeting
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Work to ensure the use of ISA formats in the BRAPI API, which is part of the ELIXIR Plant Use Case and will connect plant-related ELIXIR Node repositories. This will benefit the ISA-compliant COPO infrastructure, which is also part of the ELIXIR UK Node.
Year(s) Of Engagement Activity	2017
URL	https://www.elixir-europe.org/use-cases/plant-sciences