COpenPlantOmics (COPO): a Collaborative Bioinformatics Plant Science Platform

Lead Research Organisation: University of Warwick
Department Name: School of Life Sciences

Abstract

Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.

Technical Summary

Access to biological data has been hindered by a lack of standards, a lack of awareness of the benefits of, and pathways to, releasing data described by those standards, and a lack of services through which data can be analysed, published and retrieved easily. Recently, the BBSRC has made a large commitment to open access data and publishing in order to further bioscience research in the UK. However, barriers still prevent scientists from openly depositing their data and metadata, namely a lack of interoperability between metadata annotation services, data repositories, data analysis platforms and data publishing platforms. As a result, plant scientists might not: be aware that the services exist; have the expertise to use them; or see the value in properly describing their data.
This project aims to build COPO, the software infrastructure required to reach the level of interoperability that plant researchers need in order to describe their data using community-recognised ontologies, move data seamlessly in both directions between COPO and relevant repositories, and then publish these data for open access. COPO will manage the hardware infrastructure at TGAC to deliver a consistent, robust staging area and database that will support unique, accessioned artefacts representing the corpus of data and metadata a user wants to expose. The resulting marked-up datasets, processed and published using COPO, will allow greater potential for integrative analysis using existing tools such as iPlant and Galaxy.
New Application Programming Interfaces (APIs) will interconnect existing tools and services, and by developing new RESTful user interfaces that wrap up these APIs, COPO will be a single point-of-entry for plant researchers to disseminate their data all the way from generation to publication. By federating the TGAC iRODS data grid system with others, e.g. Texas Advanced Computing Center's iPlant installation, access to worldwide analytical infrastructure and data will be facilitated.

Planned Impact

Academic, Economic and Commercial Impacts
With the renewed interest and push from all areas of bioscience to promote publicly available research, the COPO project will be a pioneering national and international effort to facilitate sharing of all aspects of plant research with the public. COPO aims to be the vehicle that brings together the tools required to harmonise open plant omics research. This sector has obvious ties with industry: public domain omics-based bioscience is a relevant and important input into industry's internal research and discovery activities. To make such bioscience data truly reusable and to ensure scientific robustness, the data must be uniformly annotated. Uniform annotation allows integration through equivalence of terminology, increases efficiency in data production and re-use, and allows correct interpretation by means of the context that the metadata provide. A collaborative platform for frictionless bioinformatics, built with and for the academic and industrial community, is long overdue. Alongside data processing, industry also works on finding solutions for the integration and management of large 'omics data sets, e.g. efforts like the Pistoia Alliance. Together with COPO industry partners (Eagle Genomics) we will develop use-cases for the platform in industry, propose acceptance criteria required for commercial use, supply technical advice and support on meeting those acceptance criteria, evaluate the platform on 3rd party infrastructure, and maximise knowledge exchange and commercialisation.

COPO and the standards community
Expertise and knowledge gained throughout the lifetime of the project and beyond will be disseminated through a variety of channels. The presence of a direct link with the plant science community (through GARNet, UK Plant Sciences Federation (UKPSF)) is key to the success and adoption of the platform and associated standards. The project will have a continuous dialogue, through face-to-face events as well as online tools and social media, between those working on the platform and the plant bioscience community. The several letters of support show a clear interest in working together, using and adopting a platform that implicitly confers standards compliance. COPO will provide a solution to overcome the challenges in standards fragmentation by (i) fostering development, acceptance and implementation of reporting standards that are immediately suitable for plant research, and (ii) limiting the range and variability of standards. This will have a direct impact on the development and maintenance costs for commercial and academic software developers of standards-compliant products.

Societal impacts
Historically there has been reluctance to adopt some of the standards and open-data principles in the plant bioscience community, especially in the field of food sustainability and security, so openness and transparency in these areas are vital to continue improving public perception. The presentation of the research data will play a key role in opening the dialogue with the general public and will contribute to the development of stronger links with sectors of society (such as school teachers) that are less familiar with the scientific activities in plant research and the beneficial impact these have on their lives. It is widely recognised that the worldwide shortage of expertise and skill in biomathematics and informatics is a major risk to the future development of key areas in the life sciences. The objectives of this proposal will help to attract talented staff to work with the COPO partners, and offer alternative career paths.

Description Co-organised plant user workshops to test the COPO platform with data provided by UK plant researchers (via GARNet and UKPSF) working with genomic, phenomic, proteomic and metabolic datasets. These workshops helped to validate the COPO platform, highlight bugs, provide training in the platform and generate additional user requirements and suggestions for the extension and use of the COPO platform. Contributing to Objectives 1, 2 and 4.

Promotion of the COPO platform within the international DivSeek community, which is focused on increasing the access, re-use and annotation of data associated with plant germplasm. Activities included presenting COPO at DivSeek events (such as those held at PAG) and involving COPO in DivSeek working groups. These activities helped to link COPO with relevant international data standards and annotation projects, led to involvement with relevant DivSeek Initiative working groups on data compatibility, and generated opportunities to train people in the use of the platform. Contributing to Objectives 1, 2 and 4.

Active collaboration with the Research Data Alliance (RDA), specifically the Agriculture Data Group. Activities include attendance at RDA meetings and conferences to promote awareness of COPO, to understand the wider 'environment' in which COPO sits, and to help ensure that it interacts with and is aware of appropriate initiatives within the RDA. Contributing to Objectives 1, 2 and 4.

Facilitated interaction with the plant researchers and data managers at the international CGIAR Centres. Activities include presentations and attendance at workshops and events such as PhenoharmonIS and PAG. These activities helped to widen the test user group beyond the UK, created opportunities to train people in the use of the platform, and provided data sets beyond the initial scope of the COPO project, generating ideas for future development and collaboration. Contributing to Objectives 1, 2 and 4.

Taken together, the activities undertaken by York/Warwick:
provided users and datasets to test and validate the COPO platform
increased awareness of the COPO project at the national and international level
brought international activities involved in data management and annotation to the attention of the COPO development team
aligned the COPO platform with relevant international efforts in metadata, data standards and data management
helped to ensure that data submitted, annotated and described via COPO are compatible with and useful to a wide range of users, and accessible and reusable in the future
Exploitation Route Help to make the COPO platform appropriate for plant scientists and to increase awareness of this platform.
Sectors Agriculture, Food and Drink

 
Description COPO Users Workshop Norwich 13-15 December 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact COPO Users Workshop Norwich 13-15 December - In collaboration with the Earlham Institute, helped to organise a workshop for a set of users to try the current platform, provide feedback on it, and discuss future developments and user needs.
Year(s) Of Engagement Activity 2016
 
Description Egenis/GARNet Meeting Exeter (20-22 April) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Egenis/GARNet Meeting Exeter (20-22 April) - The meeting looked at the issues of depositing, accessing and reusing data, and at why people do and don't engage with the process. Lessons learned from the workshop were brought back to the COPO development team to help inform the platform design process.
Year(s) Of Engagement Activity 2016
 
Description Phenoharmonis Meeting Montpellier (8-12 May 2016) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Phenoharmonis Meeting Montpellier (8-12 May 2016) - Presented COPO along with Rob Davey. Took part in discussions of data standards for phenotype and trait data in crops with researchers from across the globe, to help better understand the data types COPO has to deal with and to raise awareness of the COPO platform.
Year(s) Of Engagement Activity 2016
 
Description Plant and Animal Genomes Conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Gave a talk at the Plant and Animal Genomes conference in a Systems Genomics workshop.
Year(s) Of Engagement Activity 2016
 
Description Plant and Animal Genomes Conference - Data Standards Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Helped run a workshop organised by BBSRC/NSF/Era-CAPS on data standards and integration. Helping to write a white paper from this workshop and future funding bids.
Year(s) Of Engagement Activity 2016
 
Description Plant phenotyping workshop at PAG 2016 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Ruth Bastow ran a plant phenotyping workshop at PAG 2016 to initiate discussion between BBSRC, NSF and USDA (NIFA) on the challenges of data collection, curation and sharing in the plant phenotyping field.
Year(s) Of Engagement Activity 2016