CerealsDB: A community resource for wheat genomics

Lead Research Organisation: Earlham Institute
Department Name: Research Faculty

Abstract

Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.

Technical Summary

"A community resource for wheat functional genomics" was one of the first projects to be funded by the BBR initiative. The resources enabled Bristol and Rothamsted Research to develop "CerealsDB" and "Monogram". CerealsDB and Monogram are accessed directly or via other web sites such as the Wheat Improvement Strategic Program (WISP), the International Wheat Initiative and GrainGenes. CerealsDB hosts tools and resources to access and analyse sequence and SNP-based datasets for wheat (Triticum aestivum) and related species. CerealsDB assists breeders and academics in exploring the wheat genome and selecting strategies for genotyping and marker-assisted selection. CerealsDB includes a database of more than 100,000 varietal SNPs, of which several thousand have been validated and mapped on one of three mapping populations. CerealsDB also contains information on DArT markers and Expressed Sequence Tags (ESTs), and was the first site to host the 5x genome sequence for the variety Chinese Spring, released in 2010. Since launch, CerealsDB has been updated regularly. The most recent update, on 1 May 2013, added a further ~1,500 mapped SNP markers and a mapping function that assigns unmapped SNPs to putative homoeologue-specific chromosome arms.

To ensure that CerealsDB remains one of the most useful wheat-based web sites and to increase the community's awareness of the resources hosted, we intend to develop CerealsDB to include further tools such as an expanded SNP/genotype database, including iSelect and Axiom based datasets, along with facilities to identify and characterise functional SNPs. In addition, new releases will include graphic tools to enable breeders and academics to link mapped and non-mapped SNPs/genes across a variety of wheat-related species. Finally, although CerealsDB has always been designed to be easy to use, we will provide training across the range of activities included within the web site.

Planned Impact

The BBSRC web site states that "BBSRC is the principal funder of food-related research in the UK and has food security as one of its key strategic drivers." In the UK wheat is, by a significant margin, the most valuable arable crop and therefore, not surprisingly, the status of the UK wheat crop is a national issue often reported in the press. While BBSRC funds a large number of wheat-related projects, for the most part it is the translation of this research into wheat varieties with increased yield or improved qualities, such as improved bread making qualities, that provides the impact required by BBSRC if it is to deliver its strategic objectives. Translation of wheat research in line with BBSRC's strategic priorities almost always involves either mapping and cloning genes or crossing alleles of interest into elite varieties. In all cases, since its launch, CerealsDB has helped researchers to achieve impact by providing wheat geneticists with unencumbered sequences and SNP-based information which is easy to access and free from IP or MTAs.

The development of CerealsDB has led to strategic changes in the approaches taken by various wheat breeders. For instance, many regard the development of CerealsDB as a significant factor in their decision to convert their marker labs to SNP-based markers.

In accordance with BBSRC's strategy for achieving impact, Prof. Edwards has actively engaged with industry, for instance through the CIRC program and via his participation in a BBSRC science interchange programme with Advanta-Nickerson (now Limagrain) Seeds to exchange technologies and ideas. In addition, the university base of the Bristol group and the institutional base of TGAC allow each to incorporate its research directly into undergraduate, post-graduate and post doc teaching. This approach has been remarkably successful and has resulted in numerous young researchers being placed within agricultural institutes or companies as PhD students, researchers or employees.

The facilities offered by CerealsDB are unique. We know this because our users tell us. Although CerealsDB is internationally recognised as being a primary source of wheat sequences and SNP-based markers, we believe that by adding the facilities described, Bristol will ensure that it increases its international profile and continues to provide a valuable (almost) one stop shop for wheat functional genomics. In so far as we are aware, no similar commercially available resources exist. However, it is important to note that our web site is used by both the academic and industrial sectors, with several companies relying on it for a substantial part of their business, and many of our improvements have been carried out at the suggestion of the UK wheat breeding sector.

Publications

 
Description We continue to develop our data resources more broadly as part of DFW. As part of the original grant remit to extend access to datasets, we set up instances of iRODS (https://irods.org/), an open source data management software tool that allows for easy sharing of data and for federation between different institutions. It also allows metadata to be associated with the files within the datasets, enabling richer searching and data mining. We developed two software tools to add functionality to iRODS. The first, eirods-dav (https://github.com/billyfish/eirods-dav), is a tool to interact with both iRODS and Apache httpd web servers (https://httpd.apache.org/), to allow for easy sharing of iRODS-hosted data from a standard web interface. This took an existing open source solution, davrods (https://github.com/UtrechtUniversity/davrods), and greatly extended it by adding a number of features: the ability to greatly modify the page layout to make it more user-friendly; exposure of the metadata for viewing, searching and editing; and a REST API to allow for programmatic access to the underlying data and metadata. The second tool, iRODS Bash Completer (https://github.com/TGAC/irods_bash_completer), gives the iRODS client icommands auto-complete functionality within a Linux/Unix/Mac operating system, which makes for a much more user-friendly experience for users accessing the datasets from a command line.
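
As a rough illustration of what access through a WebDAV layer such as davrods or eirods-dav can look like from a script, the sketch below builds a standard WebDAV PROPFIND request and parses a multistatus reply using only the Python standard library. The URL and the sample reply are hypothetical placeholders and are not taken from any real CerealsDB or Earlham Institute server. The eirods-dav REST API itself is not shown, since its routes are not described here.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

DAV_NS = {"d": "DAV:"}  # the standard WebDAV XML namespace

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://example.org/irods/public/wheat/"

PROPFIND_BODY = (
    '<d:propfind xmlns:d="DAV:"><d:prop>'
    "<d:displayname/><d:getcontentlength/>"
    "</d:prop></d:propfind>"
)


def build_propfind(url: str) -> urllib.request.Request:
    """Build (but do not send) a depth-1 PROPFIND request for a collection."""
    return urllib.request.Request(
        url,
        data=PROPFIND_BODY.encode("utf-8"),
        method="PROPFIND",
        headers={"Depth": "1", "Content-Type": "application/xml"},
    )


def parse_multistatus(xml_text: str) -> List[Tuple[str, Optional[int]]]:
    """Return (href, size in bytes or None) for each entry in a multistatus reply."""
    root = ET.fromstring(xml_text)
    entries = []
    for response in root.findall("d:response", DAV_NS):
        href = response.findtext("d:href", default="", namespaces=DAV_NS)
        size = response.findtext(
            ".//d:getcontentlength", default=None, namespaces=DAV_NS
        )
        entries.append((href, int(size) if size else None))
    return entries


# Made-up reply of the shape a WebDAV server returns for PROPFIND.
SAMPLE_REPLY = """<d:multistatus xmlns:d="DAV:">
  <d:response>
    <d:href>/irods/public/wheat/</d:href>
    <d:propstat><d:prop><d:displayname>wheat</d:displayname></d:prop></d:propstat>
  </d:response>
  <d:response>
    <d:href>/irods/public/wheat/example.fa</d:href>
    <d:propstat><d:prop><d:getcontentlength>1024</d:getcontentlength></d:prop></d:propstat>
  </d:response>
</d:multistatus>"""

request = build_propfind(BASE_URL)
listing = parse_multistatus(SAMPLE_REPLY)
```

To query a real server, the request would be passed to urllib.request.urlopen and the reply body handed to parse_multistatus. Authentication and error handling are left out of this sketch.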

We created the Grassroots infrastructure which is an easily-deployable suite of computing middleware tools to help users and developers gain access to scientific data infrastructure that can easily be interconnected.

We developed the ability for services to be linked together so that the output results from one service can be parsed automatically to generate an already populated set of parameters for another service so that the user does not have to copy and paste results and can run the subsequent service with a single click or command from within a script.
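
The idea of linked services can be sketched as a small mapping step: a declared link says which fields of one service's result feed which parameters of the next, so the follow-on request arrives already populated. The field names, service name and data below are invented for illustration and do not reflect the actual Grassroots schema.

```python
from typing import Any, Dict

# Hypothetical link definition: result field -> parameter of the next service.
LINK = {
    "target_service": "example-marker-lookup",
    "mappings": {"hit_id": "marker_name", "database": "dataset"},
}


def prefill_parameters(result: Dict[str, Any], link: Dict[str, Any]) -> Dict[str, Any]:
    """Build a ready-to-run request for the linked service from one result."""
    params = {
        target: result[source]
        for source, target in link["mappings"].items()
        if source in result  # skip mappings the result cannot satisfy
    }
    return {"service": link["target_service"], "parameters": params}


# Made-up result from a first (search) service.
search_result = {"hit_id": "EXAMPLE_SNP_0001", "database": "example_db", "score": 97.5}
next_request = prefill_parameters(search_result, LINK)
```

Keeping the link as data rather than code means a new pairing of services needs only a new mapping, which is one plausible way to support the single-click or scripted chaining described above.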

A number of Grassroots web services were added to the CerealsDB website (http://www.cerealsdb.uk.net/cerealgenomics/CerealsDB/webservices.php) which, since they conform to the Grassroots API, can interact with other Grassroots servers such as the one at Earlham Institute. This gives a REST API to allow developers to access the data hosted at CerealsDB such as SNP, contig or phenotype information programmatically.

A collaboration with scientists at the University of Bristol and the John Innes Centre resulted in a QTL genotyping service combining CerealsDB (http://www.cerealsdb.uk.net/cerealgenomics/CerealsDB/select_QTL.php) and Grassroots web services (https://github.com/TGAC/grassroots-parental-genotype-service) and data from JIC.
Exploitation Route The Grassroots infrastructure wraps up both industry-standard software tools and our own custom developed tools with a consistent API, using as many standard schemas and ontologies as possible. As such it allows services and datasets to be distributed across different institutions with the Grassroots infrastructure taking care of connecting them together. Identical services running on different datasets in different institutions can be federated together. So, from a user's point of view, they appear as a cohesive set of tools and data available at one point of entry, with only one web service and a single set of search parameters to populate to query all of the distributed data. This means that when searching for data to use across a number of disparate web services, users do not need to go to each of these Grassroots-powered services in turn, but only visit one of them to get a consistent view over a community's datasets.
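
The single point of entry described above can be pictured as a fan-out query: one set of search parameters is sent to every federated node and the answers are merged, with each record tagged by its source. The sketch below uses in-memory stand-ins for the nodes. The node names, records and function names are assumptions for illustration, not the Grassroots implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Record = Dict[str, str]
Node = Callable[[str], List[Record]]


def make_node(name: str, records: List[Record]) -> Node:
    """Create a stand-in for a remote service that searches its own dataset."""
    def search(term: str) -> List[Record]:
        return [
            dict(record, source=name)  # tag each hit with the node it came from
            for record in records
            if term.lower() in record["title"].lower()
        ]
    return search


# Hypothetical federation of two sites, each holding different data.
NODES: Dict[str, Node] = {
    "site_a": make_node("site_a", [{"title": "Wheat SNP set"}, {"title": "Barley contigs"}]),
    "site_b": make_node("site_b", [{"title": "Wheat phenotype trial"}]),
}


def federated_search(term: str, nodes: Dict[str, Node]) -> List[Record]:
    """Send one query to every node in parallel and merge the results."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(search, term) for name, search in nodes.items()}
    merged: List[Record] = []
    for name in sorted(futures):  # fixed order keeps the output reproducible
        merged.extend(futures[name].result())
    return merged


hits = federated_search("wheat", NODES)
```

In a real deployment each node would be a network call to another institution's server, so timeouts and partial failures would need handling that this sketch omits.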

The linked service functionality means that we can build services that are linked automatically, helping users gain access to suitable tools and workflows following searches within the Grassroots infrastructure.

We have undertaken collaborations with other institutions whenever possible for the benefit of the wider academic and breeder communities. For example, we added wrappers around the data stored in the SeedStor system (https://www.seedstor.ac.uk/), provided by the Germplasm Resources Unit at the John Innes Centre, to provide both a map-based Grassroots service and a REST API for programmers. The CerealsDB team at the University of Bristol has already integrated this API into their publicly available tools.

These improvements to functionality and user experience will help researchers gain access to a greater breadth and depth of information related not only to CerealsDB but also to related projects such as Designing Future Wheat, of which CerealsDB is a key information resource.
Sectors Agriculture, Food and Drink; Digital/Communication/Information Technologies (including Software)

URL http://www.cerealsdb.uk.net
 
Description We have continued to collaborate with the Bristol team to implement new features and functionality within the CerealsDB site, and to link it to the EI Grassroots tools, to fulfil both the original grant objectives and the Designing Future Wheat objectives. We have undertaken collaborations with other institutions whenever possible for the benefit of the wider academic and breeder communities. For example, we added wrappers around the data stored in the SeedStor system (https://www.seedstor.ac.uk/), provided by the Germplasm Resources Unit at the John Innes Centre, to provide both a map-based Grassroots service and a REST API for programmers. The CerealsDB team at the University of Bristol has already integrated this API into their publicly available tools. These and other improvements to functionality and user experience have helped researchers gain access to a greater breadth and depth of information related not only to CerealsDB but also to related projects such as Designing Future Wheat, of which CerealsDB is a key information resource. As CerealsDB is an open resource, commercial scientists have access to this information for (pre-)breeding experiments, providing a route to generating impact and revenue across the translational spectrum.
First Year Of Impact 2018
Sector Agriculture, Food and Drink; Digital/Communication/Information Technologies (including Software)
Impact Types Cultural, Economic

 
Description Interview with Environment Adviser from the UK Parliamentary Office of Science and Technology
Geographic Reach National 
Policy Influence Type Implementation circular/rapid advice/letter to e.g. Ministry of Health
Impact Contacted by UK Parliament to contribute to a POSTnote (short document to advise ministers on a given topic) on genebanks and Digital Sequence Information (DSI) as a result of my recent election to the DivSeek Board of Directors. I was interviewed to provide information on current international policies on DSI and how future UK involvement might be shaped around open licensing/MTAs of DSI datasets.
URL https://www.parliament.uk/postnotes
 
Title Grassroots Genomics grid infrastructure 
Description Integrative research requires extensive multi-level approaches to enrich and expose data and workflows so that informatics infrastructures can process them effectively. The Grassroots Infrastructure is developed at the Earlham Institute (EI) to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public datasets in the plant sciences. Its lightweight reusable software stack comprises: an iRODS data management layer to provide structure to unstructured filesystems, with Elasticsearch indexed metadata and Davrods exposed WebDAV APIs; interfaces to interact with local or cloud-based analysis platforms; an Apache web server layer to deliver content and provide access to public programmatic interfaces; services such as: BLAST search on multiple databases across different sites; a mapping tool showing pathogen samples with temporal and spatial data. It can be run locally or packaged in virtual containers and deployed on a variety of hardware thus representing a decentralised system, allowing information generators to retain control over their resources but allowing interconnected resources to access each other consistently. As such, Grassroots represents EI's contribution to the Wheat Initiative Wheat Information System (WheatIS) project, formalising the infrastructure as the federated UK WheatIS node involving partners from the University of Bristol, the European Bioinformatics Institute, Rothamsted Research, and the John Innes Centre. We are currently working on lightweight mechanisms to expose underlying grid architecture using WebDAV, standardised APIs such as the Breeding API (BrAPI) and schemas such as Frictionless Data and BioSchemas to enable greater interoperability with a variety of existing services, and integration with data analysis platforms such as CyVerse and Galaxy. 
Type Of Material Data handling & control 
Year Produced 2014 
Provided To Others? Yes  
Impact This infrastructure powers the handling and release of the wheat genomics data arising from EI's flagship wheat programmes, as well as aggregating previously published datasets. We currently have a BLAST service running on top of this infrastructure, and we are building federation options into the platform with the iRODS data grid software. The CerealsDB project at the University of Bristol is a widely used and vital resource for the wheat community, and the Bristol group is deploying the Grassroots infrastructure to facilitate integration of the resources held there with the resources at EI. The Field Pathogenomics project (BBSRC IPA funded project BB/M025519/1) is also powered by the Grassroots platform, enabling a fast and informative web-based user interface based on data collected by the project relating to wheat yellow rust epidemiology. 
URL https://wheatis.tgac.ac.uk/grassroots/api/
 
Title The Grassroots DFW Data Portal 
Description Continually updated large dataset repository for the DFW project. Houses a variety of key wheat and associated datasets that are released either under the Toronto licence or under other licences as appropriate for the level of open access. 
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? Yes  
Impact To date, we house 24TB of wheat datasets that have been accessed by over 4000 researchers from 64 countries. 
URL https://grassroots.tools/dfw
 
Description DivSeek Partnership 
Organisation DivSeek International
Sector Learned Society 
PI Contribution I bring infrastructure expertise to this partnership, influencing and impacting policy to provide computational and training capacity to other DivSeek partners. I promote the range of infrastructure projects that are developed in my group at EI, but also solutions developed at other centres that can contribute to the DivSeek consortium. Partners are exposed to EI projects such as COPO, Grassroots (Wheat Information System, CerealsDB, marker design), CyVerse UK and Galaxy, through working group communications and meetings at international conferences such as PAG and RDA. I lead the Data Standards for Interoperable Tools working group, and we aim to collate community-suggested standards and tools, and advise the partnership and their stakeholders in best practice for delivery of sustainable and interoperable infrastructure.
Collaborator Contribution The DivSeek consortium contributes expertise and knowledge exchange in advances in crop diversity, improving our networking and understanding of challenges and potential solutions to social, structural, and biological problems. With over 66 global partners including EI, this is a powerful and highly respected group of research institutes that are working together to enable a step change in efficiency of interactions, leading to improved crop diversity research and data sharing.
Impact EI is a founding partner of DivSeek, and Dr Davey leads one of the new working groups, "Data Standards for Interoperable Tools" (http://www.divseek.org/standards/)
Start Year 2015
 
Description ELIXIR Plants Community 
Organisation ELIXIR
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution Being an active part of the ELIXIR Plants Community has opened up possibilities for future projects, including offering our services and tools (COPO, CyVerse, Grassroots etc.) for use in ongoing and future implementation studies.
Collaborator Contribution Other community members have incorporated our tools/services into their plans for future work
Impact ELIXIR Plant Services Roadmap 2020-2023: doi 10.7490
Start Year 2020
 
Description Wheat Information System (WheatIS) 
Organisation Cold Spring Harbor Laboratory (CSHL)
Country United States 
Sector Charity/Non Profit 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; and a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore well positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community-standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data at a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation EMBL European Bioinformatics Institute (EMBL - EBI)
Country United Kingdom 
Sector Academic/University 
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation French National Institute of Agricultural Research
Department INRA Versailles
Country France 
Sector Academic/University 
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation Helmholtz Association of German Research Centres
Department Helmholtz Zentrum Munchen
Country Germany 
Sector Academic/University 
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation International Centre for Maize and Wheat Improvement (CIMMYT)
Country Mexico 
Sector Charity/Non Profit 
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation Monogram Network
Sector Academic/University 
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation Rothamsted Research
Country United Kingdom 
Sector Academic/University 
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation U.S. Department of Agriculture USDA
Department Agricultural Research Service
Country United States 
Sector Public 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation University of Bristol
Country United Kingdom 
Sector Academic/University 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation University of California, Davis
Department UC Davis College of Biological Sciences
Country United States 
Sector Academic/University 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation University of Western Australia
Country Australia 
Sector Academic/University 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Title API for SeedStor 
Description API for https://www.seedstor.ac.uk to improve programmatic access. Used in the Grassroots Infrastructure and CerealsDB.
Type Of Technology New/Improved Technique/Technology 
Year Produced 2018 
Impact Grassroots Infrastructure and CerealsDB can now query SeedStor programmatically instead of browsing the web page.
URL https://github.com/TGAC/grassroots-seedstor-api
 
Title EI wheat data portal 
Description A BLAST server dedicated to wheat genomes, powered by the Grassroots API and freely available to the public and hosting a large number of previously published and newly generated genomes of wheat varieties. 
Type Of Technology Webtool/Application 
Year Produced 2014 
Impact This portal has served more than 7,100 page visits and run more than 11,000 BLAST jobs for users in 31 countries, giving them early access to full wheat genomes even before publication. Notably, this software was the main route of dissemination of the TGAC v1 Chinese Spring 42 wheat genome prior to its inclusion in Ensembl Plants.
URL https://wheatis.tgac.ac.uk/grassroots-portal/blast
 
Title EIRods-DAV 
Description Eirods-dav provides access to iRODS servers using the WebDAV protocol and has a complete REST API for accessing and manipulating metadata from within a web browser. It adds a substantial amount of functionality to the original Davrods module written by Ton Smeele and Chris Smeele, which is a bridge between the WebDAV protocol and the iRODS API. Eirods-dav leverages the Apache server implementation of the WebDAV protocol, mod_dav, for compliance with the WebDAV Class 2 standard. 
Type Of Technology Webtool/Application 
Year Produced 2018 
Open Source License? Yes  
Impact The software is now used to host the Designing Future Wheat data portal. 
URL https://opendata.earlham.ac.uk/wheat
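As a rough illustration of the WebDAV side of this description, the sketch below builds a standard PROPFIND request and parses a 207 Multi-Status reply using only the Python standard library. It is not taken from the Eirods-dav code base: the function names are illustrative, the request is built but never sent, and whether a given server accepts anonymous PROPFIND requests is an assumption that would need checking.

```python
import urllib.request
import xml.etree.ElementTree as ET

DAV_NS = {"d": "DAV:"}

# Standard WebDAV request body asking for two common properties.
PROPFIND_BODY = (
    '<d:propfind xmlns:d="DAV:"><d:prop>'
    "<d:displayname/><d:getcontentlength/>"
    "</d:prop></d:propfind>"
)


def build_propfind(url, depth="1"):
    """Build (but do not send) a PROPFIND request for a WebDAV collection."""
    return urllib.request.Request(
        url,
        data=PROPFIND_BODY.encode("utf-8"),
        method="PROPFIND",
        headers={"Depth": depth, "Content-Type": "application/xml"},
    )


def parse_multistatus(xml_text):
    """Return the href of every resource listed in a Multi-Status body."""
    root = ET.fromstring(xml_text)
    return [
        response.findtext("d:href", namespaces=DAV_NS)
        for response in root.findall("d:response", DAV_NS)
    ]
```

A `Depth: 1` header asks for the collection itself plus its immediate children, which is the usual way to list a directory over WebDAV.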
 
Title EIRods-DAV 
Description Eirods-dav provides access to iRODS servers using the WebDAV protocol and has a complete REST API for accessing and manipulating metadata from within a web browser. It adds a substantial amount of functionality to the original Davrods module written by Ton Smeele and Chris Smeele, which is a bridge between the WebDAV protocol and the iRODS API. Eirods-dav leverages the Apache server implementation of the WebDAV protocol, mod_dav, for compliance with the WebDAV Class 2 standard. It also automatically generates and exports the datasets as Frictionless Data Packages. 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact The software is used to host the Designing Future Wheat data portal. 
URL https://opendata.earlham.ac.uk/wheat
 
Title Eirods-dav 
Description Eirods-dav provides access to iRODS servers using the WebDAV protocol and exposes a REST API for accessing and manipulating metadata from within a web browser. It adds a substantial amount of functionality to the original Davrods module written by Ton Smeele and Chris Smeele, which is a bridge between the WebDAV protocol and the iRODS API. Davrods leverages the Apache server implementation of the WebDAV protocol, mod_dav, for compliance with the WebDAV Class 2 standard. 
Type Of Technology Webtool/Application 
Year Produced 2018 
Open Source License? Yes  
Impact Eirods-dav is used to allow web-based access to a selection of files and research data released to the public by the Earlham Institute such as the Triticum aestivum assemblies. It is used by the Grassroots Infrastructure to allow access to data produced by the Designing Future Wheat project. The Eirods-dav application runs within the CyVerse UK National Capability infrastructure.
URL https://grassroots.tools/data/
 
Title Frictionless Data for wheat 
Description A tool to automatically generate Frictionless Data Packages (https://specs.frictionlessdata.io/data-package/) for data stored in an iRODS system and make them available across the web.
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact Allowing data scientists access to a well-supported set of APIs to allow them to create tools and workflows using the research datasets on the DFW data portal. 
URL https://opendata.earlham.ac.uk/wheat/under_license/toronto/
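For readers unfamiliar with the format, the following minimal sketch shows roughly what a Frictionless Data Package descriptor (a `datapackage.json` file) contains. It follows the public Data Package specification rather than the tool described here; the helper names, package name, file path and column names are all hypothetical and are not taken from the DFW data portal.

```python
import json


def make_tabular_resource(name, path, fields):
    """Describe one CSV file as a Tabular Data Resource."""
    return {
        "name": name,
        "path": path,
        "profile": "tabular-data-resource",
        "schema": {"fields": [{"name": n, "type": t} for n, t in fields]},
    }


def make_data_package(name, resources):
    """Assemble a descriptor; the specification requires at least one resource."""
    if not resources:
        raise ValueError("a Data Package needs at least one resource")
    return {"name": name, "profile": "data-package", "resources": list(resources)}


# Hypothetical example values for illustration only.
package = make_data_package(
    "example-wheat-trial",
    [
        make_tabular_resource(
            "plots",
            "plots.csv",
            [("plot_id", "integer"), ("yield_t_ha", "number")],
        )
    ],
)
datapackage_json = json.dumps(package, indent=2)  # contents of datapackage.json
```

Because the descriptor is plain JSON, any client that can fetch and parse JSON can discover the files and their column types without custom tooling.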
 
Title Grassroots API 
Description The Grassroots Infrastructure project aims to create an easily-deployable suite of computing middleware tools to help users and developers gain access to scientific data infrastructure that can easily be interconnected. With the data-generative approaches that are increasingly common in modern life science research, it is vital that the data and metadata produced by these efforts can be shared and reused. The Grassroots Infrastructure project wraps up industry-standard software tools with a consistent API that can be federated on a number of levels. This means institutions and groups can deploy a simple lightweight virtual machine, expose local data, connect up any existing data services, and federate their instance of the Grassroots with others out-of-the-box. 
Type Of Technology Software 
Year Produced 2015 
Open Source License? Yes  
Impact The Grassroots API powers the public BLAST service that runs at TGAC, predominantly for the currently available wheat assemblies including the recently released TGAC v1 w2rap assembly (in preparation). We have served over 4000 unique users with over 6000 BLAST jobs since November 2015. It also underpins the Field Pathogenomics project (BBSRC IPA award 2015, PI - Saunders D., TGAC/JIC fellow), a web portal that represents the detection and subsequent phenotyping and genotyping of the wheat yellow rust pathogen. The site aims to enable researchers and breeders to track rust epidemics over variety and time, allowing for a more proactive approach to wheat crop breeding and farming. Finally, we are working with the CerealsDB group at Univ. Bristol to deploy the Grassroots infrastructure alongside the CerealsDB web portal, allowing a federation of searching, datasets, analysis and dissemination of markers, genotypes and associated feature and literature information.
URL https://wheatis.tgac.ac.uk/grassroots/api/
 
Title Grassroots BrAPI web service 
Description This is a web service that uses the Grassroots Field Trial service and adds a Breeding API (BrAPI) layer on top to allow other BrAPI-compliant software to access the field trial data. We currently have complete support for approximately a third of BrAPI classes and calls with partial support for others. 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact This allows other data scientists, software developers and applications to easily access the field trial data stored in our system using a standard nomenclature and REST API. 
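To give a sense of what a BrAPI-compliant client works with, the sketch below reads the standard BrAPI v2 response envelope (`metadata.pagination` plus `result.data`) for a studies listing. It is an illustration under stated assumptions: the sample response is invented, the helper names are hypothetical, and no endpoint of the Grassroots service is called or implied.

```python
def extract_studies(response):
    """Return (studyDbId, studyName) pairs from a BrAPI studies response."""
    return [
        (study["studyDbId"], study.get("studyName"))
        for study in response["result"]["data"]
    ]


def has_next_page(response):
    """True if the pagination metadata reports further pages (zero-indexed)."""
    page = response["metadata"]["pagination"]
    return page["currentPage"] + 1 < page["totalPages"]


# Invented sample response in the BrAPI v2 envelope shape.
SAMPLE_RESPONSE = {
    "metadata": {
        "pagination": {
            "currentPage": 0,
            "pageSize": 2,
            "totalCount": 3,
            "totalPages": 2,
        },
        "status": [],
        "datafiles": [],
    },
    "result": {
        "data": [
            {"studyDbId": "s1", "studyName": "Example study A"},
            {"studyDbId": "s2", "studyName": "Example study B"},
        ]
    },
}
```

A client would typically keep requesting the next page while `has_next_page` returns true, collecting the results of `extract_studies` from each response.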
 
Title Grassroots Field Trial service 
Description A web-based application for submitting and searching for various aspects of field trial experimental data. 
Type Of Technology Webtool/Application 
Year Produced 2019 
Open Source License? Yes  
Impact A web-based application for submitting and searching for field trial data. 
URL https://grassroots.tools/beta/dynamic/fieldtrial_dynamic.html?type=AllFieldTrials
 
Title Grassroots Field Trial service 
Description Continuous updating of our existing web-based application for submitting and searching for various aspects of field trial experimental data. Updates include adding images, treatment factors, research programmes and vastly expanded faceted search functionality.
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact 60 studies have now been added into the system and the pace of input is increasing as, after working closely with partners in the DFW programme, they are looking to use it as their submission system for all of this year's studies whilst in the field. 
URL https://grassroots.tools/beta/dynamic/fieldtrial_dynamic.html?type=AllFieldTrials
 
Title Grassroots Field Trial service 
Description Continuous updating of our existing web-based application for submitting and searching for various aspects of field trial experimental data. Updates include adding images, treatment factors, research programmes and vastly expanded faceted search functionality.
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact 112 studies have now been added into the system. 
URL https://grassroots.tools/fieldtrial/all
 
Title Grassroots Frictionless Data Tool 
Description This is a command-line tool to extract the resources within a Frictionless Data Package into a variety of formats such as Markdown, HTML, CSV, etc. It will be available for as many different platforms as possible. It uses the schemas for each resource within the Data Package to generate the reports. It has in-built support for tabular-data-resources and will download and parse any web-based schemas from the resource profiles and use these when they are specified. It will output a file for each Data Resource within the Data Package. 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact It allows users to get their data as Frictionless Data Packages and export them into other formats as needed 
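The schema-driven export described above can be sketched as follows for the Markdown case. This is not the tool's own code: it assumes a tabular resource whose rows are held inline under `data` as a list of objects (which the Data Package specification permits), and the function name and sample values are hypothetical.

```python
def resource_to_markdown(resource):
    """Render a tabular Data Resource with inline rows as a Markdown table."""
    # Column order comes from the resource schema, not from the row dicts.
    columns = [field["name"] for field in resource["schema"]["fields"]]
    lines = [
        "| " + " | ".join(columns) + " |",
        "| " + " | ".join("---" for _ in columns) + " |",
    ]
    for row in resource["data"]:
        cells = [str(row.get(column, "")) for column in columns]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


# Hypothetical resource for illustration only.
SAMPLE_RESOURCE = {
    "name": "heights",
    "profile": "tabular-data-resource",
    "schema": {
        "fields": [
            {"name": "variety", "type": "string"},
            {"name": "height_cm", "type": "integer"},
        ]
    },
    "data": [{"variety": "Paragon", "height_cm": 95}],
}
```

Taking the column list from the schema keeps the output stable even when individual rows omit a value.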
 
Title Grassroots Gene Trees Search service 
Description This is a search service for querying and mapping clusters to genes for sequence data.
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact This is used as the backend service for a user-friendly Gene Trees search service combined with BLAST searches 
 
Title Grassroots Parental Genotype service. 
Description This software stores information regarding peak markers and parental genotype information for various QTL. It is part of a collaboration between the University of Bristol, the John Innes Centre and the Earlham Institute. 
Type Of Technology Software 
Year Produced 2019 
Open Source License? Yes  
Impact This software is used by the CerealsDB web service to give users a simple way to browse between QTL, peak marker information and the parental genotype information.
URL http://www.cerealsdb.uk.net/cerealgenomics/CerealsDB/select_QTL.php
 
Title Grassroots Search service 
Description The Grassroots free-text search engine, based upon Lucene, allows us to give ranked, faceted results for various types of research data such as field trial information, research datasets, sequence data, etc. These data items are all faceted and each facet automatically weights searches for its specific fields. For example, queries that match study names are ranked higher than those that match only the description field. This is used for general searches as well as for specific faceted search applications, such as the one we have for Measured Variables to denote phenotypic data.
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact This has allowed users to search across all of our data within the EI Grassroots infrastructure and allowed users to get to both services and data more quickly. 
URL https://grassroots.tools/public/service/search
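The idea of weighting matches by field can be shown with a deliberately simplified sketch. It is not Lucene's scoring model and not the Grassroots implementation: the field weights, function names and sample documents are assumptions chosen only to show why a match in a study name outranks a match in a description.

```python
# Hypothetical weights: a hit in the name counts three times a description hit.
FIELD_WEIGHTS = {"name": 3.0, "description": 1.0}


def score(doc, query, weights=FIELD_WEIGHTS):
    """Sum, over fields, the field weight times the number of query terms found."""
    terms = query.lower().split()
    total = 0.0
    for field, weight in weights.items():
        words = doc.get(field, "").lower().split()
        total += weight * sum(1 for term in terms if term in words)
    return total


def rank(docs, query, weights=FIELD_WEIGHTS):
    """Return matching documents ordered from highest to lowest score."""
    scored = [(score(doc, query, weights), doc) for doc in docs]
    hits = [(s, doc) for s, doc in scored if s > 0]
    return [doc for s, doc in sorted(hits, key=lambda pair: pair[0], reverse=True)]


# Invented documents for illustration only.
SAMPLE_DOCS = [
    {"name": "Yield study", "description": "nitrogen treatment comparison"},
    {"name": "Nitrogen response", "description": "plots in one field"},
    {"name": "Other study", "description": "unrelated measurements"},
]
```

With these weights, a query for "nitrogen" places the study named "Nitrogen response" ahead of the one that only mentions nitrogen in its description, and drops the unrelated study entirely.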
 
Title Grassroots core infrastructure
Description The Grassroots Infrastructure project aims to create an easily-deployable suite of computing middleware tools to help users and developers gain access to scientific data infrastructure that can easily be interconnected. With the data-generative approaches that are increasingly common in modern life science research, it is vital that the data and metadata produced by these efforts can be shared and reused. The Grassroots Infrastructure project wraps up industry-standard software tools with a consistent API that can be federated on a number of levels. This means institutions and groups can deploy a simple lightweight virtual machine, expose local data, connect up any existing data services, and federate their instance of the Grassroots with others out-of-the-box. The Grassroots Infrastructure uses a controlled vocabulary of JSON messages to communicate, so any server or client that can understand JSON can be used to access and connect to the platform. We provide infrastructure to ensure that the scientific data remains the important factor, and not the worry about how to build a system to expose your data. 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact The Grassroots Infrastructure has allowed researchers, data scientists and breeders to perform a variety of data analyses such as sequence searching using BLAST, map-based interactive searches for field trial data and QTL parental genotype mapping, as well as bespoke software web services utilised by third parties such as the CerealsDB team at the University of Bristol as part of systems that they have developed for users.
URL https://grassroots.tools
 
Title Grassroots core infrastructure 
Description The Grassroots Infrastructure project aims to create an easily-deployable suite of computing middleware tools to help users and developers gain access to scientific data infrastructure that can easily be interconnected. With the data-generative approaches that are increasingly common in modern life science research, it is vital that the data and metadata produced by these efforts can be shared and reused. The Grassroots Infrastructure project wraps up industry-standard software tools with a consistent API that can be federated on a number of levels. This means institutions and groups can deploy a simple lightweight virtual machine, expose local data, connect up any existing data services, and federate their instance of the Grassroots with others out-of-the-box. The Grassroots Infrastructure uses a controlled vocabulary of JSON messages to communicate, so any server or client that can understand JSON can be used to access and connect to the platform. We provide infrastructure to ensure that the scientific data remains the important factor, and not the worry about how to build a system to expose your data. 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact The Grassroots Infrastructure has allowed researchers, data scientists and breeders to perform a variety of data analyses such as sequence searching using BLAST, map-based interactive searches for field trial data and QTL parental genotype mapping, as well as bespoke software web services utilised by third parties such as the CerealsDB team at the University of Bristol as part of systems that they have developed for users.
URL https://grassroots.tools
 
Title Grassroots core server software 
Description The Grassroots Infrastructure project aims to create an easily-deployable suite of computing middleware tools to help users and developers gain access to scientific data infrastructure that can easily be interconnected. With the data-generative approaches that are increasingly common in modern life science research, it is vital that the data and metadata produced by these efforts can be shared and reused. The Grassroots Infrastructure project wraps up industry-standard software tools with a consistent API that can be federated on a number of levels. This means institutions and groups can deploy a simple lightweight virtual machine, expose local data, connect up any existing data services, and federate their instance of the Grassroots with others out-of-the-box. The Grassroots Infrastructure uses a controlled vocabulary of JSON messages to communicate, so any server or client that can understand JSON can be used to access and connect to the platform. We provide infrastructure to ensure that the scientific data remains the important factor, and not the worry about how to build a system to expose your data. 
Type Of Technology Software 
Year Produced 2019 
Open Source License? Yes  
Impact The Grassroots Infrastructure has allowed researchers, data scientists and breeders to perform a variety of data analyses such as sequence searching using BLAST, map-based interactive searches for field pathogenomic data and the field trial service, as well as bespoke software web services utilised by third parties such as the CerealsDB team at the University of Bristol as part of systems that they have developed for users.
URL https://grassroots.tools
 
Title Grassroots free-text search engine 
Description The Grassroots free-text search engine, based upon Lucene, allows us to give ranked, faceted results for various types of field trial data. Each facet automatically weights searches for its specific fields. For example, queries that match study names are ranked higher than those that match only the description field. This is used for general searches as well as for specific faceted search applications, such as the one we have for Measured Variables to denote phenotypic data.
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact This has allowed field trial data scientists to search across all of our data and allows them to search for the correct ontological terms to describe the phenotypic traits that have been measured within their trials. This has allowed researchers to be able to upload their data to our systems more quickly by allowing them to determine the correct ontological terms more easily. 
URL https://grassroots.tools/beta/public/SearchTreatment
 
Title Parental Genotype Service 
Description The Parental Genotype Service works with data from various cross-parental breeding lines with associated genotypic markers, along with which parent is responsible for their presence in the child line. It can accept various queries across this data.
Type Of Technology Webtool/Application 
Year Produced 2018 
Open Source License? Yes  
Impact As part of a collaboration with Paul Wilkinson at the University of Bristol and Luzie Wingen at the John Innes Centre, it is used as part of a QTL web service available from the CerealsDB website. 
URL http://www.cerealsdb.uk.net/cerealgenomics/CerealsDB/select_QTL.php
 
Title The Grassroots Infrastructure 
Description The Grassroots software is an open source "as-a-Service" stack that powers a number of data dissemination and analysis activities at EI, and other sites such as CerealsDB at the University of Bristol. We have continued to develop the functionality within the software stack to share crop-related datasets. 
Type Of Technology Webtool/Application 
Year Produced 2018 
Open Source License? Yes  
Impact Grassroots has previously been used to host the Field Pathogenomics project website and Yellow Rust map, the EI wheat BLAST service, the CerealsDB federation project, and the multi-scale improvements to the Polymarker marker design software. Recently, Grassroots has been put forward as the main data repository and metadata catalogue for the Designing Future Wheat project, and has started to host data from this project, the Open Wild Wheat Consortium, and 5 new wheat genomes from EI. The Grassroots service runs within the CyVerse UK National Capability infrastructure. 
URL https://grassroots.tools/
 
Title iRODS filename completion tool for BASH 
Description This is a tool to allow the iRODS client icommands to have auto-complete functionality within Bash as is the case with normal mounted filesystems. 
Type Of Technology Software 
Year Produced 2018 
Open Source License? Yes  
Impact For people using the iRODS client icommands, this saves time: instead of typing out full paths, which is error-prone and time-consuming, they can press the tab key to list all matching filenames, taking into account any characters already entered.
 
Description AI for Wheat workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The AI for Wheat workshop was a meeting of approximately 50 people from academia and industry to examine ways to use AI methods and algorithms on wheat-based data.
Year(s) Of Engagement Activity 2020
 
Description Attendance at the ELIXIR-UK All Hands 2020 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Rob Davey attended the ELIXIR-UK All Hands 2020 to take part in discussions about ongoing and possible future collaborations
Year(s) Of Engagement Activity 2020
URL https://www.earlham.ac.uk/elixir-uk-all-hands-2020
 
Description Attended and Presented a talk at the DFW All Hands 2020 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Rob Davey attended the DFW All Hands 2020 (DFW Annual Meeting) and presented a talk on the work being carried out in WP4
Year(s) Of Engagement Activity 2020
 
Description Blog about successful award application 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog describing our initial work on adding support for the Frictionless Data standard to our hosted data. This resulted in interest and feedback from new research groups about possible future work.
Year(s) Of Engagement Activity 2020
URL https://frictionlessdata.io/blog/2020/08/17/frictionless-wheat/
 
Description Blog after completion of project 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog describing our fully-completed work on adding support for the Frictionless Data standard to our hosted data. This resulted in interest and feedback from new research groups about possible future work.
Year(s) Of Engagement Activity 2021
URL https://frictionlessdata.io/blog/2021/03/05/frictionless-data-for-wheat/
 
Description Building infrastructure for open science - British Computer Society 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Invited speaker at the Advanced Programming Group annual Christmas lecture
Year(s) Of Engagement Activity 2015
URL http://www.bcs.org/category/18516
 
Description CerealsDB Workshop for end users 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Academics and breeders attended to get overviews and tutorials of the tools available within CerealsDB, Grassroots and Ensembl Plants. This gave them an opportunity to discover how these tools might be useful for them as well as giving feedback to us, as the developers of these platforms, to help plan useful future work.
Year(s) Of Engagement Activity 2018
 
Description DFW Hackathon 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact A workshop to discuss and implement potential collaborations to create tools to solve bioinformatic needs within the DFW community.
Year(s) Of Engagement Activity 2019
 
Description Data Stewardship in the Life Sciences 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I spoke at the "Challenges and Opportunities in Plant Science Data Management" workshop on the subject of data management in the life sciences.

Open data and integrative data sharing are fundamental to addressing the challenges of modern data-intensive science. There is a clear need to develop and maintain community-focussed, semantically-aware data stewardship and management platforms, such as COPO, that are able to cope with the description and sharing of potentially huge datasets arising from the life sciences. Once these data are made available, it is not sufficient to assume that researchers around the globe have the requisite skills and resources to analyse them. Therefore, we need to provide large-scale data analysis environments that are fit for purpose, such as CyVerse and Galaxy, incorporating state-of-the-art interfaces and programmatic layers to meet broad end-user requirements. Finally, this can only happen when there are community-led efforts to implement solutions for data standardisation, best practice, and FAIR data policy. We are only just starting to take advantage of groundbreaking opportunities to make integrated data a reality, thus enabling scientists to store, manage, and share their data as a first-class citizen of the scientific process.
Year(s) Of Engagement Activity 2017
URL http://app.core-apps.com/pag_2017/event/e2bec353017762d275ce250c23e011e6
 
Description Data, Data, Data Everywhere (Pint of Science talk, Norwich) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Dr Davey delivered a talk as part of the Norwich 2017 Pint of Science series about the challenges and solutions for modern data management in the life sciences, including recent data developments, high-performance computing, and software tools.
Year(s) Of Engagement Activity 2017
URL https://pintofscience.co.uk/event/crops-crystals-and-computers-technology-for-food-security
 
Description Divseek Working Group - Data Standards for Interoperable Tools 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact As part of the "DivSeek - Addressing the challenges and opportunities for information and data sharing associated with plant germplasm" session at PAG, I spoke about the DivSeek Data Standards for Interoperable Tools Working Group. This WG will promote best practice in data sharing in the plant sciences, through the use of open and interoperable software powered by the adoption of open standards, i.e. programmatic interoperability standards (APIs), controlled vocabularies, trait dictionaries, metadata standards, and ontologies. We aim to highlight gaps in interoperability that impede workflows important to the communities supported by DivSeek partners, by liaising with research development groups, other DivSeek working groups, and consortia with relevance to DivSeek. We will educate and train data generators about standards and the tools and resources that use them, in order to promote and foster standards-compliance for long-term open data stewardship.
Year(s) Of Engagement Activity 2017
URL https://pag.confex.com/pag/xxv/meetingapp.cgi/Paper/26202
 
Description Down The Tubes! Talk at the Norwich Science Festival 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Dr Davey gave a talk on the internet and data science entitled "Down The Tubes!" at the 2018 Norwich Science Festival.
Year(s) Of Engagement Activity 2018
URL https://norwichsciencefestival.co.uk/events/down-the-tubes/
 
Description Engagement with Industry - KWS UK Ltd 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Mr Bian and Miss Minotto demonstrated the EI CyVerse infrastructure and the features of the Grassroots Infrastructure, including data sharing, to staff from KWS UK Ltd: Ed Byrne, Janina Dordel, Andreas Menze and Vipul Patel.
Year(s) Of Engagement Activity 2018
 
Description Frictionless Data for Wheat - CSV Conf talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Gave a talk on Frictionless Data for Wheat as part of the Grassroots Infrastructure
Year(s) Of Engagement Activity 2021
URL https://csvconf.com/2021/
 
Description Frictionless Data for Wheat blog 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog to describe our work on tools for integrating Frictionless Data into the Grassroots Infrastructure as part of our successful grant application from the Frictionless community.
Year(s) Of Engagement Activity 2021
URL https://frictionlessdata.io/blog/2021/03/05/frictionless-data-for-wheat/
 
Description Frictionless Data for Wheat talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact As part of the Frictionless Data Community Call series, we gave a talk on the Frictionless Data functionality that has been developed as part of the Grassroots Infrastructure
Year(s) Of Engagement Activity 2021
 
Description Grassroots Field Trial Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact 25 people attended a workshop on submitting data to and using the Grassroots Field Trial system
Year(s) Of Engagement Activity 2022
 
Description Grassroots Infrastructure and the Wheat Information System (Genome 10K & Genome Science 2017) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Mr Bian and Dr Tyrrell presented a poster at the Genome 10K & Genome Science 2017 conference.
Year(s) Of Engagement Activity 2017
URL http://www.earlham.ac.uk/genome-10k-and-genome-science-conference
 
Description Grassroots Infrastructure and the Wheat Information System (RDA Interest Group on Agricultural Data (IGAD), Barcelona) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Dr Davey delivered a talk about the Grassroots software infrastructure for the dissemination of wheat data through federation and integration of storage and compute e-infrastructure.
Year(s) Of Engagement Activity 2017
URL https://www.rd-alliance.org/rda-interest-group-agricultural-data-igad-pre-plenary-meeting-3-4-april-...
 
Description Grassroots: An infrastructure for sharing services & data 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A talk at a conference on agricultural data to show the various applications available as part of the Grassroots Infrastructure for disseminating bioinformatics data.
Year(s) Of Engagement Activity 2019
 
Description Grassroots: Field Trials database presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Gave a talk on the Grassroots Field Trial system as part of the annual DFW all-hands meeting
Year(s) Of Engagement Activity 2021
 
Description Integrative Bioinformatics 2018 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Gave a workshop about the Grassroots Infrastructure and COPO to end users.
Year(s) Of Engagement Activity 2018
 
Description Laying the Foundations; Why are Semantics in Agriculture Difficult? - PAG 2020 talk in Plant Phenotypes workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Dr Davey gave an invited talk to approx 90 attendees at the PAG 2020 workshop "Plant Phenotypes"
Year(s) Of Engagement Activity 2020
 
Description Organiser of Challenges and Opportunities in Plant Science Data Management PAG workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Co-organiser of Challenges and Opportunities in Plant Science Data Management PAG workshop, which saw 6 international speakers deliver presentations on various aspects of data management in the plant sciences. Approx 50 attendees.
Year(s) Of Engagement Activity 2020
 
Description PhenoHarmonIS Workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A talk was given about the tools and data available within the Grassroots Infrastructure and how it uses standards to describe crop-based experimental data
Year(s) Of Engagement Activity 2018
URL https://sites.google.com/a/cgxchange.org/cropontologycommunity/2018-phenoharmonis
 
Description RDA Wheat Data Interoperability Working Group meeting, RDA Plenary, Barcelona 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The Wheat Data Interoperability Working Group aims to provide a common framework for describing, representing, linking and publishing Wheat data with respect to open standards. Such a framework will promote and sustain Wheat data sharing, reusability and operability. Specifying the Wheat linked data framework will come with many questions: which (minimal) metadata to describe which type of data? Which vocabularies/ontologies/formats? Which good practices? Mainly based on the needs of the Wheat Initiative Information System (WheatIS) in terms of functionalities and data types, the working group will identify relevant use cases in order to produce a "cookbook" on how to produce "wheat data" that are easily shareable, reusable and interoperable. This meeting saw the maturation of the Working Group into a Maintenance Group, showing that we have moved from an inception phase to an implementation phase, promoting the outputs of the WG (the Wheat Data Interoperability guidelines) to users.
Year(s) Of Engagement Activity 2016
URL https://www.rd-alliance.org/group/agricultural-data-ig-igad-wheat-data-interoperability-wg-agriseman...
 
Description Support open science and FAIRness through an integrated collaborative platform for life science: CyVerse UK and hosted services 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The Earlham Institute, an ELIXIR UK node, is home to CyVerse UK, a collaborative cyberinfrastructure for life science. CyVerse UK's objectives align closely with the ELIXIR vision, as it aims to ensure researchers have easy access to HTC resources while lowering the entry barrier to bioinformatics, thanks both to the ease of use of the platform and the training provided. Great focus is placed on data storage, management, and overall how to ensure FAIRness. The CyVerse Data Store and Data Commons come with attached metadata; in the latter case, a bare minimum set is required. Data availability and reliable data transfer take advantage of iRODS. The CyVerse cyberinfrastructure also hosts COPO and Grassroots, which are of particular interest to the data ecosystem. COPO is a brokering service between scientists and public repositories, enabling management, aggregation and publication of research outputs. COPO eases the process of metadata attribution by presenting the same intuitive interface for different repositories, and a wizard to guide the user through the steps of adding metadata. The Grassroots Genomics project aims to facilitate consistent approaches to generating, processing and disseminating public wheat datasets so that research efforts can be translated into resources valuable to the community through effective sharing and reuse of data. On the computational side, CyVerse UK offers a number of registered and versioned applications that users can run either using an API or through the parent CyVerse US web interface. Our last report shows how researchers not only from the UK, but also from Europe, America, Africa and Asia benefited from these applications. The CyVerse UK pool also hosts a Galaxy instance reserved for collaborators at BeCA. The expansion of the infrastructure will allow us to offer on-demand virtual machines to the research community to support them in development, training or collaborative virtual laboratories.
Year(s) Of Engagement Activity 2019
 
Description Visit from Collaborator from University of Bristol; Paul Wilkinson 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Visit from Collaborator from University of Bristol; Paul Wilkinson
Year(s) Of Engagement Activity 2018
 
Description Wheat Bioinformatics III workshop, South Africa 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact 25 people from across the crop industrial and academic sectors were trained in wheat bioinformatics at the third in a series of workshops organised by Diane Saunders, Burkhard Steuernagel and Rob Davey, funded by the UK High Commission in South Africa. These workshops are valuable for attendees to learn up-to-date computational and analytical techniques to make the most of their own and publicly available wheat data, feeding these skills directly into their breeding programmes.
Year(s) Of Engagement Activity 2020
 
Description Wheat Initiative group discussion at Plant and Animal Genome conference 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Discussion of the latest research activities from the Wheat Initiative members.
Year(s) Of Engagement Activity 2019
 
Description iRODS UGM 2021 Talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Gave a talk on our iRODS developments as part of our Grassroots Infrastructure
Year(s) Of Engagement Activity 2021
 
Description iRODS functionality within the Grassroots Infrastructure (iRODS User Group Meeting 2017, Utrecht, The Netherlands) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Dr Tyrrell presented work on the development of the eirods-dav software package for the Grassroots data dissemination platform.
Year(s) Of Engagement Activity 2017
URL https://irods.org/ugm2017/