iPlant UK
Lead Research Organisation:
University of Warwick
Department Name: Warwick Systems Biology Centre
Abstract
Biology is increasingly a 'big data' science as new high-throughput technologies support faster, cheaper generation of sequencing, metabolite and image data. This enables potentially exciting breakthroughs as researchers spot undiscovered patterns and make new discoveries of biological importance. However, many individual biologists, and in some areas the community as a whole, struggle to take full advantage of the data generated because of a lack of computing resource, appropriate support and technical skill. It is not only the output of data analyses, such as models, curated datasets, or raw data, that has value to the wider community, but also the tools generated during research projects to help researchers test and validate their hypotheses. Currently these tools often remain in prototype form, for use only within the group or laboratory that generated them, because there is comparatively little standardisation and no easy means of sharing an accessible, user-friendly version of the tool.
To undertake world-class bioscience, researchers therefore need to be able to store and access datasets, models and analysis tools, ideally from different locations across the globe due to the need for international collaboration. The iPlant Collaborative was funded by US agency the National Science Foundation (NSF) in 2008 to help solve these issues. The iPlant Data Store is a cloud-based storage space, accessed via iPlant's Discovery Environment (DE), a virtual work/lab bench. In the DE, users can share datasets and tools to analyse data with as many or as few people as they wish. Tools to analyse data developed by iPlant staff or built by others can be shared with the wider community, in a similar manner to 'apps' on smartphones.
The iPlant Collaborative is currently distributed across three US locations; we propose to extend this into an international collaboration by building a UK iPlant node at The Genome Analysis Centre (TGAC). TGAC provides the National Capability of computational infrastructure and as such is perfectly situated to provide the foundations for the iPlant UK node. The UK iPlant node would provide independent versions of the iPlant Data Store and DE but would also be linked to the US nodes to share resources and expertise. Physical resource alone is not sufficient for a successful infrastructure: it also needs to be used, maintained and expanded as demand increases. To demonstrate the versatility, power and value of iPlant UK, a dedicated team of programmers based at the Universities of Warwick, Liverpool and Nottingham will adapt tools that have been generated for use in a single project for wider community adoption. Three suites of tools to benefit key areas of UK plant science - sequencing, systems biology and image analysis - will be made available to the global plant research community via the iPlant DE.
In less than 10 years, iPlant has built a global user base of over 18,500 users. As this continues to expand, iPlant's future sustainability must be considered. A UK iPlant node will help ensure the future existence and reliability of iPlant, spread expertise and best practice between the UK and US, allow the UK to input to the future direction of this valuable resource and provide an exemplar project to others wishing to establish future international iPlant nodes.
By establishing iPlant UK and promoting access to a resource that allows users to readily store and analyse their data, this project will help support a wide range of research including genome-wide association projects exploiting natural variation in crops, predicting biological networks and pathways, and the high-throughput imaging and image analysis services that take researchers one step closer to bridging the genotype to phenotype gap.
Technical Summary
New technologies such as next generation sequencing (NGS), high-throughput phenotyping and metabolite profiling have made large data sets, several terabytes in size, a common feature of modern plant biology. However, intelligent re-use and impact of this data is not always fully realised due to a lack of data storage capacity, compute power for analysis and technical skills (which often have to be self-taught or accessed via a collaborator), and to limited tool sharing within the community. The NSF-funded iPlant Collaborative aims to help mitigate these problems. It provides three core services: the Data Store, for cloud-based large data storage and retrieval; the Discovery Environment (DE), for user-friendly data analysis software; and Atmosphere, a platform allowing researchers to custom-build virtual workbenches and share these with collaborators anywhere in the world. Data analysis in the DE is achieved via apps, which are built either by iPlant developers or by users. iPlant is structured as a distributed model within the US, spreading effort, expertise and resources between the Texas Advanced Computing Center (TACC), Cold Spring Harbor Laboratory, and the University of Arizona. It was designed with extension and replication in mind, and we propose taking advantage of iPlant's federation capabilities to develop a UK iPlant node at The Genome Analysis Centre (TGAC). To encourage uptake and demonstrate the power of iPlant services, three suites of tools in the areas of systems biology, image analysis and sequencing data, which are currently only suitable for use by a small number of experts, will be optimised for HPC and adapted for the iPlant environment, thus widening their applicability and user base. A small number of additional tools from the wider community will also be adapted for use in the iPlant Environment via an extended collaborative support programme.
Planned Impact
The principal beneficiaries of iPlant UK are research scientists in academia and industry, BBSRC and other funding bodies. The three suites of tools, covering systems biology, sequencing data management and image-based phenomics, will deliver the first applications to iPlant UK and in doing so will provide proof of concept and establish guidelines and best practice for future users who wish to share their own command line-based research tools via iPlant. This proposal will increase the availability of BBSRC-funded tools for the global community and will help build a common international biological science platform that prevents duplication of effort and funding. In doing so, it encourages rational and supported reuse of data, applications and resources.
As the planned community tool development to prime and troubleshoot the system is focused on plant science applications, the main initial beneficiaries will be the plant science research community, from students to senior researchers. However, many of the tools are generic and can be used with any compatible dataset from any organism. Ultimately, iPlant UK will be a community resource for all biologists: the long-term beneficiaries will be anyone working with big data.
Funding bodies will also benefit from iPlant UK. Although sharing raw data has become a standard requirement for publication in recent years, sharing tools developed for data analysis and visualisation is not typical. Where they are shared, whether through an institutional repository or a third-party open data web service such as Figshare or Dryad, their use may be limited by differences in operating systems or the expertise of new users. iPlant UK will provide the tools, guidelines and the platform for developers to share their command line-based workflows with the research community in a user-friendly way. More of the output from publicly funded UK research will therefore be accessible to the wider national and international research community.
Although there is limited opportunity for outreach directly via the personnel requested in this project, all services from the iPlant Collaborative, including the Atmosphere cloud computing platform and the DNA Subway undergraduate teaching tool, will be promoted through invited talks and guest blog posts/articles by the PIs from the iPlant UK team.
Organisations
- University of Warwick (Lead Research Organisation)
- IBM (Collaboration)
- University of Western Australia (Collaboration)
- Monogram Network (Collaboration)
- International Centre for Maize and Wheat Improvement (CIMMYT) (Collaboration)
- Helmholtz Association of German Research Centres (Collaboration)
- French National Institute of Agricultural Research (Collaboration)
- University of California, Davis (Collaboration)
- EMBL European Bioinformatics Institute (EMBL - EBI) (Collaboration)
- U.S. Department of Agriculture USDA (Collaboration)
- Rothamsted Research (Collaboration)
- University of Szeged (Collaboration)
- Cold Spring Harbor Laboratory (CSHL) (Collaboration)
- DivSeek International (Collaboration)
- CGIAR (Collaboration)
- University of Bristol (Collaboration)
Publications
- Gardiner LJ (2016) Mapping-by-sequencing in complex polyploid genomes using genic sequence capture: a case study to map yellow rust resistance in hexaploid wheat. In The Plant journal : for cell and molecular biology
- Pound MP (2017) Deep machine learning provides state-of-the-art performance in image-based plant phenotyping. In GigaScience
- Leonelli S (2017) Data management and best practice for plant science. In Nature plants
- Polanski K (2018) Bringing numerous methods for expression and promoter analysis to a public cloud computing service. In Bioinformatics (Oxford, England)
- Bhosale R (2018) A mechanistic framework for auxin dependent Arabidopsis root hair elongation to low external phosphate. In Nature communications
- Orosa-Puente B (2018) Root branching toward water involves posttranslational modification of transcription factor ARF7. In Science (New York, N.Y.)
- Giri J (2018) Rice auxin influx carrier OsAUX1 facilitates root hair elongation in response to low external phosphate. In Nature communications
- Banda J (2019) Lateral Root Formation in Arabidopsis: A Well-Ordered LRexit. In Trends in plant science
- Coulton A (2020) AutoCloner: automatic homologue-specific primer design for full-gene cloning in polyploids. In BMC bioinformatics
- Arnaud E (2020) The Ontologies Community of Practice: A CGIAR Initiative for Big Data in Agrifood Systems. In Patterns (New York, N.Y.)
- Von Wangenheim D (2020) Early developmental plasticity of lateral roots in response to asymmetric water availability. In Nature plants
- Shaw F (2020) COPO: a metadata platform for brokering FAIR data in the life sciences. In F1000Research
- Pandey B (2021) Plant roots sense soil compaction through restricted ethylene diffusion. In Science
Description | BEIS/UKRI/RCUK Cloud Workshop, London, 24-10-2017 |
Geographic Reach | National |
Policy Influence Type | Contribution to a national consultation/review |
Description | UKRI Data Infrastructure Roadmap |
Geographic Reach | National |
Policy Influence Type | Contribution to a national consultation/review |
Description | UKRI Supercomputing Roadmap |
Geographic Reach | National |
Policy Influence Type | Contribution to a national consultation/review |
Description | 16ALERT |
Amount | £283,383 (GBP) |
Funding ID | BB/R000662/1 |
Organisation | Biotechnology and Biological Sciences Research Council (BBSRC) |
Sector | Public |
Country | United Kingdom |
Start | 07/2017 |
End | 08/2018 |
Description | A computational cloud framework for the study of gene families |
Amount | £181,000 (GBP) |
Funding ID | BB/N023145/1 |
Organisation | Biotechnology and Biological Sciences Research Council (BBSRC) |
Sector | Public |
Country | United Kingdom |
Start | 03/2017 |
End | 09/2018 |
Description | International Wheat Yield Partnership (IWYP). |
Amount | $2,000,000 (USD) |
Organisation | Biotechnology and Biological Sciences Research Council (BBSRC) |
Sector | Public |
Country | United Kingdom |
Start | 01/2016 |
End | 01/2019 |
Title | CGCore v2 Improvements |
Description | As part of the collaboration between the EI COPO project and the CGIAR Big Data Platform, we worked with CGIAR and Crop Ontology developers to improve the CG Core v2 schema for describing CGIAR digital outputs. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | Globally, this work will affect all CGIAR Data Managers and all users of the COPO platform who deposit data into CG Centre repositories. |
URL | https://github.com/collaborative-open-plant-omics/cgcore_schema |
Description | Computational biology for Genomics |
Organisation | IBM |
Department | IBM UK Labs Ltd |
Country | United Kingdom |
Sector | Private |
PI Contribution | We have had scoping meetings and work with Ritesh Krishna on the project |
Collaborator Contribution | Initial sharing of expertise |
Impact | Paper https://doi.org/10.1101/2021.02.04.429826 Code https://github.com/JoshuaColmer/HallCircadian |
Start Year | 2017 |
Description | DivSeek Partnership |
Organisation | DivSeek International |
Sector | Learned Society |
PI Contribution | I bring infrastructure expertise to this partnership, influencing and impacting policy to provide computational and training capacity to other DivSeek partners. I promote the range of infrastructure projects that are developed in my group at EI, but also solutions developed at other centres that can contribute to the DivSeek consortium. Partners are exposed to EI projects such as COPO, Grassroots (Wheat Information System, CerealsDB, marker design), CyVerse UK and Galaxy, through working group communications and meetings at international conferences such as PAG and RDA. I lead the Data Standards for Interoperable Tools working group, and we aim to collate community-suggested standards and tools, and advise the partnership and their stakeholders in best practice for delivery of sustainable and interoperable infrastructure. |
Collaborator Contribution | The DivSeek consortium contributes expertise and knowledge exchange in advances in crop diversity, improving our networking and understanding of challenges and potential solutions to social, structural, and biological problems. With over 66 global partners including EI, this is a powerful and highly respected group of research institutes that are working together to enable a step change in efficiency of interactions, leading to improved crop diversity research and data sharing. |
Impact | EI is a founding partner of DivSeek, and Dr Davey leads one of the new working groups, "Data Standards for Interoperable Tools" (http://www.divseek.org/standards/) |
Start Year | 2015 |
Description | Identification of genes underlying clock mutants in Arabidopsis |
Organisation | University of Szeged |
Country | Hungary |
Sector | Academic/University |
PI Contribution | Sequenced two Arabidopsis mutants and bioinformatically identified candidate genes |
Collaborator Contribution | Provided us with mutants |
Impact | none yet |
Start Year | 2016 |
Description | Integration of COPO and CGCore Schemas and Associated Repositories |
Organisation | CGIAR |
Country | France |
Sector | Charity/Non Profit |
PI Contribution | We have developed a proof-of-concept platform to streamline metadata attribution and dataset deposition into CGIAR repositories using the BBSRC-funded COPO software. Drs Etuk and Shaw, two Research Software Engineers in the Davey group at Earlham Institute and the original core developers, have implemented various new features into COPO to allow CGIAR Data Managers to harmonise and streamline the submission of CG-relevant metadata and data into the CG digital data repositories. All software and infrastructure is hosted within the CyVerse UK cloud. We have: - Implemented support for CG Core v.2.0 (http://repo.mel.cgiar.org/handle/20.500.11766/4764) metadata annotation of various data types, including publications, produced at the CGIAR institutes via the existing COPO wizard system. - Implemented support for submission of annotated objects to institutional instances of the following repositories: dSpace (https://www.duraspace.org/dspace/), CKAN (https://ckan.org/) and Dataverse (https://dataverse.org/). - Designed and implemented a mechanism within COPO which controls which users can submit to which repositories. - Implemented support for the annotation of variables within data sets (i.e. column headings, experiment condition descriptors etc.) with terms and URIs from ontologies or controlled vocabularies/trait dictionaries (AGROVOC and GACS). |
Collaborator Contribution | CGIAR have provided coordination contributions with key members in the CG Centres to gather feedback on developed elements, as well as provided funds to allow a core CGCore metadata schema developer to travel to EI and work with Drs Etuk and Shaw to improve the CGCore schema. |
Impact | This collaboration has seen rapid development of key functionality in the COPO platform to support CG centre Data Managers. This has required technical skills to develop the software, biocuration expertise provided by CGIAR to improve and refine the CGCore metadata schema, ontology expertise from the Bioversity team in Montpellier, and coordination expertise from Dr Davey (EI) and Medha Devare (CGIAR). Software and Technical Products (Webtool/Application - Collaborative Open Plant Omics (COPO) (2017)): All software code developed is open source and can be found within the COPO Github repository: https://github.com/collaborative-open-plant-omics/COPO |
Start Year | 2018 |
Description | Wheat Information System (WheatIS) |
Organisation | Cold Spring Harbor Laboratory (CSHL) |
Country | United States |
Sector | Charity/Non Profit |
PI Contribution | The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative. |
Collaborator Contribution | All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project. |
Impact | This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/ |
Start Year | 2011 |
Description | Wheat Information System (WheatIS) |
Organisation | EMBL European Bioinformatics Institute (EMBL - EBI) |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative. |
Collaborator Contribution | All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project. |
Impact | This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/ |
Start Year | 2011 |
Description | Wheat Information System (WheatIS) |
Organisation | French National Institute of Agricultural Research |
Department | INRA Versailles |
Country | France |
Sector | Academic/University |
PI Contribution | The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative. |
Collaborator Contribution | All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project. |
Impact | This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/ |
Start Year | 2011 |
Description | Wheat Information System (WheatIS) |
Organisation | Helmholtz Association of German Research Centres |
Department | Helmholtz Zentrum Munchen |
Country | Germany |
Sector | Academic/University |
PI Contribution | The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative. |
Collaborator Contribution | All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project. |
Impact | This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/ |
Start Year | 2011 |
Description | Wheat Information System (WheatIS) |
Organisation | International Centre for Maize and Wheat Improvement (CIMMYT) |
Country | Mexico |
Sector | Charity/Non Profit |
PI Contribution | The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative. |
Collaborator Contribution | All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project. |
Impact | This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/ |
Start Year | 2011 |
Description | Wheat Information System (WheatIS) |
Organisation | Monogram Network |
Sector | Academic/University |
PI Contribution | The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative. |
Collaborator Contribution | All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project. |
Impact | This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/ |
Start Year | 2011 |
Description | Wheat Information System (WheatIS) |
Organisation | Rothamsted Research |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative. |
Collaborator Contribution | All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project. |
Impact | This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/ |
Start Year | 2011 |
Description | Wheat Information System (WheatIS) |
Organisation | U.S. Department of Agriculture USDA |
Department | Agricultural Research Service |
Country | United States |
Sector | Public |
PI Contribution | The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative. |
Collaborator Contribution | All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project. |
Impact | This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/ |
Start Year | 2011 |
Description | Wheat Information System (WheatIS) |
Organisation | University of Bristol |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative. |
Collaborator Contribution | All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project. |
Impact | This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/ |
Start Year | 2011 |
Description | Wheat Information System (WheatIS) |
Organisation | University of California, Davis |
Department | UC Davis College of Biological Sciences |
Country | United States |
Sector | Academic/University |
PI Contribution | The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative. |
Collaborator Contribution | All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project. |
Impact | This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/ |
Start Year | 2011 |
Description | Wheat Information System (WheatIS) |
Organisation | University of Western Australia |
Country | Australia |
Sector | Academic/University |
PI Contribution | The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative. |
Collaborator Contribution | All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project. |
Impact | This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/ |
Start Year | 2011 |
Title | APPLES - Analysis of Plant Promoter-Linked Elements |
Description | The APPLES software package is a set of tools to analyse promoter sequences on a genome-wide scale. Two functionalities are provided in this version: 1. Finding Orthologs as Reciprocal Best Hits (APPLES_rbh) 2. Finding Non-Coding Conserved Regions (APPLES_conservation). |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | By publishing this tool on the CyVerse Discovery Environment, its accessibility to the research community has been greatly improved. |
URL | https://de.cyverse.org/de/?type=apps&app-id=d99ca952-dbe2-11e6-9e37-0242ac120003 |
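The reciprocal-best-hits idea named in the APPLES description can be illustrated with a minimal sketch. This is not the APPLES_rbh implementation: the function names, the score dictionaries and the gene identifiers below are hypothetical, and the similarity scores are assumed to come from some pairwise sequence comparison run in both directions.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    `scores` maps (query, subject) pairs to a similarity score
    (hypothetical input format, chosen for illustration only).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Made-up scores for two species, A and B.
a_vs_b = {
    ("A1", "B1"): 250.0,
    ("A1", "B2"): 90.0,
    ("A2", "B2"): 180.0,
    ("A3", "B2"): 200.0,
}
b_vs_a = {
    ("B1", "A1"): 245.0,
    ("B2", "A3"): 198.0,
    ("B2", "A2"): 170.0,
}
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

In this toy input, A2 is not paired with B2 because B2's own best hit is A3; only pairs that agree in both directions are kept.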
Title | BHC - Bayesian Hierarchical Clustering |
Description | A clustering algorithm for expression data, originally made available in R, that allows the analysis of either time course or multiple static datasets |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | By publishing this tool on the CyVerse Discovery Environment, its accessibility to the research community has been greatly improved. |
URL | https://de.cyverse.org/de/?type=apps&app-id=1e03e32e-4e87-11e6-bd1d-0242ac120003 |
Title | BWA_Alignment-_produces_sorted_+_indexed_BAM_output |
Description | Workflow - Burrows Wheeler MEM alignment into samtools BAM sorting |
Type Of Technology | Software |
Year Produced | 2015 |
Impact | none |
URL | http://www.iplantcollaborative.org |
Title | Bisque-compliant Roottrace |
Description | This is a re-implementation of the software tool Roottrace, recoded to fit into iPlant's Bisque environment. The major difficulty was supporting the user input required by Roottrace, which does not fit the basic Bisque model. |
Type Of Technology | Software |
Year Produced | 2017 |
Open Source License? | Yes |
Impact | Roottrace can now be made available to a wider community via iPlant. |
URL | https://github.com/Khalid-ismail/RootTrace_iPlant |
Title | Bowtie-2.2.1--Build-and-Map_for_workflows |
Description | Bowtie 2 alignment, utilised by the virus read filter aligner app |
Type Of Technology | Software |
Year Produced | 2015 |
Impact | none |
URL | http://www.iplantcollaborative.org |
Title | CSI - Causal Structure Inference |
Description | A network inference algorithm capable of inferring causal regulatory network models from time course expression data |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | By publishing this tool on the CyVerse Discovery Environment, its accessibility to the research community has been greatly improved. |
URL | https://de.cyverse.org/de/?type=apps&app-id=12659e20-1c39-11e6-8842-0242ac120003 |
Title | Collaborative Open Plant Omics (COPO) |
Description | COPO streamlines the process of data deposition to public repositories by hiding much of the complexity of metadata capture and data management from the end-user. The ISA infrastructure (www.isa-tools.org) is leveraged to provide the interoperability between metadata formats required for seamless deposition to repositories. COPO facilitates the links to data analysis platforms such as CyVerse UK and Galaxy. Logical groupings of artefacts (e.g. PDFs, raw data, contextual supplementary information) relating to a body of work are stored in COPO collections and represented by common standards, which are publicly searchable. Bundles of multiple data objects themselves can then be deposited directly into public repositories through COPO interfaces. This improvement output represents the beta release of the COPO platform in 2017. |
Type Of Technology | Webtool/Application |
Year Produced | 2017 |
Open Source License? | Yes |
Impact | COPO has been added to the ELIXIR-UK roadmap for ELIXIR core data services, and is currently being used by EI and JIC researchers to deposit real, large scale sequencing datasets into the European Nucleotide Archive. COPO is also being investigated as a potential data entry tool for the CGIAR Big Data project, and this will be explored in a joint EAGER submission with CIMMYT. COPO has also been selected to act as one of the data ingestion pipelines for data arising from the Designing Future Wheat programme, depositing open data into the Grassroots repository. COPO is also being included in grant submissions to assist vertebrate and wheat communities in effective metadata management. COPO runs within the CyVerse UK National Capability infrastructure. |
URL | https://copo-project.org |
Title | CyVerse UK software stack deployment |
Description | The CyVerse (formerly iPlant) UK project at EI provides hardware resources in an easy-to-use manner through a web interface called the Discovery Environment (DE), as well as developer and bioinformatician access through APIs and software. A series of commands, called a pipeline, is packaged as a script and/or a Docker container image (a virtualised operating system image). The pipeline can run on any hardware available to the implementer, which in this case will be the extensive HTCondor cluster set up at EI. Once a pipeline is running correctly through the raw scheduler, the app can be registered on the Agave API (http://www.agaveapi.co). This is done by constructing JSON files that specify input sources together with the user-supplied and default parameters that are necessary for the pipeline to run. Once a pipeline is registered through Agave, it is available as a GUI "app" through the DE, and can be made public after testing. |
Type Of Technology | Grid Application |
Year Produced | 2016 |
Impact | The EI CyVerse hardware enables the bioinformatics pipelines developed by the project partners (the Universities of Liverpool, Nottingham and Warwick) to be run on this HPC environment. Once deployed in the CyVerse UK environment, these tools can then be made available globally through the CyVerse Discovery Environment, reaching upwards of 18,000 potential users. We have released this infrastructure and are accepting users from the UK research community to start using the hardware. |
URL | http://cyverseuk.org/about/cyverse-uk-projects/tgac/ |
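The registration step described in this record, where a JSON file lists a pipeline's input sources and its default parameters, might look roughly like the sketch below. The field names are modelled on Agave app descriptions but are assumptions here and should be checked against the Agave documentation; the app name, system name and parameter values are entirely hypothetical, and a real description would need further fields.

```python
import json


def build_app_description(name, version, execution_system, inputs, parameters):
    """Assemble a dictionary shaped like an app description (illustrative only).

    `inputs` maps an input id to its default source (empty string = user must supply).
    `parameters` maps a parameter id to a (type, default) tuple.
    """
    return {
        "name": name,
        "version": version,
        "executionSystem": execution_system,  # assumed field name
        "inputs": [
            {"id": input_id, "value": {"default": default, "required": True}}
            for input_id, default in inputs.items()
        ],
        "parameters": [
            {"id": param_id, "value": {"type": param_type, "default": default}}
            for param_id, (param_type, default) in parameters.items()
        ],
    }


# Hypothetical pipeline with one input file and one tunable parameter.
app = build_app_description(
    name="example-pipeline",
    version="0.1.0",
    execution_system="example-htcondor-system",
    inputs={"reads": ""},
    parameters={"threads": ("number", 4)},
)

# Serialise to the JSON text that would be submitted for registration.
app_json = json.dumps(app, indent=2)
```

Keeping the description as plain data makes it easy to review the user-supplied and default parameters before the app is registered and later made public.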
Title | DFW cloud HPC resources |
Description | Designing Future Wheat researchers are able to request virtual machines within CyVerse UK to undertake bioinformatics analysis. |
Type Of Technology | Grid Application |
Year Produced | 2019 |
Impact | We have produced a robust and secure cloud framework within CyVerse UK to allow DFW researchers to access DFW and public data to analyse, as well as upload their own. We have already completed two successful pilot projects with external collaborators, and are now making the services available to all DFW researchers. |
URL | http://cyverseuk.org/about/collaborations/designing-future-wheat/ |
Title | Filter_Virus_Associated_Reads_From_Host_Reads |
Description | Workflow - Bowtie 2 into samtools filtering of reads |
Type Of Technology | Software |
Year Produced | 2015 |
Impact | none |
URL | http://www.iplantcollaborative.org |
Title | GP2S - Gaussian Process Two-Sample test of Differential Expression |
Description | A differential expression algorithm for time series data with a two-condition (e.g. control/treated) experimental design |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | By publishing this tool on the CyVerse Discovery Environment, its accessibility to the research community has been greatly improved. |
URL | https://de.cyverse.org/de/?type=apps&app-id=655a8432-7432-11e6-a6f8-0242ac120003 |
Title | GWASSER app |
Description | GWASSER is an R based script for performing simple genome wide association using statistical modelling. |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | The GWASSER app is now available for users of the CyVerse UK platform. |
URL | http://cyverseuk.org/applications/gwasser/ |
Title | Gradient Tool |
Description | An algorithm for the identification of the time of change from single condition time course expression data |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | By publishing this tool on the CyVerse Discovery Environment, its accessibility to the research community has been greatly improved. |
URL | https://de.cyverse.org/de/?type=apps&app-id=11d9f454-78d4-11e6-9314-0242ac120003 |
Title | HMT - Hypergeometric Motif Test |
Description | A transcription factor binding site overrepresentation analysis algorithm for known motifs |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | By publishing this tool on the CyVerse Discovery Environment, its accessibility to the research community has been greatly improved. |
URL | https://de.cyverse.org/de/?type=apps&app-id=818d8ce0-5e4c-11e6-ac0d-0242ac120003 |
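The kind of overrepresentation test suggested by the HMT name can be sketched as a hypergeometric upper-tail probability. This is a generic illustration and not the HMT app's code: the function name, argument names and the numbers in the example are assumptions.

```python
from math import comb


def hypergeom_upper_tail(population, with_motif, sample, observed):
    """P(X >= observed) when drawing `sample` promoters without replacement
    from `population` promoters, of which `with_motif` contain the motif."""
    total = comb(population, sample)
    upper = min(with_motif, sample)
    favourable = sum(
        comb(with_motif, i) * comb(population - with_motif, sample - i)
        for i in range(observed, upper + 1)
    )
    return favourable / total


# Toy numbers: 10 promoters, 4 contain the motif, a cluster of 3 genes,
# 2 of which carry the motif.
p_value = hypergeom_upper_tail(population=10, with_motif=4, sample=3, observed=2)
```

A small tail probability would suggest the motif occurs in the gene cluster more often than expected by chance; in practice, testing many motifs would also call for a multiple-testing correction.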
Title | Local CyVerse Discovery Environment |
Description | Full-stack deployment of CyVerse Discovery Environment |
Type Of Technology | Software |
Year Produced | 2016 |
Open Source License? | Yes |
Impact | By having a local implementation of the DE infrastructure, we can (1) test our software without the delay caused by involving CyVerse US, and (2) have an independent platform to share our software and data. |
URL | https://cyverse.warwick.ac.uk/de/ |
Title | MEME-LaB |
Description | A transcription factor binding site overrepresentation analysis algorithm with novel motif discovery |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | By publishing this tool on the CyVerse Discovery Environment, its accessibility to the research community has been greatly improved. |
URL | https://de.cyverse.org/de/?type=apps&app-id=b781fc48-8edd-11e6-b4ab-0242ac120003 |
Title | Mikado app |
Description | Developed at EI, Mikado is a lightweight Python3 pipeline to identify the most useful or "best" set of transcripts from multiple transcript assemblies. |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | Mikado is now a DE app, and is available to users of the CyVerse environment. |
URL | http://cyverseuk.org/applications/mikado-determine-and-select-the-best-rna-seq-prediction/ |
Title | Polymarker app |
Description | PolyMarker is an automated bioinformatics pipeline for SNP assay development which increases the probability of generating homoeologue-specific assays for polyploid wheat. |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | The Polymarker app is now available for users through the CyVerse platform. |
URL | http://cyverseuk.org/applications/polymarker/ |
Title | SAM_to_BAM_format_conversion |
Description | Converts SAM format files to BAM format files |
Type Of Technology | Software |
Year Produced | 2015 |
Impact | none |
URL | http://www.iplantcollaborative.org |
Title | Samtools_Flagstat |
Description | Analyse the quality of the alignment contained within a BAM file |
Type Of Technology | Software |
Year Produced | 2015 |
Impact | none |
URL | http://www.iplantcollaborative.org |
Title | Samtools_bamtofastq |
Description | Converts BAM files to FASTQ files |
Type Of Technology | Software |
Year Produced | 2015 |
Impact | none |
URL | http://www.iplantcollaborative.org |
Title | Samtools_bamtofastq__Version_1.2_-_with_options |
Description | Updated version of Samtools_bamtofastq, with control over additional input arguments |
Type Of Technology | Software |
Year Produced | 2015 |
Impact | none |
URL | http://www.iplantcollaborative.org |
Title | Samtools_rmdup_-_remove_PCR_duplicates |
Description | Removes PCR duplicates from a BAM file |
Type Of Technology | Software |
Year Produced | 2015 |
Impact | none |
URL | http://www.iplantcollaborative.org |
Title | Samtools_sort_-_sort_BAM_file__app_for_workflows |
Description | BAM file sorter, utilised by the BWA aligner app |
Type Of Technology | Software |
Year Produced | 2015 |
Impact | none |
URL | http://www.iplantcollaborative.org |
Title | Samtools_view_-_Filter_mapped_or_unmapped_reads |
Description | Filtering of mapped or unmapped reads, utilised by the virus read filter aligner app |
Type Of Technology | Software |
Year Produced | 2015 |
Impact | none |
URL | http://www.iplantcollaborative.org |
Title | TCAP - Temporal Clustering by Affinity Propagation |
Description | A clustering algorithm for time course expression data that identifies complex regulatory groups thanks to a rich information measure |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | By publishing this tool on the CyVerse Discovery Environment, its accessibility to the research community has been greatly improved. |
URL | https://de.cyverse.org/de/?type=apps&app-id=d874c350-ad90-11e6-a854-0242ac120003 |
Title | The Grassroots Infrastructure |
Description | The Grassroots software is an open source "as-a-Service" stack that powers a number of data dissemination and analysis activities at EI, and other sites such as CerealsDB at the University of Bristol. We have continued to develop the functionality within the software stack to share crop-related datasets. |
Type Of Technology | Webtool/Application |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | Grassroots has previously been used to host the Field Pathogenomics project website and Yellow Rust map, the EI wheat BLAST service, the CerealsDB federation project, and the multi-scale improvements to the Polymarker marker design software. Recently, Grassroots has been put forward as the main data repository and metadata catalogue for the Designing Future Wheat project, and has started to host data from this project, the Open Wild Wheat Consortium, and 5 new wheat genomes from EI. The Grassroots service runs within the CyVerse UK National Capability infrastructure. |
URL | https://grassroots.tools/ |
Title | Tuxedo_suite_PE_up_to_4_conditions |
Description | A complete Tuxedo suite workflow, going from RNA-Seq reads to differentially expressed gene lists. Utilises Tophat, Cufflinks/Cuffdiff and CummeRbund |
Type Of Technology | Software |
Year Produced | 2015 |
Impact | none |
URL | http://www.iplantcollaborative.org |
Title | Wellington Bootstrap |
Description | An algorithm for the identification of regions occupied by proteins in DNase-seq data, performing a differential analysis between two samples |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | By publishing this tool on the CyVerse Discovery Environment, its accessibility to the research community has been greatly improved. |
URL | https://de.cyverse.org/de/?type=apps&app-id=cbf83e84-1cf1-11e6-b710-0242ac120003 |
Title | Wellington Footprint |
Description | An algorithm for the identification of regions occupied by proteins in DNase-seq data |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | By publishing this tool on the CyVerse Discovery Environment, its accessibility to the research community has been greatly improved. |
URL | https://de.cyverse.org/de/?type=apps&app-id=035655fc-2736-11e6-ac3b-0242ac120003 |
Title | Wigwams |
Description | An algorithm for the extraction of gene groups co-regulated across subsets of multiple time course datasets |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | By publishing this tool on the CyVerse Discovery Environment, its accessibility to the research community has been greatly improved. |
URL | https://de.cyverse.org/de/?type=apps&app-id=d5d04224-1cf8-11e6-81c4-0242ac120003 |
Title | hCSI - Hierarchical Causal Structure Inference |
Description | An expansion of CSI network inference to handle multiple time course datasets |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | By publishing this tool on the CyVerse Discovery Environment, its accessibility to the research community has been greatly improved. |
URL | https://de.cyverse.org/de/?type=apps&app-id=ae88f3b0-1c3e-11e6-b0d6-0242ac120003 |
Title | kallisto app |
Description | kallisto is a program for quantifying abundances of transcripts from RNASeq data. |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | The kallisto app is now available for users of the CyVerse UK platform. |
URL | http://cyverseuk.org/applications/kallisto/ |
Title | oCSI - Orthologous Causal Structure Inference |
Description | An expansion of CSI network inference to handle data from multiple organisms |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | By publishing this tool on the CyVerse Discovery Environment, its accessibility to the research community has been greatly improved. |
URL | https://de.cyverse.org/de/?type=apps&app-id=429173d2-1c46-11e6-aaba-0242ac120003 |
Description | Building infrastructure for open science - British Computer Society |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Invited speaker at the Advanced Programming Group annual Christmas lecture |
Year(s) Of Engagement Activity | 2015 |
URL | http://www.bcs.org/category/18516 |
Description | CyVerse UK Workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Other audiences |
Results and Impact | This meeting is focused on researchers who are either toward the beginning of their studies or have moved into a new subject area. We will provide hands-on sessions that describe the use of software tools that can interrogate RNAseq, imaging, gene expression or GWAS data. Previous CyVerse users will provide real-life examples of how the software has been successfully used. This is the Learner Track. In addition, we will host a concurrent track for more experienced bioinformaticians who wish to learn how to use CyVerse to host their own programs. This is the Intermediate Track. The concurrent tracks will run in separate rooms. These software tools have been developed as part of the CyVerseUK grant. We will also highlight the opportunities that exist for the sharing of big data in a meaningful manner. This workshop is organised by GARNet with Professor Katherine Denby at the University of York. |
Year(s) Of Engagement Activity | 2017 |
URL | http://cyverseuk.org/events/cyverse-uk-workshop/ |
Description | CyVerse for Brassica: Performing Associative Transcriptomics by Integrating with Sequence and Phenotype Repositories |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | As part of the "CyVerse - Software, Tools, and Services for Data-Driven Discovery" workshop, Annemarie Eckes in the Davey group at EI spoke about Associative Transcriptomics (AT). AT is a method that links a physical genome, via the transcriptome, to quantitative phenotypic information. For complex polyploid crops such as Brassica napus, AT can be used to facilitate the identification of SNP markers. However, there are certain problems in performing AT for the Brassica Community: 1) this process is often data-intensive, as it commonly relies on large-scale genotypic and phenotypic raw data, and not all research groups have the computational capacity to do such analysis; 2) many groups are still dependent on the expertise of a small number of researchers who are able to generate AT data. With the help of CyVerse UK (http://cyverseuk.org), we are developing a reproducible workflow to make AT analysis available to the UK Brassica Community and beyond. The aim is to integrate phenotyping data stored in the Brassica Information Portal (BIP) (https://bip.earlham.ac.uk/) and sequence data from sequence repositories to establish an AT analysis framework, powered by tools and resources available within CyVerse. A Brassica researcher would first submit their genotypic and phenotypic raw data to the BIP and respective public repositories (e.g. the SRA/ENA sequence read archives). This ensures that their data will be stored in standardised formats and marked up with required metadata to enable reuse and subsequent comparison. With the data in place, the researcher will then be able to run AT analyses on CyVerse. We will present the current state of the project to the CyVerse user and Brassica communities in order to receive additional input and feedback. |
Year(s) Of Engagement Activity | 2017 |
URL | https://pag.confex.com/pag/xxv/meetingapp.cgi/Paper/25485 |
Description | Data Brokering for Plant Scientists (DivSeek partner's meeting, PAG 2018, San Diego) |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Delivered a lightning talk to promote the COPO data brokering platform at the annual DivSeek partner's meeting at PAG. |
Year(s) Of Engagement Activity | 2018 |
Description | Data Stewardship in the Life Sciences |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | I spoke at the "Challenges and Opportunities in Plant Science Data Management" workshop on the subject of data management in the life sciences. Open data and integrative data sharing are fundamental to addressing the challenges of modern data-intensive science. There is a clear need to develop and maintain community-focussed, semantically-aware data stewardship and management platforms, such as COPO, that are able to cope with the description and sharing of potentially huge datasets arising from the life sciences. Once these data are made available, it is not sufficient to assume that researchers around the globe have the requisite skills and resources to analyse them. Therefore, we need to provide large-scale data analysis environments that are fit for purpose, such as CyVerse and Galaxy, incorporating state-of-the-art interfaces and programmatic layers to meet broad end-user requirements. Finally, this can only happen when there are community-led efforts to implement solutions for data standardisation, best practice, and FAIR data policy. We are only just starting to take advantage of groundbreaking opportunities to make integrated data a reality, thus enabling scientists to store, manage, and share their data as a first-class citizen of the scientific process. |
Year(s) Of Engagement Activity | 2017 |
URL | http://app.core-apps.com/pag_2017/event/e2bec353017762d275ce250c23e011e6 |
Description | Data, Data, Data Everywhere (Pint of Science talk, Norwich) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Public/other audiences |
Results and Impact | Dr Davey delivered a talk as part of the Norwich 2017 Pint of Science series about the challenges and solutions for modern data management in the life sciences, including recent data developments, high-performance computing, and software tools. |
Year(s) Of Engagement Activity | 2017 |
URL | https://pintofscience.co.uk/event/crops-crystals-and-computers-technology-for-food-security |
Description | DivSeek Working Group - Data Standards for Interoperable Tools |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | As part of the "DivSeek - Addressing the challenges and opportunities for information and data sharing associated with plant germplasm" session at PAG, I spoke about the DivSeek Data Standards for Interoperable Tools Working Group. This WG will promote best practice in data sharing in the plant sciences, through the use of open and interoperable software powered by the adoption of open standards, i.e. programmatic interoperability standards (APIs), controlled vocabularies, trait dictionaries, metadata standards, and ontologies. We aim to highlight gaps in interoperability that impede workflows important to the communities supported by DivSeek partners, by liaising with research development groups, other DivSeek working groups, and consortia with relevance to DivSeek. We will educate and train data generators about standards and the tools and resources that use them, in order to promote and foster standards-compliance for long-term open data stewardship. |
Year(s) Of Engagement Activity | 2017 |
URL | https://pag.confex.com/pag/xxv/meetingapp.cgi/Paper/26202 |
Description | ELIXIR-UK ALL-HANDS MEETING 2017 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | The ELIXIR-UK All Hands Meeting provided updates on recent activities from the ELIXIR UK Node and ELIXIR Hub, alongside discussions of future resources, events and roadmapping breakouts. Dr Davey presented the COPO project and CyVerse UK infrastructure as UK-specific resources that were being developed as national infrastructure for UK researchers. There was much interest from the participants in both projects, and conversations at this event led to the submission of a BBSRC TRDF with Gos Micklem (Cambridge), Dr Davey and Dr Shaw (EI). |
Year(s) Of Engagement Activity | 2017 |
URL | https://www.elixir-europe.org/events/elixir-uk-all-hands-meeting-2017 |
Description | RDA Wheat Data Interoperability Working Group meeting, RDA Plenary, Barcelona |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The Wheat Data Interoperability Working Group aims to provide a common framework for describing, representing, linking and publishing Wheat data with respect to open standards. Such a framework will promote and sustain Wheat data sharing, reusability and operability. Specifying the Wheat linked data framework will come with many questions: which (minimal) metadata to describe which type of data? Which vocabularies/ontologies/formats? Which good practices? Mainly based on the needs of the Wheat Initiative Information System (WheatIS) in terms of functionalities and data types, the working group will identify relevant use cases in order to produce a "cookbook" on how to produce "wheat data" that are easily shareable, reusable and interoperable. This meeting saw the maturation of the Working Group into a Maintenance Group, showing that we have moved from an inception phase to an implementation phase, promoting the outputs of the WG (the Wheat Data Interoperability guidelines) to users. |
Year(s) Of Engagement Activity | 2016 |
URL | https://www.rd-alliance.org/group/agricultural-data-ig-igad-wheat-data-interoperability-wg-agriseman... |
Description | UKRI Darwin Tree of Life Project meeting, London |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Dr Davey travelled to London with other EI staff to discuss strategy for an SPF bid to UKRI for the UK Darwin Tree of Life Project. |
Year(s) Of Engagement Activity | 2018 |
Description | iRODS functionality within the Grassroots Infrastructure (iRODS User Group Meeting 2017, Utrecht, The Netherlands) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Dr Tyrrell presented work on the development of the eirods-dav software package for the Grassroots data dissemination platform. |
Year(s) Of Engagement Activity | 2017 |
URL | https://irods.org/ugm2017/ |