Computational Developments

Lead Research Organisation: Earlham Institute
Department Name: UNLISTED

Abstract

Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.

Technical Summary

State-of-the-art technologies are generating unprecedented amounts of complex data, from genomes to proteomes and transcriptomes, spanning mechanistic and functional diversity. Handling, interpreting and integrating these large-scale data into descriptive models that interpret molecular functions at a system level requires continued development of algorithms, robust computational models, and interoperable analytical frameworks. Supported by our core capability, in this work package we will contribute to the newest developments in the data sciences and facilitate the extraction of meaningful signals from often noisy data.

We will continue to develop efficient, reproducible and robust assembly and scaffolding algorithms, and robust statistical models to handle diverse complex genomic and metagenomic datasets. We will improve and further develop software to facilitate orthology assignment across complex and highly divergent species. We will also incorporate machine learning approaches to integrate the conservation of functional signals across species to infer functionality and improve annotation. New statistical and network analytical approaches will also be applied to track temporal and spatial changes across and within species to further inform phenotypic complexity.

Our algorithm optimisation expertise will enable us to drive computational advances in accuracy and efficiency across our research into assembly and variant calling, annotation, and network analysis. These efforts will put the platforms in place to consistently collect and rapidly feed datasets into downstream integrative analyses, enabling the extensive and complex data interrogation processes required for bringing together multiple heterogeneous datasets. We will also apply these strategies to investigate and implement low power consumption computing technologies for data acquisition and analysis that will be deployed in environmental situations at a previously unavailable scale.

We will carry out fundamental research into software engineering methods to manage, share, visualise and integrate the large and complex datasets. We will develop research data management and dissemination layers, underpinned by community standards, that provide the granularity and searchability of EI’s large-scale and diverse data outputs that we are generating, and integrate the statistical, machine-learning, and network-based models developed under this programme. We will also build semantic knowledge graphs annotated with ontology-based descriptions in order to represent the body of information gathered through harmonised data and network integration. These will comprise metadata that describe reusable interconnected research datasets, and we will feed these methods into appropriate information systems to enable national and international collaborative research through open multi-omics platforms.

This will lay the foundations for an interconnected network of hardware and software to deliver real-time monitoring of crop development and pathogen detection and screening. Subsequent federation of these activities with national and international services will be facilitated through the EI NCG in e-Infrastructure (NC3), and interactions with the ELIXIR-UK community for pan-European life science infrastructures. This will also underpin the computational biology needs of the QIB Microbes in the Food Chain ISP in standardised and interoperable analysis systems.

Planned Impact

unavailable

Organisations

Publications

 
Title COPO rebranding and iconing 
Description New icon designed by Sasha Stanbridge, and applied to COPO website and branding to modernise look 
Type Of Art Artwork 
Year Produced 2023 
Impact New look COPO 
 
Title Stickers for COPO brand 
Description 250 stickers with the new COPO logo to be distributed at conferences 
Type Of Art Artwork 
Year Produced 2023 
Impact wider knowledge of the COPO brand 
 
Description As part of Work Package 3 of the Earlham Institute Core Strategic Programme (CSP) Genomics for Food Security, our objective 3.1 set in 2017 was to develop computational approaches for data integration. For the 6th year of rollover funding, our extended objective, Ext 3.1 "Advancing computational approaches for data integration and algorithm optimisation", was based on the outputs of objective 3.1 and aimed to develop this further. We have now delivered all objectives contributing to 161 publications, initiating 29 collaborations, obtaining 39 funded research applications and being involved in 221 engagement activities.

Our objectives for this work package were to research and advance computational approaches for data integration. As a result, we have successfully developed, deployed and implemented standards and tools for the collection, storage, analysis and sharing of data produced through the EI CSP and associated projects. We have met our objectives, and have produced a wide-ranging suite of methods and implementations that have led to extensive downstream impact at an international scale.

The key findings are:

1. We have successfully extended our work on the EI infrastructure and data management platforms to deliver our objectives and to see EI data, and data from associated projects, submitted into the public domain with rich metadata. The Earlham Institute platform Collaborative OPen Omics (COPO) has supported the development and implementation of web-based user interfaces for brokering large-scale sample metadata and genomic sequencing data into the public archives at the ENA.
We have continued to develop COPO within the Tree of Life family of programmes, i.e. DToL, ASG and the EBP (10.12688/wellcomeopenres.17605.1; wellcomeopenresearch.org/articles/7-279/v1). In 2022 COPO went live as the central metadata management platform for the European Reference Genome Atlas (ERGA) pilot project, a pan-EU project to sequence all eukaryotic life in Europe. Through a recent Research Data Alliance Open Call fund with ERGA collaborators and the ENRICH Hub, COPO received a small amount of funding to implement Biocultural Labels and Traditional Knowledge identifiers to metadata (https://localcontexts.org/labels/biocultural-labels/). These efforts led to EI, as part of ELIXIR, being included as a partner institution in two submissions to the infrastructure call for Horizon Europe 2020. These projects have been funded so will see COPO continue to be developed and supported throughout these ambitious multi-national biodiversity and agricultural programmes.
During the rollover period (Ext.3.1), we have continued to develop COPO within the Tree of Life family of programmes, i.e. DToL, ERGA, ASG and the EBP. The success of the COPO platform has initiated important discussions with the wider biodiversity and single cell communities. As a result, we have joined the Plant Cell Atlas (PCA) Data Management working group to facilitate discussion of standards and tools for single cell data management with this international consortium. It has been agreed to work with the PCA group as much as possible to develop a set of standards which will be generic enough for many single cell researchers to use, while still maintaining enough specificity to actually capture useful FAIR metadata. This enables more involved interactions with key communities to ensure single cell community standards can be developed further and used within COPO in the Cellular Genomics Institute Strategic Programme.
We contributed to key publications in the area of data management, ML/AI and policy (f1000research.com/articles/10-324/v1; 10.12688/wellcomeopenres.17605.1; wellcomeopenresearch.org/articles/7-279/v1; 10.12688/f1000research.73825.2; 10.1093/gigascience/giab060; 10.1007/978-3-031-13276-6_6), as well as surveys on the concepts of reproducibility (10.3389/frma.2021.678554).

By March 2023 COPO had brokered over 27022 samples for DTOL, over 4013 for ASG and 575 for BGE.

2. We continued to develop the GeneSeqToFamily workflow. All data have been made available, but most importantly searchable, to the community on the Designing Future Wheat Grassroots platform, demonstrating strong alignment and co-development of key tools and datasets between the programmes. This includes the possibility to retrieve gene families, associated trees and alignments from a given gene name or BLAST search. This work has been carried out in close collaboration with the EI e-infrastructure National Capability. In collaboration with researchers at Rothamsted Research, we also made all our data available to the community through KnetMiner (https://knetminer.com/). We scaled GeneSeqToFamily to be able to handle very large collections of genes. We applied these developments to the data arising from the wheat pan-transcriptome project that includes over 1.4 million CDS across 10 wheat accessions, identifying 161,344 gene families. The updated pipeline has been made available to the community via the Institute GitHub repository and the Galaxy ToolShed.
Our work on applying long-read RNA sequencing demonstrated an underappreciated diversity of alternatively spliced transcripts, with many including novel splice sites. To enable the detection of splice sites from the genome sequence only, we developed an attention-based deep learning sequence model that allows us not only to predict splice sites but also to identify the features associated with them. We compared our approach to other state-of-the-art methods for splice detection and demonstrated the accuracy and performance of our model (10.48550/arXiv.2311.12884).
Simon Tyrrell and Robert Davey developed a new service as part of the Grassroots Infrastructure to store the outputs of the pipeline, including the orthogroups, alignments and trees, and to make them available and searchable via a server-based API, which is used by our custom user-facing web portal. We have further extended the gene family analysis through the addition of tetraploid and diploid species to characterise the impact of whole genome duplication on gene family dynamics.

3. We have developed and benchmarked a new Machine Learning model to more reliably extract and characterise ontology terms from natural language / free text in literature, trained on a gold standard corpus of human phenotypes.

4. Amplicon sequencing is a well established and cost efficient approach to profile microbiomes. We developed a new version of the amplicon sequencing pipeline, LotuS2. This updated and user-friendly pipeline is not only faster and more accurate than other software, but also includes the option to analyse long-read amplicon sequencing (10.1186/s40168-022-01365-1). To enable users with limited bioinformatics knowledge, we made our software available to the community on Galaxy and BioConda, and provided detailed documentation (http://lotus2.earlham.ac.uk/) hosted on our National Capability CyVerse UK platform. The pipeline is being adopted, with >17,000 installations to date (as reported on BioConda, Lotus2 :: Anaconda.org).

5. We have produced a new set of algorithms and pipelines for haplotype-specific assembly and karyotype-linkage analyses, using hybrid short-and-long read sequencing. These new tools (https://github.com/bioinfologics/sdg-assemblers) are tailored to conduct de novo full-karyotype analyses on complex karyotypes. We applied them to create haplotype-resolved assemblies of the diatom Fragilariopsis cylindrus (CCMP1102), which has a complex genome with high levels of heterozygosity and repeats (10.1101/2022.07.14.500034).

6. Through analysis in WP2 and further development in WP3, we improved our pipelines to reconstruct regulatory networks that predict regulators for co-extant and ancestral co-expression modules along a phylogeny (10.1101/2021.12.14.472604).

7. The Leggett group has been working to apply adaptive sampling, a method of software-controlled enrichment for nanopore sequencing, to increase the sequencing data generated for low abundance species in metagenomic samples. Part of this work included developing a mathematical model of enrichment which can predict enrichment potential (10.1186/s13059-021-02582-x).

8. We previously developed and applied experimental protocols and computational pipelines for real time monitoring of bacterial communities and rapid pathogen detection in preterm infants (Leggett et al. PMID: 31844297). In collaboration with Lindsay Hall (Technical University of Munich (TUM)/Quadram Institute Bioscience (QIB)), we are working to apply the same approaches in adult liver disease patients suffering from gut microbiome related complications. These projects led to the development of MARTi (Metagenomic Analysis in Real Time), which is the first open source, real-time metagenomics analysis platform (https://marti.cyverseuk.org/). During 2022 MARTi had its first external release. This has been initially low-key while we prepare a publication and work to ensure compatibility beyond EI's compute infrastructure. In April, we taught use of MARTi as part of EI's "Nanopore metagenomics: from sample to analysis" training course and in October as part of EBI's "Genome-resolved metagenomics bioinformatics" course. Alongside these courses, we have described MARTi in a number of talks at conferences and meetings. These activities have resulted in external users which have influenced our development and bug fixing activities. We have also continued to develop additional functionality for sample comparison and Antimicrobial Resistance functional analysis.

9. In order to support in situ deployment of nanopore sequencing, the Leggett group entered into a collaboration with Miroculus, suppliers of a digital microfluidic platform. We have successfully developed an automated protocol that moves from bead beaten lysis solution through to nanopore sequencing library and have deployed this in-field using battery power. We have also begun to port nanopore bioinformatics pipelines to the Nvidia Jetson embedded platform, which we will use for in-field analysis.

10. We developed a new sequencing and bioinformatics approach known as Reverse Metagenomics (RevMet) for classifying mixed eukaryotic samples. Though the method is applicable to many use cases, we demonstrated its potential through a study of pollinator preferences. Nanopore sequencing of individual bee pollen baskets was combined with cheap short read skims of plants in the pollinator neighbourhood. We showed that RevMet provided reliable semi-quantitative classification of the plants visited by individual bees (10.1111/2041-210X.13265). This led to a CASE PhD studentship in which RevMet is currently being applied to understand the foraging of commercial bumblebees in fruit farms (paper in draft). We were invited to contribute to a review on pollen metagenomics methods (10.1111/mec.16689).
11. The Hildebrand group published a new version of the shotgun metagenomics pipeline MATAFILER. It includes automatic capabilities to reconstruct and interlink microbial genomes: a) gene catalogues with their b) de novo reconstructed species' core genomes c) resolved at strain level (10.1016/j.chom.2021.05.008). The metagenomic work and opinion of the group was summarised in perspective and review papers (10.1128/mSystems.00881-21; 10.1016/j.csbj.2020.06.028). This work is continued as part of EI's new Institute Strategic Programme "Decoding Biodiversity".
12. The Macaulay group continues to focus on method development for single-cell multi-omic profiling, including the measurement of alternative splicing in single cells. In close collaboration with PacBio, Anita Scoones, a PhD student in the group, developed novel sequencing library protocols to dramatically increase throughput in single-cell Iso-Seq experiments. We have now undertaken initial testing of these approaches and achieved an approximately 10-fold reduction in sequencing costs. Computational analysis of these data has been undertaken by Core Bioinformatics (Swarbreck group) and David Wright (Macaulay/Haerty groups). PhD student Silvia Ogbeide has produced an extensive multi-omic (G&T-seq) dataset of colorectal cancer organoids and is developing computational approaches to link genomic variation with transcriptomic heterogeneity. She was invited to present this work to the US FDA at their single-cell multi-omics workshop (March 2023) following our review (doi.org/10.1016/j.tig.2022.03.015). With DTP student Yash Bancil we have advanced our capabilities in single bacterial genome sequencing beyond the work published in 2022 (10.1099/mgen.0.000871). Our developments are being taken forward and form a basis of EI's new Institute Strategic Programme "Cellular Genomics".
13. The Quince group successfully ran an experiment with the model colon reactors at QIB to test the impact of a dietary perturbation on a controlled community of eight species. DNA was extracted and sequenced and data sets uploaded to ENA. Samples were also prepared for transcriptomics and NMR metabolomics and that data is now also available. We have begun an integrated 'omics analysis of these data with the aim of resolving the functional changes in the synthetic communities in response to the perturbation. We have continued to analyse our in vitro paired metagenome and metabolome data sets from real human faecal samples of individuals given therapeutic diets for Crohn's disease. We have begun evaluating tools for the integrated analysis of metabolome and metagenome data as part of our CASE studentship with IBM. This has revealed microbial pathways that may be important for response.

14. In collaboration with researchers at the University of Oxford, we developed a deep learning approach to predict splice sites from the human genome sequence. The model shows good accuracy (0.92 and 0.94), precision (0.91 and 0.93) and recall (0.91 and 0.93) for the 5' and 3' splice sites respectively. We compared our approach to other state-of-the-art methods for splice detection and demonstrated the accuracy and performance of our model (10.48550/arXiv.2311.12884). This work led to an MRC Better Methods and Better Research grant application; following feedback from the panel, the work will be resubmitted to the MRC as a Research Grant (Population and Systems Medicine Board, May 2024).

15. The Haerty group extended the development of Arboretum to include miRNA expression and miRNA binding sites in the reconstruction of regulatory networks (10.1093/molbev/msac146).
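
For readers unfamiliar with the accuracy, precision and recall figures quoted in finding 14, the following is a minimal sketch of how such metrics are conventionally computed from a binary classifier's confusion counts. The `ConfusionCounts` container and the example counts are hypothetical and purely illustrative; they are not the evaluation code or the data behind the reported results.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Hypothetical container for binary splice-site classification counts."""
    tp: int  # true splice sites predicted as splice sites
    fp: int  # non-sites predicted as splice sites
    tn: int  # non-sites predicted as non-sites
    fn: int  # true splice sites predicted as non-sites


def accuracy(c: ConfusionCounts) -> float:
    # Fraction of all positions classified correctly.
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def precision(c: ConfusionCounts) -> float:
    # Fraction of predicted splice sites that are real.
    return c.tp / (c.tp + c.fp)


def recall(c: ConfusionCounts) -> float:
    # Fraction of real splice sites that were recovered.
    return c.tp / (c.tp + c.fn)


# Made-up counts for illustration only; not results from the study.
example = ConfusionCounts(tp=90, fp=10, tn=85, fn=15)

if __name__ == "__main__":
    print(f"accuracy={accuracy(example):.3f}")
    print(f"precision={precision(example):.3f}")
    print(f"recall={recall(example):.3f}")
```

The sketch assumes at least one predicted and one true positive; a production evaluation would need to guard against division by zero and would normally report the metrics separately for the 5' and 3' sites, as the finding does.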
Exploitation Route Building on our track record in developing data brokering platforms for large-scale projects, we enable large collaborative consortia listed below to share and use data efficiently by providing bioinformatics services for data and software distribution, data sharing, and cloud computing.

The outcomes of the CSP3 are taken forward through collaborations and partnerships, such as:

- Enabling collaborative data sharing via COPO platform. With huge amounts of data there is great promise that we can use it to tackle many of our global challenges. However, unless that data is put in context and made FAIR, that data becomes increasingly unusable for future scientific study. Data is almost meaningless without metadata - the essential information about where, how and why the data has been collected. Metadata are typically recorded in scientific papers, or lab notebooks, and are rarely shared in a way that's reusable. Methods for effectively recording and managing metadata have historically been lacking. Earlham Institute's Collaborative OPen Omics (COPO) is providing an easy-to-use platform to ensure that data is easily searchable, reusable, and properly attributed (10.12688/f1000research.23889.1).
A big data broker for life science, COPO has been in use within the Darwin Tree of Life Project, which aims to sequence the DNA of more than 60,000 species from across the British Isles, producing a significant number of genome notes (wellcomeopenresearch.org/gateways/treeoflife). We have published this work in two articles focused on the metadata standards and the implementation of those standards in software (10.12688/wellcomeopenres.18499.2; 10.12688/wellcomeopenres.17605.1), as well as overarching recommendation papers for large scale biodiversity sequencing programmes (10.1073/pnas.2115639118).
COPO went live as the central metadata management platform for the European Reference Genome Atlas (ERGA) pilot project, a pan-EU project to sequence all eukaryotic life in Europe. Due to its success, COPO now handles all the sample metadata for the subsequent Horizon Europe project BGE (Biodiversity Genomics Europe). To date COPO has handled over 1468 samples from the ERGA pilot community while offering support and feedback about the metadata standardisation. Through a recent Research Data Alliance Open Call fund with ERGA collaborators and the ENRICH Hub, COPO received a small amount of funding to implement Biocultural Labels and Traditional Knowledge identifiers to metadata (https://localcontexts.org/labels/biocultural-labels/). This work resulted in improvements to the COPO software to manage recognition of indigenous knowledge, with a paper in preparation for Molecular Ecology Resources.
These efforts led to EI, as part of ELIXIR, being included as an affiliated entity institution in two Horizon Europe infrastructure grants, AgroSERV and BGE. Both projects have been funded by the EU leading to the continuation of COPO's development.
By April 2023 COPO had brokered over 21555 samples for DTOL, over 4013 for ASG, XXX for the ERGA pilot, and 575 for BGE.

- MARTi has already attracted users in academia and industry, despite not yet being published. It is being used both to study airborne pathogen microbiomes and to host analysis as part of the Delivering Sustainable Wheat programme. We have used MARTi to analyse gut microbiome samples from patients with liver disease in a collaboration with Lindsay Hall (TUM/QIB) and Vish Patel (King's College Hospital). MARTi is also being used for analysing clinical metagenomic samples as part of a new CASE studentship with Jon Lartey (Norfolk and Norwich University Hospital) and Oxford Nanopore Technologies which studies infection-mediated preterm birth.
- GeneSeqToFamily workflow: we have worked with Rothamsted Research to have the results of our analyses integrated into KnetMiner, their gene discovery platform.

- CyVerse UK is in the process of installing the CyVerse UK Discovery Environment in collaboration with CyVerse US. CyVerse is collaborating on the migration of CerealsDB (as part of the DSW) with Mark Winfield from the University of Bristol. There are also plans to set up a Galaxy instance in CyVerse, and a 'federated data exchange' in collaboration with Rothamsted Research. CyVerse is also in the process of developing a collaboration with the University of Aberdeen.

- The Grassroots team presented at the Designing Future Wheat in Practice course held at the John Innes Centre on 17th November 2022, to 20-30 people in a mixture of students, academics and industry. Grassroots has an ongoing relationship with Rothamsted Research, who will install the Grassroots Field Trials application on one of their own servers and run it themselves. This has the potential to lead to a new position as a Developer at Rothamsted Research to work on the application. A second Grassroots infrastructure has also been discussed in a collaboration between EI, Rothamsted Research and the John Innes Centre (JIC) to allow data to be shared seamlessly between the three institutes.

Tools and resources developed by the CSP3 were shared and discussed during our annual stakeholder engagement event EI Innovate which has run since November 2019. EI Innovate provides an insight into the Earlham Institute's research, exploring opportunities for innovation and collaboration. We had between 70 and 200 attendees coming to these events year on year (virtual during pandemic and face-to-face before and after the pandemic). The audience is a mixture of internal staff, other research organisations on the Norwich Research Park and external organisations. The external audience come from a range of sectors including agri-food, biotech, med-tech sectors, clinicians, developers of instrumentation, tools, products and services for genomics and bioinformatics, funders, investors, other academic organisations, and government departments. During these events we highlight areas where CSP3 research is generating impact, share tools that we develop, and highlight opportunities for knowledge transfer and collaboration. An article about the event was produced by our Comms Team which enabled us to share more widely the discussions held at the event. EI Innovate events have gone on to foster exciting and valuable conversations between academia and industry. An example of an exciting collaboration that resulted from a previous EI Innovate is the Hybrid Wheat Initiative, which connects 25 breeding companies and research institutes worldwide, to resolve the critical challenge of hybrid wheat.
Sectors Aerospace, Defence and Marine; Agriculture, Food and Drink; Digital/Communication/Information Technologies (including Software); Education; Environment; Healthcare; Pharmaceuticals and Medical Biotechnology

URL https://docs.google.com/document/d/1Rcsbz1DehajwBkKPssPCLOmG9hallDQGsWeKWNWiHNo/edit?usp=sharing
 
Description The scale and complexity of data being harnessed by the bioscience research and innovation community grew significantly over the last decade. Innovative data-driven approaches are seen now as key to unlocking new biological understanding and maximising the value of the data that are available from a diverse range of advanced technologies. As part of the Computational Developments work package (CSP3) extended objectives, we set out to develop new technologies, algorithms and standards to enable large scale data analysis, interpretation and integration. The impact of this work package is demonstrated in our open source tools, technologies, and web services that have been taken up by large academic programmes, individual research projects, and by industrial counterparts. Below we list the major areas where CSP3 is generating socio-economic impact:

1. Achieving sustainable wheat through data infrastructure. The developments of CSP3 enabled us to contribute to computational systems for the BBSRC-funded cross-institute Designing Future Wheat (DFW) project. DFW aimed to develop new wheat varieties (germplasm) containing the next generation of key traits and deliver them into the hands of academia and industry. Large-scale field trials have taken place and resulted in the generation of large and diverse datasets. The Grassroots data management platform was developed as an interoperable platform for sharing data and tools to examine and standardise access to wheat data. Grassroots has allowed the collection and standardisation of quantitative and qualitative data such as treatments, trait measurements and phenotypes in field trials, maximising the availability of the trait data collected in DFW along with doing some basic analyses upon this data. All of this has been made available as a web-based portal (https://grassroots.tools/fieldtrial/all) for users to browse, search and download. 
As DFW partners are working together to produce large and heterogeneous datasets, the existing Grassroots infrastructure needed access to this ambitious scale of data for the programme, while also allowing fast, openly-licensed unrestricted access to data and experimental information to the wider international wheat community. This data is available on the DFW data portal (https://opendata.earlham.ac.uk/wheat/under_license/toronto/), hosted at EI, and also is part of the Grassroots Infrastructure. EI provided important contributions to DFW and other BBSRC programmes and projects to make data freely accessible, and support the development of new wheat varieties that contribute to food security and climate resilience.

2. Custom bioinformatics software for rapid identification of pathogens: MARTi (Metagenomic Analysis in Real Time) is open source software with a permissive licence which opens it to adaptation for commercial applications. One example of this is that MARTi has been used as a code base to develop Airscreen, an analysis pipeline for the DARPA SIGMA+ biothreat detection system which is being developed by Kromek Plc based upon science developed at EI. MARTi provides a bioinformatic platform for the Air-seq technology that has been in development at EI since 2015, and more recently in collaboration with researchers at the Natural History Museum. Originally conceived as a tool for in-field surveillance of agricultural pathogens (particularly fungal pathogens that threaten some of our most crucial crops), Air-seq's focus has been expanded to include the detection of bacterial and viral human pathogens in urban environments, working with Kromek - a UK-based sensor engineering company - as part of a project funded by the US agency DARPA (Defense Advanced Research Projects Agency). Air-seq represents a truly unbiased technology capable of detecting any biological agent. The future applications of this technology are far-reaching. 
A patent application that covers the algorithms behind Airscreen was filed to protect this intellectual asset and ensure maximum impact realisation via commercialisation. We are currently formulating a series of product and service concepts for applications in agriculture, public health (as demonstrated with the outbreak of COVID-19) and homeland security, and are conducting field trials with potential end users and licensees. Air-seq has an opportunity to demonstrate a significant impact of a mobile bio-surveillance system enabled through automated detection and identification of airborne pathogens using DNA sequencing and bioinformatics tools developed at EI.

Training and skills development impact: The CyVerse UK team provided resources for EI training programmes, including Single Cell RNAseq (7-10 November 2022), Genome Annotation (17-19 May 2022) and Nanopore Metagenomics (26-28 Apr 2022). These courses are continued at EI and supported by the development of new Institute Strategic Programmes. Richard Leggett and Darren Heavens led a 3-day training course entitled "Nanopore metagenomics: from sample to analysis" which provided an introduction to wet lab and bioinformatics approaches for nanopore-based metagenomics for a group of attendees from around the world. The course received the support of Oxford Nanopore Technologies, who provided some consumables and fielded some staff to take part in a Q&A session at the end of the course.
First Year Of Impact 2017
Sector Aerospace, Defence and Marine; Agriculture, Food and Drink; Digital/Communication/Information Technologies (including Software); Education; Environment
Impact Types Cultural, Societal, Economic

 
Description BBSRC 20RM2 Committee
Geographic Reach National 
Policy Influence Type Membership of a guideline committee
 
Description BEIS/UKRI/RCUK Cloud Workshop, London, 24-10-2017
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
 
Description BecA-ILRI Hub & Strategic Partner Alignment 2019
Geographic Reach Africa 
Policy Influence Type Participation in a guidance/advisory committee
 
Description Covid-19 Agile Response Panel
Geographic Reach National 
Policy Influence Type Membership of a guideline committee
 
Description Covid-19 Rapid Response Call - Panel Meeting
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
Impact Rob Davey is a member of the UKRI-BBSRC COVID-19 cohort of experts, serving as an expert in an area of research related to COVID-19 proposals that BBSRC has received and is considering for funding.
 
Description Input into "Balance and Effectiveness of Research and Innovation Spending" inquiry by the House of Commons Science and Technology Select Committee
Geographic Reach National 
Policy Influence Type Citation in other policy documents
URL https://publications.parliament.uk/pa/cm201719/cmselect/cmsctech/1453/145303.htm#_idTextAnchor000
 
Description Interview with Environment Adviser from the UK Parliamentary Office of Science and Technology
Geographic Reach National 
Policy Influence Type Implementation circular/rapid advice/letter to e.g. Ministry of Health
Impact Contacted by UK Parliament to contribute to a POSTnote (a short document to advise ministers on a given topic) on genebanks and Digital Sequence Information (DSI) as a result of my recent election to the DivSeek Board of Directors. I was interviewed to provide information on current international policies on DSI and how future UK involvement might be shaped around open licensing/MTAs of DSI datasets.
URL https://www.parliament.uk/postnotes
 
Description Letter to Rt Hon Chris Skidmore to discuss science funding and science immigration post-Brexit
Geographic Reach National 
Policy Influence Type Implementation circular/rapid advice/letter to e.g. Ministry of Health
 
Description NERC Added Value Panel for the Environmental Data Services (EDS)
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
 
Description UKRI BBSRC Review of Data-intensive Bioscience
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
 
Description UKRI Data Infrastructure Roadmap
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
 
Description UKRI Innovation Scholars: Data Science Training in Health & Bioscience Panel
Geographic Reach National 
Policy Influence Type Membership of a guideline committee
 
Description UKRI Open Access Policy Review
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
Impact Rob Davey attended the policy review briefing: UKRI is reviewing its Open Access policy for peer-reviewed research articles and academic books, and it is seeking views on its proposed new policy via a consultation.
 
Description UKRI Supercomputing Roadmap
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
 
Description UKRI e-Infrastructure Expert Panel
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
Impact The UKRI Infrastructure landscape analysis and report constitutes the largest review of UK infrastructure across the academic, industrial and third sectors. Davey and Hall N (EI) and Fretter (NBIP CiS) contributed key information to the analysis and reports to reflect the views and future strategy of the UK bioscience sector. This report has now been used to formulate the 2020 UK Budget, with an announcement of large-scale investment in infrastructure.
URL https://www.ukri.org/research/infrastructure/
 
Description Agri-Tech in China: Newton Network+ - Developing a novel aerial image analysis algorithm to enable the timing estimation of fertilisation and chemical applications for wheat in China.
Amount £29,432 (GBP)
Funding ID SM003 
Organisation Department for Business, Energy & Industrial Strategy 
Sector Public
Country United Kingdom
Start 03/2018 
End 06/2018
 
Description Algebraic Invariants for Phylogenetic Network Inference
Amount £64,068 (GBP)
Funding ID EP/W007134/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 01/2022 
End 10/2023
 
Description Artificial intelligence and deep learning in image based crop phenomics for predicting seed quality
Amount £99,034 (GBP)
Funding ID BB/S507441/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 09/2018 
End 09/2023
 
Description BBSRC NRPDTP iCASE Studentship - Using open data and machine learning approaches to decode the regulatory regions of wheat
Amount £28,000 (GBP)
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 09/2021 
End 09/2025
 
Description BBSRC NRPDTP studentship - Data science to feed a changing world: superior drought tolerance in nutritious traditional beans
Amount £20,000 (GBP)
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 09/2021 
End 09/2025
 
Description BBSRC NRPDTP studentship - Unbiased detection of emergent airborne pathogens using Air-seq
Amount £20,000 (GBP)
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 09/2021 
End 09/2025
 
Description Bayer Grants4Traits
Amount £20,661 (GBP)
Organisation Bayer 
Department Bayer CropScience Ltd
Sector Private
Country United Kingdom
Start 01/2018 
End 12/2018
 
Description BioFAIR: A UK Institute delivering a Commons for data-driven bioscience
Amount £300,000 (GBP)
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 07/2020 
End 05/2021
 
Description Biodiversity Genomics Europe
Amount £221,419 (GBP)
Funding ID 10055933 
Organisation Innovate UK 
Sector Public
Country United Kingdom
Start 08/2022 
End 02/2026
 
Description Building capacity in third-generation genomics and bioinformatics for agricultural biosciences in Africa
Amount £99,952 (GBP)
Funding ID BB/T017422/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 01/2021 
End 03/2022
 
Description Building capacity in third-generation genomics and bioinformatics for agricultural biosciences in Africa
Amount £99,952 (GBP)
Funding ID BB/T017422/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 03/2020 
End 03/2021
 
Description Business Case for a Catalyst Partnership in Artificial Intelligence between the Alan Turing Institute and the Norwich Biosciences Institutes
Amount £600,000 (GBP)
Funding ID BB/V509267/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 09/2020 
End 06/2022
 
Description DNA sequencing for biological threat monitoring
Amount $5,270,000 (USD)
Funding ID HR001119C0031 
Organisation Defense Advanced Research Projects Agency (DARPA) 
Sector Public
Country United States
Start 12/2018 
End 12/2023
 
Description Darwin Tree of Life
Amount £9,360,421 (GBP)
Funding ID 218328/Z/19/Z 
Organisation Wellcome Trust 
Sector Charity/Non Profit
Country United Kingdom
Start 11/2019 
End 05/2023
 
Description Developing capabilities in high performance digital infrastructure for data intensive scientific innovation in Colombia
Amount £29,500 (GBP)
Funding ID IAPP18-19\294 
Organisation Royal Academy of Engineering 
Sector Charity/Non Profit
Country United Kingdom
Start 04/2019 
End 12/2020
 
Description ELIXIR-UK Coordination Office
Amount £582,025 (GBP)
Funding ID BB/X011100/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 06/2022 
End 07/2024
 
Description ELIXIR-UK: FAIR Data Stewardship training
Amount £79,653 (GBP)
Funding ID MR/V038966/1 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 02/2021 
End 03/2023
 
Description FTMA4 - Earlham Institute Flexible Talent Mobility Account
Amount £107,000 (GBP)
Funding ID BB/X017761/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 01/2023 
End 03/2023
 
Description Frictionless Data
Amount £3,333 (GBP)
Organisation Open Knowledge Foundation 
Sector Charity/Non Profit
Country United Kingdom
Start 05/2020 
End 12/2020
 
Description Grand Challenges Research Fund (GCRF) Data & Resources (EI) - extension
Amount £160,963 (GBP)
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 11/2017 
End 07/2018
 
Description High-resolution genomics to reveal changes in microbial biodiversity across space and time in the warming Arctic Ocean
Amount £471,632 (GBP)
Funding ID NE/W005654/1 
Organisation Natural Environment Research Council 
Sector Public
Country United Kingdom
Start 07/2022 
End 07/2025
 
Description High-resolution genomics to reveal changes in microbial biodiversity across space and time in the warming Arctic Ocean
Amount £32,737 (GBP)
Funding ID NE/W005654/1 
Organisation Natural Environment Research Council 
Sector Public
Country United Kingdom
Start 06/2022 
End 06/2025
 
Description IPA Industrial Partnering Award - China Partnering Awards - Forge a long-term UK-China relationship in phenotyping, Agri-Tech innovation and crop research for Rice and Wheat
Amount £30,429 (GBP)
Funding ID BB/R021376/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 03/2018 
End 03/2021
 
Description Identification of prognostic indicators of healthy ageing with a machine learning based systems biology approach using gut microbiome data
Amount £99,034 (GBP)
Funding ID BB/S50743X/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 08/2018 
End 09/2022
 
Description Integrated SERvices supporting a sustainable AGROecological transition
Amount £48,477 (GBP)
Funding ID 10042903 
Organisation Innovate UK 
Sector Public
Country United Kingdom
Start 08/2022 
End 08/2027
 
Description Integration of COPO and CGCore Schemas and Associated Repositories
Amount £62,968 (GBP)
Funding ID 2018X329.EI 
Organisation International Food Policy Research Institute (IFPRI) 
Sector Charity/Non Profit
Country United States
Start 08/2018 
End 01/2019
 
Description Interactive real-time metagenomics algorithms for Nanopore sequencing (LEGGETT_E17DTP1)
Amount £90,000 (GBP)
Funding ID 1937486 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 09/2017 
End 09/2021
 
Description Interim BioFAIR award
Amount £358,676 (GBP)
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 08/2023 
End 03/2025
 
Description JIF internship
Amount £81,140 (GBP)
Organisation John Innes Foundation 
Sector Charity/Non Profit
Country United Kingdom
Start 03/2021 
End 03/2025
 
Description John Innes Centre Institute Strategy Funding - The elucidation of transcription factor networks underlying internode extension in wheat
Amount £13,911 (GBP)
Organisation John Innes Centre 
Sector Academic/University
Country United Kingdom
Start 01/2018 
End 12/2018
 
Description NRP Seed fund - A GPU-accelerated machine-learning based agricultural vehicle navigation system for crop monitoring and trait analysis
Amount £2,500 (GBP)
Funding ID SLSF 74 
Organisation Norwich Research Park 
Sector Private
Country United Kingdom
Start 03/2018 
End 08/2018
 
Description New software for nanopore based diagnostics and surveillance
Amount £151,571 (GBP)
Funding ID BB/R022445/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 09/2018 
End 01/2020
 
Description OpenPlant - Developing a frugal transcription factor relative affinity measurement pipeline (TRAMP)
Amount £5,000 (GBP)
Organisation University of Cambridge 
Sector Academic/University
Country United Kingdom
Start 08/2018 
End 07/2019
 
Description PhD studentship - Using open data and machine learning approaches to decode the regulatory regions of plants
Amount £126,032 (GBP)
Organisation John Innes Foundation 
Sector Charity/Non Profit
Country United Kingdom
Start 05/2021 
End 05/2025
 
Description Ship-seq: Nanopore sequencing of polar microbes on board icebreakers
Amount £90,000 (GBP)
Funding ID 1942119 
Organisation Natural Environment Research Council 
Sector Public
Country United Kingdom
Start 09/2017 
End 05/2021
 
Description Strategic Priorities Fund - AI for Science, Engineering, Health and Government
Amount £38,800,000 (GBP)
Funding ID EP/T001569/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 11/2018 
End 03/2023
 
Description Surveillance of yellow rust genotypes from sequenced aerosols
Amount £20,000 (GBP)
Funding ID LEGGETT_E23CASE 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 09/2023 
End 09/2027
 
Description The Earlham Institute 2021 Flexible Talent Mobility Account
Amount £108,000 (GBP)
Funding ID BB/W510890/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 12/2021 
End 03/2022
 
Description The European Open Science Cloud (EOSC) Future Sub-grant
Amount £9,615 (GBP)
Funding ID 101017536 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 03/2022 
End 06/2022
 
Description UKRI - Earth Biogenome Project
Amount £600,000 (GBP)
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 03/2019 
End 03/2022
 
Title CGCore v2 Improvements 
Description As part of the collaboration between the EI COPO project and the CGIAR Big Data Platform, we worked with CGIAR and Crop Ontology developers to improve the CG Core v2 schema for describing CGIAR digital outputs. 
Type Of Material Improvements to research infrastructure 
Year Produced 2018 
Provided To Others? Yes  
Impact Globally, this work will affect all CGIAR Data Managers and all users of the COPO platform who deposit data into CG Centre repositories. 
URL https://github.com/collaborative-open-plant-omics/cgcore_schema
 
Title Host-Microbe interaction workflow 
Description Integrated Computational workflow to infer the effect of bacterial proteins on host processes 
Type Of Material Improvements to research infrastructure 
Year Produced 2018 
Provided To Others? No  
Impact Targeted interplay between bacterial pathogens and host autophagy. Sudhakar P, Claire-Jacomin A, Hautefort I, Samavedam S, Fatemian K, Ari E, Gul L, Demeter A, Jones E, Korcsmaros T and Nezis I. Autophagy 2019. (In Press). Host-Microbe interaction pipeline based on protein-protein interactions - applicable to single species and community wide microbial proteomic data. Sudhakar P, Andrighetti T, Gul L, Fazekas D, Korcsmaros T. (in preparation - to be submitted April 2019) 
 
Title Improvements to the COPO system 
Description COPO is a computational system that attempts to address the challenges of making data FAIR by enabling scientists to describe their research objects (raw or processed data, publications, samples, images, etc.) using community-sanctioned metadata sets and vocabularies, and then use public or institutional repositories to share them with the wider scientific community. COPO encourages data generators to adhere to appropriate metadata standards when publishing research objects, using semantic terms to add meaning to them and specify relationships between them. This allows data consumers, be they people or machines, to find, aggregate, and analyse data which would otherwise be private or invisible. COPO builds upon existing standards to push the state of the art in scientific data dissemination whilst minimising the burden of data publication and sharing. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact Improvements to the COPO user interfaces and underlying code which have resulted in more data being submitted to public repositories through the system. The CGIAR CGCore v2 implementation is complete and undergoing testing to document and provide improvements. The Darwin Tree of Life project has chosen to use COPO as its main sample metadata submission route. 
URL https://github.com/collaborative-open-plant-omics/COPO
 
Title MulEA: An R package for Multi Enrichment Analysis 
Description Bioinformatics and genomics analysis tool. MulEA is a comprehensive Bioconductor R package and Galaxy tool for gene set overrepresentation and enrichment analysis. GitHub link: https://github.com/TGAC/Mulea 
Type Of Material Improvements to research infrastructure 
Year Produced 2017 
Provided To Others? No  
Impact MulEA will provide extensive analytical means using diverse databases, statistical models and p-value correction procedures that can extend our understanding of the results of various high-throughput analyses. 
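Illustrative note: the kind of analysis described for MulEA (gene set overrepresentation followed by p-value correction) can be sketched as below. This is a minimal Python sketch under stated assumptions, not MulEA's own R interface or its actual statistical models: it assumes a one-sided hypergeometric test and Benjamini-Hochberg correction, and the function names (hypergeom_sf, benjamini_hochberg, overrepresentation) and gene identifiers are hypothetical.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) for X ~ Hypergeometric(population M, n marked items, N draws)."""
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def overrepresentation(query, gene_sets, background):
    """Test each gene set for enrichment in the query list.

    Returns {set_name: (raw_p, adjusted_p)}. All inputs are restricted
    to the background gene universe before counting overlaps.
    """
    universe = set(background)
    query = set(query) & universe
    names = sorted(gene_sets)
    pvals = []
    for name in names:
        members = set(gene_sets[name]) & universe
        overlap = len(query & members)
        pvals.append(hypergeom_sf(overlap, len(universe), len(members), len(query)))
    adjusted = benjamini_hochberg(pvals)
    return {name: (p, q) for name, p, q in zip(names, pvals, adjusted)}


# Hypothetical toy data: 20 background genes and two gene sets.
background = [f"g{i}" for i in range(20)]
gene_sets = {
    "A": [f"g{i}" for i in range(5)],
    "B": [f"g{i}" for i in range(10, 15)],
}
result = overrepresentation(["g0", "g1", "g2", "g3"], gene_sets, background)
```

In this toy case the query overlaps set "A" in four of its five genes and does not overlap set "B", so "A" receives a small p-value and "B" a p-value of 1. The choice of test and correction here is only one common option.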
 
Title Opening of Firewall 
Description Interacted with NBI computing to have exceptions made in the firewall to allow data uploads from external sources to COPO staging area 
Type Of Material Improvements to research infrastructure 
Year Produced 2023 
Provided To Others? Yes  
Impact COPO is now accessible to the rest of the world 
 
Title Systems biology of gut organoids 
Description Since its first publication in 2009, the in vitro organoid model has been recognized as a major technological breakthrough tool in many basic biology and clinical applications. Organoids are near-physiological 3D model systems that facilitate studying a range of in vivo biological processes including cell differentiation, antimicrobial peptide production, host-microbe interactions and cell-cell communications. Organoids can also be used to examine the effect of certain mutations by generating organoids from transgenic mouse strains. The key aim of this joint project is to perform 'omics analyses of the generated organoids, examine the differences between cell types and disease conditions using computational approaches, and generate and validate testable hypotheses regarding the effect of the identified functional differences. 
Type Of Material Model of mechanisms or symptoms - in vitro 
Year Produced 2017 
Provided To Others? No  
Impact In the last 12 months: 1) We performed proteomics analysis of the organoids, which allowed us to identify biomedically relevant and scientifically interesting differences. 2) We tested and established a protocol/pipeline to perform RNAseq and microRNA profiling at Earlham Institute. 3) Our in silico analysis of the proteomic datasets also confirmed that the differentiated organoids do express proteins expected for the given cell types (e.g., enzymes important in peptide synthesis in Paneth cell differentiated organoids). By analysing interaction data we found that autophagy (cellular self-eating) could degrade many of the cell-specific key proteins. ATG16L1 is an autophagy protein involved in selecting proteins for autophagy-driven degradation, and it also contributes to the general autophagy process. 4) In a second series of experiments we generated organoids from the intestine of mice generated at UEA that lack expression of Atg16L1 specifically in intestinal epithelial cells (Atg16L1 dIEC). The proteomic profiling of these autophagy-deficient Paneth cell and goblet cell organoids was also carried out at the University of Liverpool. Strikingly, when we analysed the proteins whose levels differed in the autophagy-deficient background, we found that several of them are known or predicted to be degraded through autophagy. In Paneth cell organoids, the functional analysis of those proteins whose level was significantly higher in the ATG16L1 knock-out background (potentially because they have not been degraded properly) pointed to their importance in 'acute inflammatory response', 'immune response', 'negative regulation of gene expression', and 'protein processing'. This list indicates that autophagy malfunction negatively affects protein (antimicrobial peptide) production as well as deregulating inflammatory responses. 
Failures of these processes are well known for Paneth cells in Crohn's disease, but so far they have not been connected directly with autophagy malfunction. We found similar results for the mucus-producing goblet cells. To confirm these findings, we will directly examine some affected key functions using organoids. Reference: Integrative analysis of Paneth cell proteomic and transcriptomic data from intestinal organoids reveals functional processes dependent on autophagy. Emily Jones*, Zoe Matthews*, Lejla Gul*, Padhmanand Sudhakar, Agatha Treveil, Devina Divekar, Jasmine Buck, Tomasz Wrzesinski, Matthew Jefferson, Stuart Armstrong, Lindsay Hall, Alastair Watson, Simon Carding, Wilfried Haerty, Federica Di Palma, Ulrike Mayer, Penny Powell, Isabelle Hautefort, Tom Wileman, Tamas Korcsmaros. Disease Models and Mechanisms 2019. http://dmm.biologists.org/content/early/2019/02/26/dmm.037069. (* joint first authors) 
 
Title Additional file 2 of An accessible, efficient and global approach for the large-scale sequencing of bacterial genomes 
Description Additional file 2: Includes supplementary tables S1, S2, S3, S4, and S5. Table S1. Metadata template form. Table S2. Optimization of bacterial thermolysates generation and DNA extraction. Table S3. Metadata for sequenced isolates, including bioinformatic stats for Salmonella genomes. Table S4. Bespoke 9 bp barcodes for library construction using the LITE pipeline. Table S5. European Nucleotide Archive accession numbers. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/dataset/Additional_file_2_of_An_accessible_efficient_an...
 
Title Additional file 2 of Analysis of structural variants in four African cichlids highlights an association with developmental and immune related genes 
Description Additional file 2: Supplementary file 1. Genomic coordinates and annotation across species for all SV classes. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/Additional_file_2_of_Analysis_of_structural_variants_in...
 
Title Additional file 2 of microRNA profiling in the Weddell seal suggests novel regulatory mechanisms contributing to diving adaptation 
Description Additional file 2: Table S1. small RNA reads aligned against the final set of 559 high confidence miRNA loci. For each locus, the full hairpin sequence is shown, followed by the set of reads (one per line) perfectly matching the locus (with the corresponding abundance) and the predictive secondary structure of the miRNA hairpin. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/Additional_file_2_of_microRNA_profiling_in_the_Weddell_...
 
Title Additional file 3 of Analysis of structural variants in four African cichlids highlights an association with developmental and immune related genes 
Description Additional file 3: Supplementary file 2.Results of MW test to compare SV size distribution across different conservation categories. For each comparison, the p-value is indicated. In the case of a significant difference, the directionality of the change is indicated. For example, "1 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/Additional_file_3_of_Analysis_of_structural_variants_in...
 
Title Additional file 3 of Evolution of regulatory networks associated with traits under selection in cichlids 
Description Additional file 3. Large data Tables S1-S6. This file includes extended data tables that support the findings of this study. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/dataset/Additional_file_3_of_Evolution_of_regulatory_ne...
 
Title Additional file 3 of microRNA profiling in the Weddell seal suggests novel regulatory mechanisms contributing to diving adaptation 
Description Additional file 3: Table S2. Overview of all samples used in this study. Table S3. number of reads mapped to each miRNA locus across all samples. Table S4. number of reads mapped to each miRNA locus across all samples. Table S5. Tissue-specific differential expression statistics considering all pairwise tissue comparisons (brain_heart, brain_muscle, brain_plasma, heart_muscle, heart_plasma, muscle_plasma). Fold change refers to changes from tissue_1 to tissue_2, as defined by column H. Table S6. Age-specific differential expression statistics considering all tissues together. Fold change refers to changes from pup to adult (i.e. positive sign indicates upregulation in adult). Table S7. Tissue-specific differential expression statistics for developmental stage (adult versus pup). Table S8. Significant pathway enrichments in brain, heart, and muscle for mRNA targets of all microRNAs differentially expressed in Weddell seal maturation. Table S9. Enriched KEGG pathways associated with mRNA targets of novel miRNAs upregulated (+) or downregulated (-) in four Weddell seal sample types. Table S10. Number of significant pathway enrichments annotated to mRNA targets of novel Weddell seal microRNAs that were elevated (+) or decreased (-) in four sample types. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/Additional_file_3_of_microRNA_profiling_in_the_Weddell_...
 
Title Additional file 4 of Analysis of structural variants in four African cichlids highlights an association with developmental and immune related genes 
Description Additional file 4: Supplementary file 3. Names and corresponding GO annotation for different subsets of genes inside duplicated and inverted regions 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/Additional_file_4_of_Analysis_of_structural_variants_in...
 
Title Additional file 5 of Analysis of structural variants in four African cichlids highlights an association with developmental and immune related genes 
Description Additional file 5: Supplementary file 4. Genes found inside PCR validated inversions. Each row corresponds to one gene. Genomic coordinates of the associated inversion, and the number of species carrying the inversion are indicated, along with the gene name and ncbi id. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/Additional_file_5_of_Analysis_of_structural_variants_in...
 
Title SalmoNet2.0 
Description An integrated network resource containing regulatory, metabolic and protein-protein interactions for multiple Salmonella strains classified as gastro-intestinal or extra-intestinal pathogens. An interaction resource with manually curated, high-throughput and predicted interactions. Provides strain-specific and consensus networks. Can be downloaded in user-specified content and format. 
Type Of Material Database/Collection of data 
Year Produced 2017 
Provided To Others? Yes  
Impact SalmoNet, an integrated network of ten Salmonella enterica strains reveals common and distinct pathways to host adaptation. Métris A., Sudhakar P., Fazekas D., Demeter A., Ari E., Branchu P., Kingsley R.A., Baranyi J., Korcsmáros T. npj Systems Biology and Applications 3, Article number: 31 (2017). doi:10.1038/s41540-017-0034-z 
URL http://salmonet.org/
 
Title Sherlock - big data analytics platform for bioinformatics data 
Description The Sherlock tool uses standard, open-source big data technologies (such as S3, Presto and Docker) to execute simple analytical SQL queries on integrated bioinformatics data organised into an S3-based data lake. 
Type Of Material Data analysis technique 
Year Produced 2019 
Provided To Others? No  
Impact The method will help to increase the productivity of data-heavy bioinformatics projects by easing the data cleaning, filtering and integration tasks that are usually the first steps in any complex bioinformatics pipeline. GitHub link: https://github.com/NetBiol/sherlock 
 
Title Supporting data for "A Galaxy-based training resource for single-cell RNA-seq quality control and analyses" 
Description It is not a trivial step to move from single-cell RNA-seq (scRNA-seq) data production to data analysis. There is a lack of intuitive training materials and easy-to-use analysis tools, and researchers can find it difficult to master the basics of scRNA-seq quality control and the later analysis.
We have developed a range of practical scripts, together with their corresponding Galaxy wrappers, that make scRNA-seq training and quality-control accessible to researchers previously daunted by the prospect of scRNA-seq analysis. We implement a 'visualise-filter-visualise' paradigm through simple command-line tools that use the Loom format to exchange data between the tools. The point-and-click nature of Galaxy makes it easy to assess, visualise, and filter scRNA-seq data from short-read sequencing data.
We have developed a suite of scRNA-seq tools that can be used for both training and more in-depth analyses. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
 
Title Supporting data for "Aequatus: An open-source homology browser" 
Description Phylogenetic information inferred from the study of homologous genes helps us to understand the evolution of genes and gene families, including the identification of ancestral gene duplication events as well as regions under positive or purifying selection within lineages. Gene family and orthogroup characterisation enables the identification of syntenic blocks, which can then be visualised with various tools. Unfortunately, currently available tools display only an overview of syntenic regions as a whole, limited to the gene level, and none provide further details about structural changes within genes, such as the conservation of ancestral exon boundaries amongst multiple genomes.
We present Aequatus, a standalone web-based tool that provides an in-depth view of gene structure across gene families, with various options to render and filter visualisations. It relies on pre-calculated alignment and gene feature information typically held in, but not limited to, the Ensembl Compara and Core databases. We also offer Aequatus.js, a reusable JavaScript module that fulfils the visualisation aspects of Aequatus, available within the Galaxy web platform as a visualisation plugin, which can be used to visualise gene trees generated by the GeneSeqToFamily workflow.
Aequatus is an open-source tool freely available to download under the MIT license at https://github.com/TGAC/Aequatus. A demo server is available at http://aequatus.earlham.ac.uk/. A publicly available instance of the GeneSeqToFamily workflow to generate gene tree information and visualise it using Aequatus is available on the Galaxy EU server at https://usegalaxy.eu 
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? Yes  
 
Title Supporting data for "Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal." 
Description Whilst much sequencing effort has focused on key mammalian model organisms such as mouse and human, little is known about the correlation between genome sequencing techniques for non-model mammals and genome assembly quality. This is especially relevant to non-model mammals, where the samples to be sequenced are often degraded and low quality. A key aspect when planning a genome project is the choice of sequencing data to generate. This decision is driven by several factors, including the biological questions being asked, the quality of DNA available, and the availability of funds. Cutting-edge sequencing technologies now make it possible to achieve highly contiguous, chromosome-level genome assemblies, but rely on good-quality high-molecular-weight DNA. The funds to generate and combine these data are often only available within large consortiums and sequencing initiatives, and are often not affordable for many independent research groups. For many researchers, value-for-money is a key factor when considering the generation of genomic sequencing data. Here we use a range of different genomic technologies generated from a roadkill European Polecat (Mustela putorius) to assess various assembly techniques on this low-quality sample. We evaluated different approaches for de novo assemblies and discuss their value in relation to biological analyses.
Generally, assemblies containing more data types achieved better scores in our ranking system. However, when accounting for misassemblies, this was not always the case for Bionano and low-coverage 10x Genomics (for scaffolding only). We also find that the extra cost associated with combining multiple data types is not necessarily associated with better genome assemblies.
The high degree of variability between each de novo assembly method (assessed from the seven key metrics) highlights the importance of carefully devising the sequencing strategy to be able to carry out the desired analysis. Adding more data to genome assemblies does not always result in better assemblies, so it is important to understand the nuances of genomic data integration explained here in order to obtain value for money when sequencing genomes. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL http://gigadb.org/dataset/100731
 
Title The Earlham Institute CKAN Digital Repository 
Description The CKAN digital repository has been set up as part of WP3 of Earlham Institute's CSP to hold all EI strategic publications alongside any supplementary datasets and information. This gives the public and researchers immediate access to EI's BBSRC funded research through open access routes where available. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
Impact We have built scripts to find and make available open access versions of all EI published research, either as preprints or as journal articles. We also supply any supplementary information as appropriate to aid information dissemination. The EI CKAN runs within Earlham Institute's CyVerse UK National Capability. 
URL https://ckan.earlham.ac.uk
 
Description ACACIA Bioinformatics Community of Practice (BixCoP) 
Organisation International Livestock Research Institute (ILRI)
Country Kenya 
Sector Charity/Non Profit 
PI Contribution Members of EI delivered training throughout the year for the BixCoP fellowship programme.
Collaborator Contribution The GCRF STARS project was led by JIC and hosted at BeCA-Hub ILRI in Nairobi.
Impact The training programme resulted in a group of Fellows ready to take their skills back into their home countries and communities, with some undertaking Carpentries instructor training so that they can lead their own training courses in those communities.
Start Year 2018
 
Description ACACIA Bioinformatics Community of Practice (BixCoP) 
Organisation John Innes Centre
Country United Kingdom 
Sector Academic/University 
PI Contribution Members of EI delivered training throughout the year for the BixCoP fellowship programme.
Collaborator Contribution The GCRF STARS project was led by JIC and hosted at BeCA-Hub ILRI in Nairobi.
Impact The training programme resulted in a group of Fellows ready to take their skills back into their home countries and communities, with some undertaking Carpentries instructor training so that they can lead their own training courses in those communities.
Start Year 2018
 
Description BASF 
Organisation BASF
Country Germany 
Sector Private 
PI Contribution We are using the promoter capture platform developed in https://doi.org/10.1093/gigascience/giz018 to sequence wheat cultivars.
Collaborator Contribution Paying for sequencing and capture
Impact Funded ICASE studentship "Using open data and machine learning approaches to decode the regulatory regions of wheat"
Start Year 2020
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation Alexander von Humboldt Biological Resources Research Institute
Country Colombia 
Sector Charity/Non Profit 
PI Contribution C3Biodiversidad was created in a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. The objectives of C3Biodiversidad are to grow skills in data analysis in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: A statement about its objectives and constitution, a strategy or white paper for the dissemination of the conclusions of the workshop in policy-making institutions, especially in Colombia and the United Kingdom, instruments for the coordination of the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.), an informative note for dissemination in the national media, especially from Colombia and the United Kingdom, and an article in an international scientific journal.
Start Year 2018
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation CGIAR
Department International Center for Tropical Agriculture
Country Colombia 
Sector Charity/Non Profit 
PI Contribution C3Biodiversidad was created in a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. The objectives of C3Biodiversidad are to grow skills in data analysis in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: A statement about its objectives and constitution, a strategy or white paper for the dissemination of the conclusions of the workshop in policy-making institutions, especially in Colombia and the United Kingdom, instruments for the coordination of the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.), an informative note for dissemination in the national media, especially from Colombia and the United Kingdom, and an article in an international scientific journal.
Start Year 2018
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation Catholic University of Colombia
Country Colombia 
Sector Academic/University 
PI Contribution C3Biodiversidad was created in a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. The objectives of C3Biodiversidad are to grow skills in data analysis in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: A statement about its objectives and constitution, a strategy or white paper for the dissemination of the conclusions of the workshop in policy-making institutions, especially in Colombia and the United Kingdom, instruments for the coordination of the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.), an informative note for dissemination in the national media, especially from Colombia and the United Kingdom, and an article in an international scientific journal.
Start Year 2018
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation Center for Bioinformatics and Computational Biology of Colombia
Country Colombia 
Sector Charity/Non Profit 
PI Contribution C3Biodiversidad was created in a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. The objectives of C3Biodiversidad are to grow skills in data analysis in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: A statement about its objectives and constitution, a strategy or white paper for the dissemination of the conclusions of the workshop in policy-making institutions, especially in Colombia and the United Kingdom, instruments for the coordination of the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.), an informative note for dissemination in the national media, especially from Colombia and the United Kingdom, and an article in an international scientific journal.
Start Year 2018
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation Colombian Agricultural Research Corporation
Country Colombia 
Sector Charity/Non Profit 
PI Contribution C3Biodiversidad was created in a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. The objectives of C3Biodiversidad are to grow skills in data analysis in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: A statement about its objectives and constitution, a strategy or white paper for the dissemination of the conclusions of the workshop in policy-making institutions, especially in Colombia and the United Kingdom, instruments for the coordination of the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.), an informative note for dissemination in the national media, especially from Colombia and the United Kingdom, and an article in an international scientific journal.
Start Year 2018
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation Colombian Sugarcane Research Center
Country Colombia 
Sector Public 
PI Contribution C3Biodiversidad was created in a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. The objectives of C3Biodiversidad are to grow skills in data analysis in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: A statement about its objectives and constitution, a strategy or white paper for the dissemination of the conclusions of the workshop in policy-making institutions, especially in Colombia and the United Kingdom, instruments for the coordination of the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.), an informative note for dissemination in the national media, especially from Colombia and the United Kingdom, and an article in an international scientific journal.
Start Year 2018
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation CorpoGen
Country Colombia 
Sector Private 
PI Contribution C3Biodiversidad was created in a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. The objectives of C3Biodiversidad are to grow skills in data analysis in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: A statement about its objectives and constitution, a strategy or white paper for the dissemination of the conclusions of the workshop in policy-making institutions, especially in Colombia and the United Kingdom, instruments for the coordination of the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.), an informative note for dissemination in the national media, especially from Colombia and the United Kingdom, and an article in an international scientific journal.
Start Year 2018
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation EAFIT University
Country Colombia 
Sector Academic/University 
PI Contribution C3Biodiversidad was created in a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. The objectives of C3Biodiversidad are to grow skills in data analysis in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: A statement about its objectives and constitution, a strategy or white paper for the dissemination of the conclusions of the workshop in policy-making institutions, especially in Colombia and the United Kingdom, instruments for the coordination of the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.), an informative note for dissemination in the national media, especially from Colombia and the United Kingdom, and an article in an international scientific journal.
Start Year 2018
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation Jorge Tadeo Lozano University
Country Colombia 
Sector Academic/University 
PI Contribution C3Biodiversidad was created in a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. The objectives of C3Biodiversidad are to grow skills in data analysis in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: A statement about its objectives and constitution, a strategy or white paper for the dissemination of the conclusions of the workshop in policy-making institutions, especially in Colombia and the United Kingdom, instruments for the coordination of the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.), an informative note for dissemination in the national media, especially from Colombia and the United Kingdom, and an article in an international scientific journal.
Start Year 2018
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation National University of Colombia
Country Colombia 
Sector Academic/University 
PI Contribution C3Biodiversidad was created in a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. The objectives of C3Biodiversidad are to grow skills in data analysis in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: A statement about its objectives and constitution, a strategy or white paper for the dissemination of the conclusions of the workshop in policy-making institutions, especially in Colombia and the United Kingdom, instruments for the coordination of the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.), an informative note for dissemination in the national media, especially from Colombia and the United Kingdom, and an article in an international scientific journal.
Start Year 2018
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation Saint Thomas Aquinas University
Country Colombia 
Sector Academic/University 
PI Contribution C3Biodiversidad was created in a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. The objectives of C3Biodiversidad are to grow skills in data analysis in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: A statement about its objectives and constitution, a strategy or white paper for the dissemination of the conclusions of the workshop in policy-making institutions, especially in Colombia and the United Kingdom, instruments for the coordination of the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.), an informative note for dissemination in the national media, especially from Colombia and the United Kingdom, and an article in an international scientific journal.
Start Year 2018
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation The National Academic Network of Advanced Technology
Country Colombia 
Sector Public 
PI Contribution C3Biodiversidad was created in a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. The objectives of C3Biodiversidad are to grow skills in data analysis in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: A statement about its objectives and constitution, a strategy or white paper for the dissemination of the conclusions of the workshop in policy-making institutions, especially in Colombia and the United Kingdom, instruments for the coordination of the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.), an informative note for dissemination in the national media, especially from Colombia and the United Kingdom, and an article in an international scientific journal.
Start Year 2018
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation University of Antioquia
Country Colombia 
Sector Academic/University 
PI Contribution C3Biodiversidad was created at a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. Its objectives are to grow data analysis skills in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: a statement about its objectives and constitution; a strategy or white paper for disseminating the conclusions of the workshop to policy-making institutions, especially in Colombia and the United Kingdom; instruments for coordinating the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.); an informative note for dissemination in the national media, especially in Colombia and the United Kingdom; and an article in an international scientific journal.
Start Year 2018
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation University of the Andes
Country Colombia 
Sector Academic/University 
PI Contribution C3Biodiversidad was created at a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. Its objectives are to grow data analysis skills in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: a statement about its objectives and constitution; a strategy or white paper for disseminating the conclusions of the workshop to policy-making institutions, especially in Colombia and the United Kingdom; instruments for coordinating the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.); an informative note for dissemination in the national media, especially in Colombia and the United Kingdom; and an article in an international scientific journal.
Start Year 2018
 
Description C3Biodiversidad: Colombian Cyberinfrastructure Consortium for Biodiversity (Consorcio Colombiano de Ciberinfraestructura para la Biodiversidad) 
Organisation University of the Llanos
Country Colombia 
Sector Academic/University 
PI Contribution C3Biodiversidad was created at a visionary workshop in Bogota from 26th to 28th June, organised by GROW, by a panel of experts from the Science, Technology and Innovation (STI) system of Colombia, with the assistance of a panel of independent international experts. C3Biodiversidad is open to any stakeholder interested in the development of a scientific cyberinfrastructure in Colombia.
Collaborator Contribution C3Biodiversidad aims to develop a scientific cyberinfrastructure in Colombia for the analysis of scientific data, especially biological, genomic and socioeconomic data. Its objectives are to grow data analysis skills in Colombia, accelerate data-oriented research in Colombia, facilitate data-supported decision-making, and secure the engagement of diverse stakeholders in these objectives.
Impact The Colombian Consortium of Cyberinfrastructure for Biodiversity aims to produce the following products for dissemination: a statement about its objectives and constitution; a strategy or white paper for disseminating the conclusions of the workshop to policy-making institutions, especially in Colombia and the United Kingdom; instruments for coordinating the consortium using social networks (Slack, WhatsApp, Twitter @C3Biodiversidad, etc.); an informative note for dissemination in the national media, especially in Colombia and the United Kingdom; and an article in an international scientific journal.
Start Year 2018
 
Description Collaboration with Oxford Nanopore Technologies 
Organisation Oxford Nanopore Technologies
Country United Kingdom 
Sector Private 
PI Contribution As part of the upcoming EI strategic programme, we will be producing novel protocols for single-cell long-read RNA and DNA sequencing, as well as novel approaches to detect DNA replication through base modification.
Collaborator Contribution ONT will contribute expertise in applying machine learning models (base modification), technology solutions, and protocol development for single-cell RNA/DNA sequencing.
Impact Engagement with ONT on March 7th to develop the collaboration as part of the upcoming ISPs.
Start Year 2022
 
Description Cory Bernhards, US Army DEVCOM Chemical Biological Centre 
Organisation US Army
Country United States 
Sector Public 
PI Contribution Knowledge exchange around air sampling, pathogen detection, nanopore metagenomics, homeland security and biothreat detection. Also training in use of EI's MARTi software.
Collaborator Contribution Knowledge exchange around air sampling, pathogen detection, nanopore metagenomics, homeland security and biothreat detection.
Impact FTMA award to enable Richard Leggett, Darren Heavens and Ned Peel to travel to the US to meet with Cory and his team.
Start Year 2023
 
Description Developing a concept for a prototype Air-seq device 
Organisation Technology Partnership Plc
Country United Kingdom 
Sector Academic/University 
PI Contribution Earlham developed the Air-seq workflow for detecting biological material in air using a bespoke sequencing and bioinformatics pipeline. Earlham is looking to develop commercial solutions based on this technology and commercialise them via a spin-out. The prototype development will inform the business plan and the investment needed.
Collaborator Contribution Expertise in the engineering of diagnostic and detection devices for life science applications.
Impact Technical products
Start Year 2022
 
Description EI - Digital Catapult 
Organisation Digital Catapult
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution Share with Catapult the SME engagement process for SMEs on the MI Garage programme to access Earlham's expertise and resources
Collaborator Contribution Publicly acknowledge the partnership with Earlham and include Earlham in the MI Garage programme plans.
Impact Collaborations
Start Year 2021
 
Description EI-Eagle Genomics 
Organisation Eagle Genomics Ltd
Country United Kingdom 
Sector Private 
PI Contribution Engagement in discussions for collaborative activities. Joint application to UKRI-MRC DTP CASE (Microbes, Microbiome, Bioinformatics DTP Call): "Deconstructing the impact of low-fibre therapeutic diets on the human microbiome through integrated 'omics".
Collaborator Contribution Engagement in workshop and discussions for collaborative activities. Joint application to UKRI-MRC DTP CASE (Microbes, Microbiome, Bioinformatics DTP Call): "Deconstructing the impact of low-fibre therapeutic diets on the human microbiome through integrated 'omics".
Impact Joint application to UKRI-MRC DTP CASE (Microbes, Microbiome, Bioinformatics DTP Call): "Deconstructing the impact of low-fibre therapeutic diets on the human microbiome through integrated 'omics". Further funding. Engagement activities.
Start Year 2022
 
Description ELIXIR Biodiversity Working Group 
Organisation ELIXIR
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution Drs Davey and Shaw attended the first ELIXIR Biodiversity working group meeting in Milan 2020. Davey gave a talk on UK efforts to track biodiversity data, for example with the COPO platform.
Collaborator Contribution ELIXIR initiated this working group and invited member ELIXIR nodes to attend.
Impact Main outcome is building the community with a view to submitting an implementation study around biodiversity data.
Start Year 2020
 
Description ELIXIR plant community - COPO listed as a core plant service 
Organisation ELIXIR
Department ELIXIR UK
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution COPO has been listed, together with a number of other resources, as a core plant service. As COPO is a live service, it is effectively already contributing to the plant science community by improving metadata annotation and the processes for submission to repositories.
Collaborator Contribution The ELIXIR plant community has been working on standardised practices to improve the output quality of research. They recognise data hygiene as a driver, and therefore identify core resources to direct researchers to.
Impact This is an ongoing collaboration. The main output would come from the standardisation of practices across the research domain and a reduction in the time scientists need to spend on the metadata and submission processes.
Start Year 2020
 
Description Genome 10K Consortium 
Organisation University of California, Davis
Department UC Davis Genome Center
Country United States 
Sector Academic/University 
PI Contribution The G10K consortium is an international consortium of tissue curators, biologists, conservationists, genome scientists, computer scientists, outreach educators, and more. Federica Di Palma is a council member of the Consortium and has provided advice to the current ongoing projects. Organised meetings and contributed talks to several workshops and meetings.
Collaborator Contribution The G10K leadership and community of scientists built an infrastructure from sample collection to genome sequencing, assembly, annotation, alignments, public data releases, and analyses for publications.
Impact The collaboration is multidisciplinary and includes tissue curators, biologists, conservationists, genome scientists, computer scientists, outreach educators, and more from different countries. Outputs include press releases, workshops, meetings and publications.
Start Year 2011
 
Description HPE AI workshop 2019 
Organisation Hewlett Packard Enterprise (HPE)
Country United Kingdom 
Sector Private 
PI Contribution We worked with HPE staff to organise and host an AI Workshop at EI. We opened the course up for national delegates to attend and discover more about how AI and Machine Learning techniques can be applied to biological research data.
Collaborator Contribution HPE provided the trainers and staff to teach the materials.
Impact We will continue to work with HPE to supply our institutional HPE equipment. We have also put forward HPE as a potential partner in the upcoming DTP3 bid.
Start Year 2019
 
Description Integration of COPO and CGCore Schemas and Associated Repositories 
Organisation CGIAR
Country France 
Sector Charity/Non Profit 
PI Contribution We have developed a proof-of-concept platform to streamline metadata attribution and dataset deposition into CGIAR repositories using the BBSRC-funded COPO software. Drs Etuk and Shaw, two Research Software Engineers in the Davey group at Earlham Institute and the original core developers, have implemented various new features in COPO to allow CGIAR Data Managers to harmonise and streamline the submission of CG-relevant metadata and data into the CG digital data repositories. All software and infrastructure are hosted within the CyVerse UK cloud. We have: - Implemented support for CG Core v.2.0 (http://repo.mel.cgiar.org/handle/20.500.11766/4764) metadata annotation of various data types, including publications, produced at the CGIAR institutes via the existing COPO wizard system. - Implemented support for submission of annotated objects to institutional instances of the following repositories: dSpace (https://www.duraspace.org/dspace/), CKAN (https://ckan.org/) and Dataverse (https://dataverse.org/). - Designed and implemented a mechanism within COPO that controls which users can submit to which repositories. - Implemented support for the annotation of variables within data sets (i.e. column headings, experiment condition descriptors, etc.) with terms and URIs from ontologies or controlled vocabularies/trait dictionaries (AGROVOC and GACS).
Collaborator Contribution CGIAR have provided coordination contributions with key members in the CG Centres to gather feedback on developed elements, as well as provided funds to allow a core CGCore metadata schema developer to travel to EI and work with Drs Etuk and Shaw to improve the CGCore schema.
Impact This collaboration has seen rapid development of key functionality in the COPO platform to support CG centre Data Managers. This has required technical skills to develop the software, biocuration expertise provided by CGIAR to improve and refine the CGCore metadata schema, ontology expertise from the Bioversity team in Montpellier, and coordination expertise from Dr Davey (EI) and Medha Devare (CGIAR). Software and Technical Products (Webtool/Application - Collaborative Open Plant Omics (COPO) (2017)): All software code developed is open source and can be found in the COPO GitHub repository: https://github.com/collaborative-open-plant-omics/COPO
Start Year 2018
 
Description Jon Lartey (NNUH obstetrics) collaboration 
Organisation Norfolk and Norwich University Hospital
Country United Kingdom 
Sector Hospitals 
PI Contribution The Leggett group has provided training to Jon Lartey on nanopore sequencing and metagenomics. We've also collaborated on applications for Research Capability Funding (RCF) and a PhD studentship on the MRC MMB scheme. Both of these were awarded. Through the FTMA, Jon attended the Data Carpentry course at EI.
Collaborator Contribution Jon Lartey collaborated with us on developing research proposals (RCF, PhD and more to come) around his speciality in pre-term labour.
Impact £20k Research Capability Funding award from NNUH to cover PDRA time to generate preliminary data for future grant applications. Institute Development Grant award to cover materials to generate preliminary data for future grant applications. CASE PhD studentship accepted on the MRC MMB scheme. FTMA award to cover data carpentry training costs.
Start Year 2023
 
Description Kromek 
Organisation Kromek Group plc
Country United Kingdom 
Sector Private 
PI Contribution Kromek approached us about applying for DARPA funding for a project to sequence airborne DNA for threat monitoring (see separate grant). We supplied expertise in sequencing, bioinformatics and molecular biology and helped to write a grant proposal.
Collaborator Contribution Kromek have engineering expertise and have previously delivered a DARPA radiation detector contract.
Impact Successful funding application to DARPA. Continued discussions on future work.
Start Year 2018
 
Description Machine Intelligence Garage - Digital Catapult 
Organisation Digital Catapult
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution EI will be contributing to this programme by supporting SMEs in the life sciences sector with expertise in how to enable the most efficient way to process large and complex data, provide repository platforms for data and software distribution, publication, and large-scale data visualisation.
Collaborator Contribution Machine Intelligence Garage programme provides support to SMEs who are seeking access to expertise and computational power to accelerate their growth (develop new products and services).
Impact EI signed an MoU with Digital Catapult Machine Intelligence Garage https://www.migarage.ai/about-mi-garage/ to support their programme.
Start Year 2020
 
Description Molecular Medicine Catapult - Psychiatry Consortium 
Organisation Medicines Discovery Catapult
Country United Kingdom 
Sector Private 
PI Contribution As part of the Psychiatry Consortium, my group will lead all the bioinformatic analyses for the project, aiming to identify transcripts with tissue-specific expression for a primary candidate in neuropsychiatric disorder. The aim is to identify proteins with brain-specific expression that can be used as targets for drug development.
Collaborator Contribution The University of Oxford leads on the molecular work, generating the data that are analysed by my group. As industrial partners, lead scientists at Boehringer Ingelheim and Biogen provide feedback on their needs, the selection of transcripts to pursue, further analyses.
Impact The project started on December 1st 2020; it is too early to list outputs. The major outcome is the funding to support the work as part of the project (£120,000). The collaboration is highly multidisciplinary, including molecular biology, neurobiology and bioinformatics.
Start Year 2020
 
Description The 200 mammals project: sequencing genomes by a novel cost-effective method, yielding a high resolution annotation of the human genome. 
Organisation Broad Institute
Country United States 
Sector Charity/Non Profit 
PI Contribution Part of the consortium which is analysing the data.
Collaborator Contribution Proposal submitted to NHGRI and funded
Impact In combination with the ~50 already existing high-quality placental mammalian assemblies, the project produced the sequence of one placental mammal per family, for a total of 150 species. The new assembly method, DISCOVAR de novo, was used to produce a good-quality novel genome assembly using only a single sequencing library type.
Start Year 2016
 
Description UEA soil metagenomics 
Organisation University of East Anglia
Department School of Pharmacy
Country United Kingdom 
Sector Academic/University 
PI Contribution We are developing bioinformatic pipelines for comparative analysis of complex microbial communities in cropland soils to evaluate the impact of pesticide treatments on soil's services (e.g. long term fertility).
Collaborator Contribution The partners provided the data generation costs, including farm access negotiations, fieldwork and data sequencing.
Impact An ongoing pilot feasibility study on new techniques for soil quality analysis.
Start Year 2020
 
Description University of Los Andes, Universidad del Rosario and Earlham Institute (RAEng partnership) 
Organisation Del Rosario University
Country Colombia 
Sector Academic/University 
PI Contribution This is a consortium to promote capacity building in data science in Colombia. Funding was awarded by the Newton Fund and the Royal Academy of Engineering.
Collaborator Contribution This is a consortium to promote capacity building in data science in Colombia. Funding was awarded by the Newton Fund and the Royal Academy of Engineering.
Impact Two training workshops held at the Uniandes campus in 2019; one Data Science summer school in 2020.
Start Year 2020
 
Description University of Los Andes, Universidad del Rosario and Earlham Institute (RAEng partnership) 
Organisation University of the Andes
Country Colombia 
Sector Academic/University 
PI Contribution This is a consortium to promote capacity building in data science in Colombia. Funding was awarded by the Newton Fund and the Royal Academy of Engineering.
Collaborator Contribution This is a consortium to promote capacity building in data science in Colombia. Funding was awarded by the Newton Fund and the Royal Academy of Engineering.
Impact Two training workshops held at the Uniandes campus in 2019; one Data Science summer school in 2020.
Start Year 2020
 
Description Wheat Information System (WheatIS) 
Organisation Cold Spring Harbor Laboratory (CSHL)
Country United States 
Sector Charity/Non Profit 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation EMBL European Bioinformatics Institute (EMBL - EBI)
Country United Kingdom 
Sector Academic/University 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation French National Institute of Agricultural Research
Department INRA Versailles
Country France 
Sector Academic/University 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation Helmholtz Association of German Research Centres
Department Helmholtz Zentrum Munchen
Country Germany 
Sector Academic/University 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation International Centre for Maize and Wheat Improvement (CIMMYT)
Country Mexico 
Sector Charity/Non Profit 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation Monogram Network
Sector Academic/University 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation Rothamsted Research
Country United Kingdom 
Sector Academic/University 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation U.S. Department of Agriculture USDA
Department Agricultural Research Service
Country United States 
Sector Public 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation University of Bristol
Country United Kingdom 
Sector Academic/University 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation University of California, Davis
Department UC Davis College of Biological Sciences
Country United States 
Sector Academic/University 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Description Wheat Information System (WheatIS) 
Organisation University of Western Australia
Country Australia 
Sector Academic/University 
PI Contribution The Grassroots infrastructure (https://grassroots.tools) developed at EI is being used to consolidate data and analyses, facilitating consistent approaches to generating, processing and disseminating public wheat datasets. The Grassroots infrastructure comprises: a data management layer to provide structure to unstructured filesystems; interfaces to interact with local or cloud-based analysis platforms; a search layer to provide multi-faceted metadata and literature querying; a web server layer to deliver content and provide access to public programmatic interfaces. EI has an extensive National Capability to provide scientific computing hardware to the UK research community and is therefore perfectly positioned to build a point-of-access to previously disparate resources to serve wheat breeders, biologists and bioinformaticians. Coupling the Grassroots project with BBSRC-funded efforts to bring Galaxy and CyVerse UK to UK researchers provides community standardised methodologies for data integration, interpretation and discovery in wheat. These resources are designed to be queried programmatically, and we are integrating them with other WheatIS resources (such as CerealsDB) accordingly via open source and freely available infrastructure. By doing so we will be promoting and facilitating an inclusive and collaborative community of experts to provide access to an interconnected network of wheat data to a scale that was simply not available previously. EI also has representation on the WheatIS Expert Working Group, meeting yearly at PAG to discuss strategy and policy for the Wheat Initiative.
Collaborator Contribution All WheatIS partners contribute to the global effort in harmonising, standardising, and sharing wheat data in a way that is technically sensible and user focused, thus minimising cost across a multi-faceted and independently funded project.
Impact This collaboration is multi-disciplinary in scope, undertaken by biologists, bioinformaticians, and breeders. Wheat Data Interoperability Guidelines - https://ist.blogs.inra.fr/wdi/
Start Year 2011
 
Title Air-Seq - UK patent application 1 
Description Method and Apparatus for Detecting Pathogens 
IP Reference GB2200151.5 
Protection Patent / Patent application
Year Protection Granted 2022
Licensed No
Impact Option to license to a UK company Kromek PLC for exploitation in military and defence lapsing in April 2023. Discussing commercialisation via a spin-out for a range of applications from agriculture, to healthcare, to defence and security.
 
Title Air-Seq - UK patent application 2 
Description Methods for extraction and sequencing of nucleic acid 
IP Reference GB2210963.1 
Protection Patent / Patent application
Year Protection Granted 2022
Licensed No
Impact Option to license to a UK company Kromek PLC for exploitation in military and defence lapsing in April 2023. Discussing commercialisation via a spin-out for a range of applications from agriculture, to healthcare, to defence and security.
 
Title Air-seq IP 
Description Methods for DNA extraction, sequencing and analysis of aerosol samples. 
IP Reference  
Protection Protection not required
Year Protection Granted 2018
Licensed Yes
Impact As part of the DARPA funding (see funding), we are licensing our Air-seq technology to Kromek to build devices for biological threat monitoring.
 
Title CROPQUANT 
Description Data processing of images of a crop 
IP Reference  
Protection Trade Mark
Year Protection Granted 2019
Licensed No
Impact The inventors have moved on from EI and there is limited potential for exploitation and impact realisation within EI. The patent was assigned to the National Institute of Agricultural Botany (NIAB) in August 2020. Any future impacts, including licensing revenue, will be realised by NIAB, and any potential revenues will be shared with the Earlham Institute.
 
Title DATA PROCESSING OF IMAGES OF A CROP 
Description A method of processing images of a crop, particularly a cereal crop, is described. The method comprises retrieving a series of images of a crop (3; Fig. 2) captured over a period of time and identifying, in an image selected from the series of images to be used as reference image, a reference system against which other images can be compared, the reference system including an extent of a crop plot (111; Fig. 23) and/or one or more reference points, such as height markers (107; Fig. 21). The method also comprises, for each of at least one other image in the series of images, calibrating the image using the reference system, and determining a height of a canopy of the crop in the image, a main orientation of the crop and/or a value indicative of vegetative greenness. 
IP Reference WO2018234733 
Protection Patent application published
Year Protection Granted 2018
Licensed No
Impact The inventors have moved on from EI and there is limited potential for exploitation and impact realisation within EI. The patent was assigned to the National Institute of Agricultural Botany (NIAB) in August 2020. Any future impacts, including licensing revenue, will be realised by NIAB, and any potential revenues will be shared with the Earlham Institute.
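The description above mentions "a value indicative of vegetative greenness" without saying how it is computed. As a rough illustration only, the sketch below uses the Excess Green index (ExG = 2g - r - b on normalised colour channels), a common greenness measure that is swapped in here as an assumption; it is not claimed to be the method used in the patent. The function names are hypothetical.

```python
def excess_green(r, g, b):
    """Excess Green index for one RGB pixel, using normalised channels.

    Assumption: ExG stands in for the unspecified greenness value.
    """
    total = r + g + b
    if total == 0:
        # A pure black pixel carries no colour information.
        return 0.0
    rn, gn, bn = r / total, g / total, b / total
    return 2 * gn - rn - bn


def mean_greenness(pixels):
    """Average ExG over an iterable of (R, G, B) tuples."""
    values = [excess_green(r, g, b) for r, g, b in pixels]
    return sum(values) / len(values) if values else 0.0


if __name__ == "__main__":
    canopy = [(40, 160, 30), (60, 140, 50), (90, 90, 90)]
    print(round(mean_greenness(canopy), 3))
```

Higher values indicate greener pixels; grey pixels score close to zero and red-dominated pixels score negative.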
 
Title DATA PROCESSING OF IMAGES OF A CROP 
Description Data processing of images of a crop 
IP Reference EP3642792 
Protection Patent application published
Year Protection Granted 2020
Licensed No
Impact This patent application was abandoned in favour of the PCT patent application.
 
Title Data Processing of images of a crop 
Description A method of processing images of a crop, particularly a cereal crop, is described. The method comprises retrieving a series of images of a crop captured over a period of time, selecting from the series of images one to serve as a reference and identifying in said image a reference system against which other images can be compared, the reference system including an extent of a crop plot and/or one or more reference points, such as height markers which may be a ranging pole. The method also comprises, for each of at least one other image in the series of images, calibrating the image using the reference system, and determining a height of a canopy of the crop in the image, a main orientation of the crop and/or a value indicative of vegetative greenness. A quality measure may be made of the images in the sequence, possibly involving brightness, size of image file, sharpness or the proportion of dark areas in the image, to determine if an image is to be included in the images to be processed. The image capture device may be a camera and the images may be transferred to a processing unit via a wireless network with only images meeting the quality requirement being transferred.
IP Reference GB2553631 
Protection Patent granted
Year Protection Granted 2018
Licensed Yes
Impact The patent was formally licensed to Nanjing Agricultural University in China, including for commercial use in South East Asia, but we have no further reports of additional impacts. Applications for European and Chinese patents were filed in December 2019. https://worldwide.espacenet.com/publicationDetails/biblio?DB=EPODOC&II=0&ND=3&adjacent=true&FT=D&date=20180314&CC=GB&NR=2553631A&KC=A
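The quality measure in the description (brightness, sharpness, proportion of dark areas) could be sketched roughly as follows. This is a minimal illustration, assuming a grayscale image held as a list of rows of 0-255 integers; the thresholds, the gradient-based sharpness score and all function names are assumptions chosen for the example, not values or methods taken from the patent.

```python
def mean_brightness(image):
    """Mean grey level of the image."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def dark_fraction(image, dark_threshold=40):
    """Proportion of pixels darker than dark_threshold (assumed value)."""
    pixels = [p for row in image for p in row]
    return sum(1 for p in pixels if p < dark_threshold) / len(pixels)


def sharpness(image):
    """Mean squared difference between neighbouring pixels.

    A simple stand-in for a sharpness score: blurred images have
    small differences between neighbours.
    """
    height, width = len(image), len(image[0])
    total, count = 0, 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                total += (image[y][x + 1] - image[y][x]) ** 2
                count += 1
            if y + 1 < height:
                total += (image[y + 1][x] - image[y][x]) ** 2
                count += 1
    return total / count if count else 0.0


def passes_quality(image, min_brightness=50, max_dark_fraction=0.5,
                   min_sharpness=10.0):
    """Decide whether an image is worth transferring for processing."""
    return (mean_brightness(image) >= min_brightness
            and dark_fraction(image) <= max_dark_fraction
            and sharpness(image) >= min_sharpness)


if __name__ == "__main__":
    night_shot = [[10, 12], [11, 9]]
    print(passes_quality(night_shot))
```

Running the check on the capture device before transfer matches the idea that only images meeting the quality requirement are sent over the wireless network.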
 
Title API for programmatic profile creation and manifest validation 
Description API for programmatic profile creation and manifest validation 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact REST webservice can be used for creation of profiles and validation of manifests 
URL https://github.com/collaborative-open-plant-omics/COPO
 
Title ASG Submissions in COPO 
Description Implementation of a pipeline to create and submit ASG-type samples 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact Samples from the ASG group can now be contributed to the DTOL project 
 
Title ASG submission pipeline 
Description The Darwin Tree of Life pipeline for metadata validation and sample submission to ENA was used as a baseline to deploy a similar pipeline for the Aquatic Symbiont Genome Project. 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact In the first 10 months, over 2700 ASG samples were submitted through COPO 
URL http://copo-project.org
 
Title Adaptive Sequencing Enrichment model - hosting on the CyVerse UK e-infrastructure 
Description A Shiny web application that provides an interactive implementation of a nanopore adaptive sequencing model. 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact Enables users to plan their own adaptive sequencing experiments. 
URL https://readuntil.cyverseuk.org
 
Title Aequatus 
Description Aequatus Browser is an open-source web-based tool developed at EI to visualise homologous gene structures among differing species or subtypes of a common species. Aequatus uses the Ensembl Compara and Core database schemas to store comparative information between organisms. Aequatus uses precalculated gene family information and genomic alignments data in the form of CIGAR strings, from Ensembl Compara or the GeneSeqToFamily pipeline, and cross-references these sequences to Ensembl Core databases for each species to gather genomic feature information via stable_ids. Aequatus then processes the comparative and feature data to provide a visual representation of the phylogenetic and structural relationships among the set of chosen species. The ultimate goal of the Aequatus Browser is to provide a unique and informative way to render and explore complex relationships between genes from various species at a level that has so far been unrealised. 
Type Of Technology Webtool/Application 
Year Produced 2017 
Open Source License? Yes  
Impact New developments to Aequatus have been..... 
URL http://aequatus.earlham.ac.uk/
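To illustrate how alignment data stored as CIGAR strings can be turned back into a gapped sequence, here is a minimal sketch. It assumes a compact CIGAR convention in which an optional count precedes each operation letter, `M` consumes residues from the ungapped sequence and `D` inserts gap characters; the function names are hypothetical and this is not the Aequatus implementation.

```python
import re

# One token = optional count followed by a single operation letter.
_CIGAR_TOKEN = re.compile(r"(\d*)([A-Za-z])")


def parse_cigar(cigar):
    """Split a CIGAR string into (count, operation) pairs."""
    ops = []
    pos = 0
    for match in _CIGAR_TOKEN.finditer(cigar):
        if match.start() != pos:
            raise ValueError(f"unexpected characters at offset {pos}")
        pos = match.end()
        count = int(match.group(1)) if match.group(1) else 1
        ops.append((count, match.group(2).upper()))
    if pos != len(cigar):
        raise ValueError(f"unexpected characters at offset {pos}")
    return ops


def apply_cigar(sequence, cigar):
    """Expand an ungapped sequence into its aligned (gapped) form."""
    aligned = []
    index = 0
    for count, op in parse_cigar(cigar):
        if op == "M":
            aligned.append(sequence[index:index + count])
            index += count
        elif op == "D":
            aligned.append("-" * count)
        else:
            raise ValueError(f"unsupported operation {op!r}")
    if index != len(sequence):
        raise ValueError("CIGAR does not match the sequence length")
    return "".join(aligned)


if __name__ == "__main__":
    print(apply_cigar("MKTLLV", "2M3D4M"))
```

Expanding every member of a gene family this way gives rows of equal length, which is what a viewer needs in order to draw homologous regions in register.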
 
Title Alvis 
Description Tool to produce a range of production quality alignment diagrams based on the output of common aligner tools. Will also spot chimeric contigs/reads. 
Type Of Technology Software 
Year Produced 2019 
Open Source License? Yes  
Impact An earlier version was used to create diagrams for the RenSeq paper. The tool is currently being used for analysis of genome assembly quality within EI. A paper on the tool will be submitted during 2019. 
 
Title Annotation of PDFs and Spreadsheets 
Description Tools for annotating spreadsheet documents and PDFs with ontology terms from the OLS were updated and, in some cases, implemented. 
Type Of Technology Webtool/Application 
Year Produced 2019 
Open Source License? Yes  
Impact Rich metadata can be added to otherwise simple tabular or textual data files and shared. 
 
Title Assembly Submission Pathway 
Description A new assembly submission pathway has been created and added to the COPO codebase. This allows genomic assemblies to be submitted into the ENA. 
Type Of Technology Webtool/Application 
Year Produced 2023 
Open Source License? Yes  
Impact This will allow users to more quickly and easily make their assemblies public. 
URL http://copo-project.org
 
Title Automatic Software Builds 
Description COPO is now automatically built by Docker Hub when a correctly formatted tag is submitted to GitHub. Notifications are sent from Docker Hub to Slack/Discord when the build completes, which means it is now extremely rapid to deploy new code to all COPO servers. 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact Bug fixes and new features can be implemented and deployed to COPO live services within 10-20 minutes. 
 
Title CGCore Schema 
Description Schema is a template designed to collect all metadata relating to research outputs produced by the CGIAR institutes. Our group helped design the specification. 
Type Of Technology Software 
Year Produced 2019 
Open Source License? Yes  
Impact This schema is being deployed to the CGIAR centers imminently. Once done, it will form the basis of data collection for 15 research centers around the globe employing in excess of 8000 scientists. 
 
Title CGCore Wizard 
Description Based on the CGCore specification, the wizard is an implementation of the template in the COPO platform. It enables researchers to actually record the metadata relating to CGIAR research objects. 
Type Of Technology Webtool/Application 
Year Produced 2019 
Open Source License? Yes  
Impact Researchers from the CGIAR institutes will be able to record their metadata based on the CGCore schema. CGIAR encompasses 15 institutes worldwide employing over 8000 researchers. 
URL https://copo-project.org/copo
 
Title CKAN Workflow for data deposition 
Description In collaboration with the CGIAR centers, we developed a workflow for depositing heterogeneous data into the CKAN repository along with appropriate metadata extracted from CGCore metadata fragments. CKAN is an open source, mature federated solution for storing, sharing and disseminating data objects and metadata. 
Type Of Technology Webtool/Application 
Year Produced 2019 
Open Source License? Yes  
Impact This will allow COPO to be the main access point of metadata annotation and data deposition for the CGIAR institutes. This is a major conglomeration of research stations around the globe responsible for many agricultural advances in the developing world since the end of the second world war. From this work, we can expect to broker tens of thousands of documents, data sets or other research objects from CGIAR researchers. 
 
Title COPO - release of real time stats and histogram 
Description For ease of reporting and impact assessment, graphs and stats are now visible on the COPO website 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact Easier reporting to the board and in presentations 
 
Title COPO - release of validation and pipeline for DTOL SOP version 2.3 and ASG 2.3.1 
Description COPO was updated to validate the new manifest versions (2.3 for Darwin Tree of Life and 2.3.1 for the Aquatic Symbiont Genome project) 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact The DTOL and ASG communities successfully updated their SOPs and use COPO as a metadata validation, storage and submission broker 
 
Title COPO - samples updates for ERGA and acceptance of rejected samples 
Description ERGA users can amend previously uploaded manifests as long as validation passes. DTOL/ASG and ERGA managers are also now able to accept previously rejected samples (which may have been rejected by mistake, or which have been amended and become acceptable). 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact The ability for users to perform updates, albeit only in some scenarios, has hugely reduced the burden on both the COPO team and the ERGA coordinators in dealing with custom support requests. Similarly, the ability for managers to correct errors from the interface reduces the number of custom support requests to the COPO team, who can instead develop new features 
URL http://copo-project.org
 
Title COPO - taxonomy edge cases validation 
Description COPO validation of DTOL/ASG manifests was updated to take into account edge cases in species for which NCBI does not return a genus, a family, or both 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact This has allowed COPO to return errors for manifests that are not allowed to be submitted in the context of the DTOL and ASG projects, and to validate samples from taxa that do not have an allocated family 
URL http://copo-project.org
 
Title COPO - ERGA manifest validation and submission 
Description COPO was adopted as a validation and submission metadata broker by the ERGA (European Reference Genome Atlas) community. A separate profile type, user group and validation for the project were developed and released 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact The ERGA community has been using COPO since January and received funding for further research having been able to prove the feasibility of the project 
URL http://copo-project.org
 
Title COPO API 
Description HTTP REST methods for accessing sample information. 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact This is the mechanism by which the Sanger STS pulls sample information, which is then used by many other DToL partners. 
 
Title COPO Stats Generator 
Description Daily stats for COPO samples, profiles, data files and users are collected, stored and available as lists in the API 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact Up-to-date figures and graphs are always available for COPO reporting 
 
Title COPO developer documentation 
Description Extensive development documentation has been created or updated describing how to deploy and use the COPO system. 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact Allows potential system administrators to deploy an instance of COPO themselves. It was also required because comments on a paper under review noted that there was not enough information to deploy COPO. 
 
Title Collaborative Open Plant Omics (COPO) 
Description COPO streamlines the process of data deposition to public repositories by hiding much of the complexity of metadata capture and data management from the end-user. The ISA infrastructure (www.isa-tools.org) is leveraged to provide the interoperability between metadata formats required for seamless deposition to repositories. COPO facilitates the links to data analysis platforms such as CyVerse UK and Galaxy. Logical groupings of artefacts (e.g. PDFs, raw data, contextual supplementary information) relating to a body of work are stored in COPO collections and represented by common standards, which are publicly searchable. Bundles of multiple data objects themselves can then be deposited directly into public repositories through COPO interfaces. This improvement output represents the beta release of the COPO platform in 2017. 
Type Of Technology Webtool/Application 
Year Produced 2017 
Open Source License? Yes  
Impact COPO has been added to the ELIXIR-UK roadmap for ELIXIR core data services, and is currently being used by EI and JIC researchers to deposit real, large scale sequencing datasets into the European Nucleotide Archive. COPO is also being investigated as a potential data entry tool for the CGIAR Big Data project, and this will be explored in a joint EAGER submission with CIMMYT. COPO has also been selected to act as one of the data ingestion pipelines for data arising from the Designing Future Wheat programme, depositing open data into the Grassroots repository. COPO is also being included in grant submissions to assist vertebrate and wheat communities in effective metadata management. COPO runs within the CyVerse UK National Capability infrastructure. 
URL https://copo-project.org
 
Title Contribution to Image Processing material 
Description Added figures and description of 2D Gaussian blur 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact Participants will hopefully have a better understanding of how this technique works. 
URL https://datacarpentry.org/image-processing/06-blurring/index.html
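For readers unfamiliar with the technique, a 2D Gaussian blur replaces each pixel with a weighted average of its neighbours, with weights proportional to exp(-(x^2 + y^2) / (2 sigma^2)). The sketch below is a minimal pure-Python illustration written for this summary, not the code used in the lesson material; the default radius of 3 sigma and the edge handling (repeating border pixels) are assumptions.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Build a normalised (2*radius + 1) square Gaussian kernel."""
    if radius is None:
        # Assumption: truncate the kernel at about three standard deviations.
        radius = max(1, math.ceil(3 * sigma))
    weights = [
        [math.exp(-(x * x + y * y) / (2 * sigma * sigma))
         for x in range(-radius, radius + 1)]
        for y in range(-radius, radius + 1)
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def gaussian_blur(image, sigma, radius=None):
    """Blur a grayscale image given as a list of rows of numbers."""
    kernel = gaussian_kernel(sigma, radius)
    r = len(kernel) // 2
    height, width = len(image), len(image[0])
    blurred = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for ky in range(-r, r + 1):
                for kx in range(-r, r + 1):
                    # Clamp coordinates so border pixels are repeated.
                    yy = min(max(y + ky, 0), height - 1)
                    xx = min(max(x + kx, 0), width - 1)
                    acc += image[yy][xx] * kernel[ky + r][kx + r]
            blurred[y][x] = acc
    return blurred


if __name__ == "__main__":
    spot = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
    for row in gaussian_blur(spot, sigma=1.0, radius=1):
        print([round(v, 1) for v in row])
```

Because the kernel weights sum to one, a uniform image is left unchanged, while an isolated bright pixel is spread over its neighbours.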
 
Title Creation of Deployment architecture 
Description The deployment architecture is based on Docker Swarm hosted on the CyVerse UK virtual infrastructure. This provides a robust and dynamic deployment system allowing load balancing between virtual servers, whilst providing convenience and security. 
Type Of Technology Webtool/Application 
Year Produced 2018 
Open Source License? Yes  
Impact COPO has over 99% uptime meaning that our users are provided with a reliable and accessible service around the globe and around the clock. 
URL https://copo-project.org/copo
 
Title Creation of MIAPPE wizard 
Description Schema implemented in wizard format in COPO to allow collection of plant phenotyping experimental metadata. 
Type Of Technology Webtool/Application 
Year Produced 2017 
Open Source License? Yes  
Impact Users can now record their metadata in this format. Phenotyping experiments are an important part of crop science, and to be able to collect full metadata in such a manner as this is very important. 
 
Title Creation of institutional repos architecture 
Description This piece of work allows for the deposition of items to institutional repository instances. This is important since researchers are increasingly looking to smaller scale, institutionally hosted instances of off-the-shelf repository solutions such as Dataverse, CKAN and DSpace. 
Type Of Technology Webtool/Application 
Year Produced 2018 
Open Source License? Yes  
Impact Allows users to enter details of their institutional repo instances and have many users submit there. 
 
Title Creation of shared profile architecture 
Description This allows users to share profiles with other users of their choice to facilitate group editing of metadata. 
Type Of Technology Webtool/Application 
Year Produced 2018 
Open Source License? Yes  
Impact This was a much requested feature to allow collaborations between disparate research groups. 
 
Title DFW cloud HPC resources 
Description Designing Future Wheat researchers are able to request virtual machines within CyVerse UK to undertake bioinformatics analysis. 
Type Of Technology Grid Application 
Year Produced 2019 
Impact We have produced a robust and secure cloud framework within CyVerse UK to allow DFW researchers to access DFW and public data to analyse, as well as upload their own. We have already completed two successful pilot projects with external collaborators, and are now making the services available to all DFW researchers. 
URL http://cyverseuk.org/about/collaborations/designing-future-wheat/
 
Title DSPACE repository deposition workflow 
Description In collaboration with the CGIAR centers, we developed a workflow for depositing heterogeneous data into the DSpace repository along with appropriate metadata extracted from CGCore metadata fragments. DSpace is developed by DuraSpace and is widely used in academia to deposit and disseminate research objects. 
Type Of Technology Webtool/Application 
Year Produced 2019 
Open Source License? Yes  
Impact This will allow COPO to be the main access point of metadata annotation and data deposition for the CGIAR institutes. This is a major conglomeration of research stations around the globe responsible for many agricultural advances in the developing world since the end of the second world war. From this work, we can expect to broker tens of thousands of documents, data sets or other research objects from CGIAR researchers. 
 
Title DToL Supervisor View 
Description DToL Supervisors can see samples which are pending, accepted or rejected. Pending samples can be accepted or rejected. If accepted, a background task compiles all samples to be accepted, sends them to ENA for biosampling and then makes them available via the API for services such as Sanger STS. 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact This is an integral part of the DToL workflow, and allows supervisory oversight of sample metadata before it enters the DToL ecosystem 
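The review flow described above (pending samples are accepted or rejected, and accepted ones are then compiled for submission) can be pictured as a small state machine. This is a hedged sketch only: the `Status`, `Sample`, `review` and `collect_accepted` names are hypothetical and do not come from the COPO code base, and the actual ENA submission step is not modelled.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


class Sample:
    """Hypothetical stand-in for a sample record awaiting supervisor review."""

    def __init__(self, sample_id):
        self.sample_id = sample_id
        self.status = Status.PENDING


def review(sample, accept):
    """Move a pending sample to accepted or rejected; reviewed samples are not re-reviewed."""
    if sample.status is not Status.PENDING:
        raise ValueError(f"sample {sample.sample_id} has already been reviewed")
    sample.status = Status.ACCEPTED if accept else Status.REJECTED
    return sample


def collect_accepted(samples):
    """Gather the ids of accepted samples, as a background task might before submission."""
    return [s.sample_id for s in samples if s.status is Status.ACCEPTED]
```

Treating accepted and rejected as final states in this sketch is an assumption; the description does not say whether a decision can be reversed.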
 
Title Deposition workflow to Dataverse Repository 
Description In collaboration with the CGIAR centers, we developed a workflow for depositing heterogeneous data into the Dataverse repository along with appropriate metadata extracted from CGCore metadata fragments. Dataverse is developed by Harvard University and is widely used in academia to deposit and disseminate research objects. 
Type Of Technology Webtool/Application 
Year Produced 2019 
Open Source License? Yes  
Impact This will allow COPO to be the main access point of metadata annotation and data deposition for the CGIAR institutes. This is a major conglomeration of research stations around the globe responsible for many agricultural advances in the developing world since the end of the second world war. From this work, we can expect to broker tens of thousands of documents, data sets or other research objects from CGIAR researchers. 
 
Title Dublin Core Integration 
Description A Dublin Core schema and an implementing wizard were created, allowing researchers to record metadata for their outputs in this community-standard format. 
Type Of Technology Webtool/Application 
Year Produced 2018 
Open Source License? Yes  
Impact Some users have reported using this feature. Dublin Core is a well recognized community standard used by many data producers and repositories. 
 
Title ENA sequence read submission manifest 
Description ENA sequence submissions can be carried out by filling out a spreadsheet, from which the required metadata is then parsed. 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact This should make submissions easier and more reliable, by allowing users to use spreadsheet software to fill out large tables of data. 
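As an illustration of spreadsheet-driven submission, the sketch below parses a manifest exported as CSV and checks that the required columns are present and filled in. It is not COPO's parser: the column names in `REQUIRED_COLUMNS` and the `parse_manifest` function are hypothetical, and a real manifest would carry many more fields.

```python
import csv
import io

# Hypothetical required columns, chosen only for illustration.
REQUIRED_COLUMNS = ("sample_name", "library_strategy", "file_name")


def parse_manifest(text):
    """Parse CSV manifest text into a list of dicts, rejecting missing or empty fields."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"manifest is missing columns: {', '.join(missing)}")
    rows = []
    # Data starts on line 2 because line 1 holds the column headers.
    for line_no, row in enumerate(reader, start=2):
        empty = [c for c in REQUIRED_COLUMNS if not (row[c] or "").strip()]
        if empty:
            raise ValueError(f"line {line_no}: empty value for {', '.join(empty)}")
        rows.append({c: row[c].strip() for c in REQUIRED_COLUMNS})
    return rows
```

Reporting the line number with each error lets a user go straight back to the offending row in their spreadsheet software.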
 
Title Filtering of manifest view by sequencing center 
Description Users can select and filter which sequencing center their data will be handled by 
Type Of Technology Webtool/Application 
Year Produced 2023 
Open Source License? Yes  
Impact ease of use by sample managers 
 
Title Front-end efficiency improvements 
Description Front-end pagination of lists in the Accept/Reject view has greatly increased the responsiveness and usability of this view. 
Type Of Technology Webtool/Application 
Year Produced 2023 
Open Source License? Yes  
Impact usability of COPO improved for sample supervisors 
URL https://github.com/collaborative-open-plant-omics/COPO
 
Title Gene Align and Family Aggregator 
Description Gene Align and Family Aggregator (GAFA) generates an SQLite database that can be visualised with Aequatus, an open-source homology browser developed with novel rendering approaches to visualise homologous, orthologous and paralogous gene structures. 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact Since the upload to the Galaxy ToolShed, this tool has been cloned / installed 31 times 
URL https://toolshed.g2.bx.psu.edu/view/earlhaminst/gafa/
 
Title GeneSeqToFamily 
Description GeneSeqToFamily represents the Ensembl Compara pipeline as a set of interconnected Galaxy tools, so they can be run interactively within the Galaxy's user-friendly workflow environment while still providing the flexibility to tailor the analysis by changing configurations and tools if necessary. Additional tools allow users to subsequently visualise the gene families produced by the workflow, using the Aequatus.js interactive tool, which has been developed as part of the Aequatus software project. 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact This new workflow has been used for a number of research projects at the Earlham Institute, including the investigation of gene families for the koala genome, in collaboration with Kathy Belov, Wilfried Haerty, Will Nash and Federica di Palma. The workflow was published as a preprint on bioRxiv in 2017 and subsequently accepted into the GigaScience journal in 2018. 
 
Title General updates to user documentation 
Description Updates to user documentation to allow users to solve their own queries 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact Less time spent on dealing with user tickets, more time on development. 
URL https://github.com/collaborative-open-plant-omics
 
Title Image uploads 
Description A pipeline to upload associated image files from specimens to the Bioimage Archive. 
Type Of Technology Webtool/Application 
Year Produced 2023 
Open Source License? Yes  
Impact ToL users are now able to broker images to the Bioimage Archive through COPO 
URL https://github.com/collaborative-open-plant-omics/COPO
 
Title Implementation of DToL SOP v2.4 
Description COPO implementation of 2.4 SOP, including validation, ingestion and submission. 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact Collectors and curators can now submit manifest metadata against v2.4 
URL https://github.com/collaborative-open-plant-omics/COPO
 
Title Integration Testing 
Description Integration tests using the Selenium framework cover many of the major pieces of functionality in COPO. 
Type Of Technology Webtool/Application 
Year Produced 2023 
Open Source License? Yes  
Impact More robustness 
URL https://github.com/collaborative-open-plant-omics
 
Title Integration with red list sample services 
Description COPO will query services such as EUHabitat. If a sample is protected, COPO will ensure that the sample curator views the collection permit. 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact ToL samples will be in compliance. 
 
Title Legopore 
Description Software to control the Lego DNA sequencer model built for public engagement. 
Type Of Technology Software 
Year Produced 2018 
Open Source License? Yes  
Impact Hundreds of members of the public, adults and children, have used the software at one of our engagement events. 
 
Title Local Contexts Hub Integration 
Description COPO can query the Local Contexts Hub service for traditional knowledge statements 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact Members of indigenous groups can protect their intellectual property whilst still contributing to ToL projects. 
URL https://github.com/collaborative-open-plant-omics/COPO
 
Title MARTi demonstration server - hosting on the CyVerse UK e-infrastructure 
Description MARTi is software for real-time analysis of metagenomic sequence data. CyVerse UK hosts two instances of MARTi: one is a demonstration server for potential users to try out, and one is a development server for sharing analysis with key collaborators. 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact Requests for access to the software. 
URL http://marti.cyverseuk.org
 
Title MISO: An open-source LIMS for NGS sequencing centres 
Description MISO (Managing Information for Sequencing Operations) is an open-source Lab Information Management System (LIMS) started at the Earlham Institute, specifically designed for tracking next-generation sequencing experiments. Sequencing centres differ not only in terms of their scale and output, but also their requirements for information management. Sequencing platforms are becoming more accessible, and the efficient storage of genomic metadata is vital for large and small sequencing centres alike. Off-the-shelf solutions are often very expensive and not cost-effective for the smaller centre. Furthermore, support contracts are often required, and the extensibility of these systems is not in the hands of the metadata generators. In terms of implementation, as well as the desire to tailor an information system in-house, data formats can change and platforms can evolve rapidly. These are valid concerns for both large centres characterised by high-throughput data production and smaller scale laboratories with constrained expenditure for IT solutions, and potentially project specific metadata requirements. Hence, we are developing MISO, an open-source LIMS for recording sequencing metadata. We are using freely available tools that are industry standard, well documented, and easy to set up on minimal hardware. As a bare system, MISO can store relevant metadata based on a wide array of NGS sequencing platforms (e.g. Illumina GA, HiSeq and MiSeq, Roche 454, ABI SOLiD and PacBio RS) and public repository data submission schemas (e.g. the Sequence Read Archive at the EBI), and has many features common to bespoke and proprietary LIMS, such as secure authentication, fine-grained access control, barcode tracking, and reporting. 
Type Of Technology Webtool/Application 
Year Produced 2018 
Open Source License? Yes  
Impact A number of new MISO releases have been made available this year, the latest being 0.2.109. The project has seen over 140 releases in its 7 year existence, and the project is 100% open source, allowing sequencing centres to have the option of a completely free solution for managing their instruments and sample tracking. 
URL https://github.com/TGAC/miso-lims/releases/tag/v0.2.109
 
Title Manifest Wizard 
Description The COPO manifest wizard allows users to select and prepopulate fields from a range of ToL manifests, then download the manifest, and reupload for validation/deposition. 
Type Of Technology Webtool/Application 
Year Produced 2023 
Open Source License? Yes  
Impact This allows users to always download the most up-to-date and compliant ToL manifest, whilst having fields such as date / location pre-populated. 
URL http://copo-project.org
 
Title Multiple Celery Queues to prevent concurrency issues 
Description Multiple Celery Queues to prevent concurrency issues 
Type Of Technology Webtool/Application 
Year Produced 2023 
Open Source License? Yes  
Impact This should prevent a bug where multiple instances of the same sample were submitted to ENA 
URL https://github.com/collaborative-open-plant-omics/COPO
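One way separate queues can help with this kind of duplication is by routing submission tasks to their own queue, which can then be consumed by a single worker with a concurrency of one. The sketch below is an assumption about the general approach, not COPO's configuration: the task names and queue names are hypothetical, and the dictionary merely mirrors the shape of Celery's `task_routes` setting without importing Celery.

```python
# Name of the queue Celery uses when no route matches.
DEFAULT_QUEUE = "celery"

# Hypothetical task names mapped to hypothetical queue names,
# laid out in the same shape as Celery's `task_routes` setting.
TASK_ROUTES = {
    "copo.tasks.validate_manifest": {"queue": "validation"},
    "copo.tasks.submit_to_ena": {"queue": "ena_submission"},
}


def queue_for(task_name):
    """Return the queue a task would be routed to, falling back to the default queue."""
    return TASK_ROUTES.get(task_name, {}).get("queue", DEFAULT_QUEUE)
```

With this layout, slow validation jobs cannot delay submissions, and submissions are processed one at a time if only one worker slot listens on the submission queue.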
 
Title NanoOK RT software tool 
Description A real-time analysis tool for metagenomic classification and identification of antimicrobial resistances from nanopore sequence data. 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact The software was developed for and initially used in our work with pre-term babies suffering from Necrotizing Enterocolitis. We are now working to apply it to a wide range of other application areas and have had discussions with a number of interested parties at national and international institutes. 
 
Title New API endpoint 
Description API endpoint allows programmatic filtering of samples by project type 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact More useful sample queries for downstream developers 
URL https://github.com/collaborative-open-plant-omics
 
Title New Graphical Dashboard 
Description AP wrote a dashboard which is able to view DToL sample metadata in COPO, faceting it by any of the fields collected, and summarising it in various plots. 
Type Of Technology Webtool/Application 
Year Produced 2023 
Open Source License? Yes  
Impact Researchers will be able to quickly and easily see visual summaries of the data we have collected. 
URL http://copo-project.org
 
Title Parsing and Validation of ASG Sample Manifests 
Description Automatic parsing, validation and ingestion of ASG (Aquatic Symbiosis Genomics) manifests is now possible. Around 10 different validations are performed on every cell of a manifest (which can easily amount to tens of thousands of cells for a single manifest) 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact Integral to the DToL Project, COPO is the entry point for all Sample metadata. 
 
Title Parsing and Validation of DTOL Sample Manifests 
Description Automatic parsing, validation and ingestion of DTOL manifests is now possible. Around 10 different validations are performed on every cell of a manifest (which can easily amount to tens of thousands of cells for a single manifest) 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact Integral to the DTOL project, COPO is the entry point for all sample data 
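The per-cell validation described in the two manifest entries above can be sketched as a table of rules applied to every cell. This is an illustrative outline only: the three validators, the `RULES` table and the choice of column names are assumptions, and the real validations (around ten per cell) are not reproduced here.

```python
import re


def not_empty(value):
    """Return an error message if the cell is blank, else None."""
    return None if value.strip() else "value is empty"


def is_date(value):
    """Check only the YYYY-MM-DD shape, not calendar validity."""
    return None if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value) else "expected YYYY-MM-DD"


def is_latitude(value):
    try:
        ok = -90.0 <= float(value) <= 90.0
    except ValueError:
        ok = False
    return None if ok else "expected a latitude between -90 and 90"


# Illustrative columns and rules; the real manifests define many more.
RULES = {
    "SPECIMEN_ID": [not_empty],
    "DATE_OF_COLLECTION": [is_date],
    "DECIMAL_LATITUDE": [is_latitude],
}


def validate_manifest(rows):
    """Apply every rule to every cell and return (row number, column, message) tuples."""
    errors = []
    for row_no, row in enumerate(rows, start=1):
        for column, checks in RULES.items():
            value = row.get(column, "")
            for check in checks:
                message = check(value)
                if message:
                    errors.append((row_no, column, message))
    return errors
```

Collecting all errors in one pass, instead of stopping at the first, lets a submitter correct a whole manifest in a single round trip.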
 
Title Permit Collection 
Description Mandatory submission of permits required based on manifest fields. 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact COPO is in compliance with EU law regarding species collection permits 
URL https://github.com/orgs/collaborative-open-plant-omics/dashboard
 
Title Pipeline for long non coding RNA identification 
Description Pipeline for identifying long non-coding RNAs, taking RNA-Seq data, a genome sequence and an annotation as input. The pipeline handles read mapping and de novo transcript assembly, compares the assembled transcripts with existing annotations to identify novel intergenic transcripts, assesses the coding potential of the novel transcripts, and reports those identified as non-coding and at least 200 nt long. 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact Long non-coding RNAs are generally associated with gene expression regulation and have previously been identified as having major roles in both animal and plant biology. The pipeline is organism agnostic, enabling the identification of long non-coding RNAs in any eukaryotic species. The aim is to make the pipeline available on Galaxy for the community to use. 
URL https://github.com/TGAC/lncRNA-analysis
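The final reporting step of the pipeline (keep novel intergenic transcripts that are non-coding and at least 200 nt long) amounts to a simple filter. The sketch below assumes the earlier steps have already produced the three properties per transcript; the `Transcript` class and `select_lncrna` function are hypothetical names, not part of the published pipeline.

```python
from dataclasses import dataclass

MIN_LENGTH_NT = 200  # length threshold stated in the pipeline description


@dataclass
class Transcript:
    """Hypothetical summary of one assembled transcript."""

    transcript_id: str
    length_nt: int
    is_intergenic: bool  # novel and not overlapping an existing annotation
    is_coding: bool      # outcome of the coding-potential assessment


def select_lncrna(transcripts, min_length=MIN_LENGTH_NT):
    """Keep intergenic, non-coding transcripts at or above the length threshold."""
    return [
        t
        for t in transcripts
        if t.is_intergenic and not t.is_coding and t.length_nt >= min_length
    ]
```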
 
Title PredictingCircadianTime 
Description We developed an ML-based pipeline to predict the circadian time (phase) at any single transcriptomic sampling time point using gene expression data from a set of marker genes. This code uses an artificial neural network (with TensorFlow). 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact https://doi.org/10.1101/2021.02.04.429826 This work was done in collaboration with IBM and is being taken forward in a collaboration with the Alan Turing Institute 
 
Title Project Modularisation 
Description COPO validations and schema field lists are now stored in versioned directories, making rolling back and forth between versions possible. 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact it should be possible to validate older versions of the manifests 
URL https://github.com/collaborative-open-plant-omics/COPO
 
Title Refactoring TOL validations 
Description Validations now carried out in a celery task 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact COPO servers are now more available 
 
Title SDG 
Description SDG is a framework to analyse sequence graphs such as those generated by various genome assemblers. It provides a workspace that can contain a graph and datastores for paired, linked and long reads. These reads can be mapped to the graph, and can be used to untangle or scaffold the graph. A SWIG API enables SDG to be used as a Python module, and there is experimental Julia and R support. 
Type Of Technology Software 
Year Produced 2018 
Open Source License? Yes  
Impact We are currently producing genome assemblies of: multiple wheat cultivars, multiple strawberry cultivars, and more. 
URL https://f1000research.com/articles/8-1490
 
Title SKM-tools 
Description These are a series of tools to compare skip-mer (cyclic spaced-seed) spectra between different datasets. They can be used to study conservation of sequence across evolutionarily distant organisms. 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact We are using skm-tools to study conservation in the context of EI's CSP and BBSRC's DFW projects. 
URL https://github.com/bioinfologics/skm-tools
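To make the term concrete: a skip-mer with parameters (m, n, k) reads m bases out of every cycle of n, until k bases have been collected. The sketch below extracts skip-mers and compares two spectra. It is not the skm-tools implementation; the function names are hypothetical, and canonical (reverse-complement) handling is deliberately left out to keep the example short.

```python
from collections import Counter


def skipmer_offsets(m, n, k):
    """Offsets of the bases used by a skip-mer: m bases from each cycle of length n."""
    if not (0 < m <= n) or k <= 0 or k % m != 0:
        raise ValueError("need 0 < m <= n and k a positive multiple of m")
    return [cycle * n + j for cycle in range(k // m) for j in range(m)]


def skipmers(sequence, m, n, k):
    """List every skip-mer in the sequence, sliding the pattern one base at a time."""
    offsets = skipmer_offsets(m, n, k)
    span = offsets[-1] + 1  # number of sequence positions the pattern covers
    return [
        "".join(sequence[start + offset] for offset in offsets)
        for start in range(len(sequence) - span + 1)
    ]


def spectrum(sequence, m, n, k):
    """Count how often each skip-mer occurs."""
    return Counter(skipmers(sequence, m, n, k))


def shared_fraction(spectrum_a, spectrum_b):
    """Fraction of distinct skip-mers in spectrum_a that also occur in spectrum_b."""
    if not spectrum_a:
        return 0.0
    shared = sum(1 for skipmer in spectrum_a if skipmer in spectrum_b)
    return shared / len(spectrum_a)
```

With m equal to n the pattern has no gaps and the result is the ordinary k-mer spectrum; with m = 2 and n = 3 every third base is ignored, so a substitution at a skipped position leaves that skip-mer unchanged.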
 
Title SMART Domains 
Description Search domains in protein sequences using SMART 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact Since it was developed, this tool has been cloned / installed 7 times. We expect more usage once it becomes part of the GeneSeqToFamily workflow. 
URL https://toolshed.g2.bx.psu.edu/view/earlhaminst/smart_domains/
 
Title SOP 2.3 
Description COPO validations conform to 2.3 version of TOL manifest 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact Data in the TOL projects is increasingly FAIR 
 
Title Schema Manifest Wizard 
Description A wizard which allows multiple samples and fields to be prepopulated and downloaded as a manifest to complete and upload. 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact This should make it easier and quicker for curators to fill out new manifests 
URL https://github.com/collaborative-open-plant-omics/COPO
 
Title Selenium Test Suite 
Description Selenium is a testing solution for remote controlling of a web browser. This is required for unit testing of front-end javascript code. Selenium unit tests have been written for several of the major deposition pathways within COPO. 
Type Of Technology Webtool/Application 
Year Produced 2023 
Open Source License? Yes  
Impact COPO is more robust and less likely to incur downtime from undiscovered bugs. 
URL http://copo-project.org
 
Title Software Architectural Design of COPO 
Description I have been responsible for the overall design and delivery of many new features in COPO over the past 12 months. Please see recent software output entries for further details. 
Type Of Technology New/Improved Technique/Technology 
Year Produced 2023 
Open Source License? Yes  
Impact The overall growth of COPO as a service 
 
Title Submission of DToL samples 
Description Pipeline for submitting sample metadata to ENA implemented. 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact This allows public access to DToL data, a fundamental goal of the project. 
 
Title Suite Ensembl REST 
Description A suite of Galaxy tools designed to work with Ensembl REST API. 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact Since it was developed, this tool has been cloned / installed 44 times. 
URL https://toolshed.g2.bx.psu.edu/view/earlhaminst/suite_ensembl_rest/
 
Title Swagger code for Validation API 
Description Swagger documentation for validation API 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact Allows users to learn and experiment with COPO API 
URL https://github.com/collaborative-open-plant-omics/COPO
 
Title Swagger for API 
Description Swagger is a documentation/experimentation framework for REST APIs. It allows users to both examine and test an API service in real time. This has been implemented for COPO. 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact Self documentation and testing of COPO API methods. Has allowed end users to get up to speed with our API quickly and also allowed bugs to be conveniently reported to us with examples. 
URL http://copo-project.org/api
 
Title TAQLORE - annotation and quantification of transcripts from long read sequencing 
Description The pipeline uses long read RNA amplicon sequencing to annotate and quantify transcripts. It maps the reads to a genome, identifies novel exons and splice sites, reconstructs transcripts, and quantifies and normalises transcript expression for visualisation. 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact The application of this pipeline is the basis of a publication (Clark et al 2020 Mol Psychiatry) and the developments as part of this research led to further funding as a partnership between academia and industry (Medicine Discovery Catapult, Psychiatry Consortium). 
URL https://taqlore.readthedocs.io/en/latest/
 
Title The Grassroots Infrastructure 
Description The Grassroots software is an open source "as-a-Service" stack that powers a number of data dissemination and analysis activities at EI, and other sites such as CerealsDB at the University of Bristol. We have continued to develop the functionality within the software stack to share crop-related datasets. 
Type Of Technology Webtool/Application 
Year Produced 2018 
Open Source License? Yes  
Impact Grassroots has previously been used to host the Field Pathogenomics project website and Yellow Rust map, the EI wheat BLAST service, the CerealsDB federation project, and the multi-scale improvements to the Polymarker marker design software. Recently, Grassroots has been put forward as the main data repository and metadata catalogue for the Designing Future Wheat project, and has started to host data from this project, the Open Wild Wheat Consortium, and 5 new wheat genomes from EI. The Grassroots service runs within the CyVerse UK National Capability infrastructure. 
URL https://grassroots.tools/
 
Title TreeBeST best 
Description Developed Galaxy wrapper for TreeBeST 'best'. TreeBeST is a tool to generate a phylogenetic tree using CDS alignment and species tree. 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact Since it was developed, this tool has been cloned / installed 31 times. 
URL https://toolshed.g2.bx.psu.edu/view/earlhaminst/treebest_best/
 
Title Update pipeline 
Description Updating of existing samples 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact TOL users can now update samples under certain conditions (mainly that samples have not been submitted to ENA) 
 
Title Updated COPO frontpage 
Description New modern design implemented. Designed by our in-house graphic designer and implemented by the COPO development team, COPO now has a fresh look, and shows realtime relevant numbers on the frontpage 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact Potential users can now see how much data COPO handles. 
URL http://copo-project.org/
 
Title Updated Deployment Pipeline 
Description DK updated our Docker deployment process to make configuring all parts straightforward. A YAML file now contains everything in a single place, and describes images to be used, hardware configs, volumes, networks, secrets etc. 
Type Of Technology Webtool/Application 
Year Produced 2023 
Open Source License? Yes  
Impact Less downtime for researchers, and easier deployments for developers. 
URL https://github.com/collaborative-open-plant-omics/COPO
 
Title Updated to ENA REST V2 
Description Updated COPO to use the newest version of ENA's REST API. 
Type Of Technology Webtool/Application 
Year Produced 2023 
Open Source License? Yes  
Impact This will make the submission process more robust as any problems in submission will be automatically rolled back. 
URL https://github.com/collaborative-open-plant-omics/COPO
 
Title Various field name changes and validation rule changes 
Description dealing with general churn of manifests including field name changes, new fields and validations 
Type Of Technology Webtool/Application 
Year Produced 2023 
Open Source License? Yes  
Impact This is ongoing maintenance which is required to keep COPO up to date and useful 
URL https://github.com/collaborative-open-plant-omics
 
Title Web Guide to COPO 
Description Visual guide with full screenshots of how to submit various datatypes to COPO 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact Allows users to get up and running with COPO. 
 
Title aequatus.js 
Description Aequatus.js is an open-source JavaScript library to visualise homologous gene structures among differing species or subtypes of a common species. It is developed as part of the Aequatus project. 
Type Of Technology Webtool/Application 
Year Produced 2017 
Open Source License? Yes  
Impact Aequatus.js can be embedded in any web-based tool without relying on the full Aequatus software. One example is its implementation within the Galaxy server at EI. 
URL https://github.com/TGAC/aequatus.js
 
Title ete 
Description Analyse phylogenetic trees using the ETE Toolkit 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact Since it was developed, this tool has been cloned / installed 33 times. 
URL https://toolshed.g2.bx.psu.edu/view/earlhaminst/ete/
 
Title hcluster_sg 
Description Developed Galaxy wrapper for hcluster_sg, a tool for hierarchical clustering on a sparse graph 
Type Of Technology Software 
Year Produced 2017 
Impact Since it was developed, it has been downloaded / cloned 70 times. 
URL https://toolshed.g2.bx.psu.edu/view/earlhaminst/hcluster_sg/
 
Title hcluster_sg_parser 
Description Converts hcluster_sg output into separate list of ids 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact Since the tool became available on the Galaxy ToolShed, it has been cloned / installed 48 times. 
URL https://toolshed.g2.bx.psu.edu/view/earlhaminst/hcluster_sg_parser/
 
Title interaction with ECS datastore 
Description Work has started to interact with EI's ECS datastore, to enable COPO to access large datafiles created by our researchers 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact this will make COPO more useful by hugely reducing the amount of data transfer needed for submissions 
 
Title stats and charts in COPO 
Description D3 based stats and charts created for COPO 
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact Users can inspect things like the number of samples, users and files, as well as metadata characteristics in ToL data 
 
Title tol barcoding 
Description lookup, acceptance and storing of barcoding data for tol samples 
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact COPO can interact with and harvest data from the BOLD database to keep an organised collection of barcoding along with sample metadata 
 
Title w2rap 
Description w2rap is a genome assembly pipeline for complex genomes from short reads. 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact W2rap has enabled wheat genomics to jump into a new era of high-quality genomes from short reads. While there are some alternative tools from private companies, w2rap remains the standard for quality reconstruction across the genome. W2rap has already been used to assemble 5 wheat genomes in the public domain, putting the UK at the forefront of wheat genomics. With tens of genomes being assembled now, new modules being developed for new data types, and 5 wheat lines assembled in a £1M private project, w2rap is one of the flagship projects for Earlham Institute. 
URL https://github.com/bioinfologics/w2rap/
 
Description 10 years of impact: Keeping living things healthy - 23/05/2019 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Website article highlighting the impact of our research so far, including work delivered through the CSP, from a report commissioned by EI (Brookdale Consulting).
Year(s) Of Engagement Activity 2019
URL http://www.earlham.ac.uk/articles/10-years-impact-keeping-living-things-healthy
 
Description 5 ways EI is improving global food security 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact 5 ways EI is improving global food security
Year(s) Of Engagement Activity 2020
URL https://www.earlham.ac.uk/articles/5-ways-earlham-institute-improving-global-food-security
 
Description £600k machine learning collaboration to supercharge data driven science 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact £600k machine learning collaboration to supercharge data driven science
Year(s) Of Engagement Activity 2020
URL https://www.earlham.ac.uk/newsroom/%C2%A3600k-machine-learning-collaboration-supercharge-data-driven...
 
Description A PhD, is it worth it? Just ask our students 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact A PhD, is it worth it? Just ask our students
Year(s) Of Engagement Activity 2020
URL https://www.earlham.ac.uk/articles/phd-it-worth-it-just-ask-our-students
 
Description ARN2: Uncovering the multi-layered regulation of autophagy From Functional Genomics to Systems Biology 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact ARN2: Uncovering the multi-layered regulation of autophagy From Functional Genomics to Systems Biology
Year(s) Of Engagement Activity 2019
 
Description Activity - DNA sequencer at Great Hockham Primary School 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact Activity at the Great Hockham Primary School
Year(s) Of Engagement Activity 2019
 
Description Activity - Pink Pigeon Trail at the Norfolk Show 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Activity Pink Pigeon Trail at the Norfolk Show
Year(s) Of Engagement Activity 2019
 
Description Activity - Where have you BEEn? 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact Activity at the Norwich Science Festival
Year(s) Of Engagement Activity 2019
 
Description Affordable genome sequencing for pathogen analysis to help tackle global epidemics 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact Blog post marking the publication of an affordable sequencing protocol, applied to the sequencing of over 10,000 Salmonella isolates.
Year(s) Of Engagement Activity 2022
URL https://www.earlham.ac.uk/newsroom/affordable-genome-sequencing-pathogen-analysis-help-tackle-global...
 
Description Agritech East Week 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Industry/Business
Results and Impact In order to stimulate a dialogue and discussion between scientists, farmers, breeders and more, we organised an interactive, show-and-tell style workshop for members of AgriTech East, who came to EI for three hours on a Tuesday evening.

The event showcased the latest advances in genome sequencing and bioinformatics, then allowed groups to explore themes relevant to modern agriculture and research together.
Year(s) Of Engagement Activity 2018
 
Description Analysing intestinal organoids in a multi-omics, systems biology framework to investigate gut health and host-microbe interactions. Scientific Advisory Board Meeting, Earlham Institute 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other audiences
Results and Impact Analysing intestinal organoids in a multi-omics, systems biology framework to investigate gut health and host-microbe interactions. Scientific Advisory Board Meeting, Earlham Institute
Year(s) Of Engagement Activity 2019
 
Description Attendance at Biocuration 2023 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Attended the conference to network and find out about the latest developments.
Year(s) Of Engagement Activity 2023
 
Description Attendance at Biodiversity Genomics 2020 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Rob Davey attended Biodiversity Genomics 2020 which brought together researchers across the world to celebrate the achievements in genome sequencing across the eukaryotic tree of life, explored current challenges and their likely solutions, and looked forward to the coming decade of the application of genomics across the globe. With major projects starting to deliver data at scale, new tools for sequencing and assembling genomes becoming available, and increased awareness of the power of whole genome data in understanding organismal biology and ecosystem processes, Biodiversity Genomics 2020 promised to be a milestone in the effort to "sequence life for the future of life".

Continuing the tradition of ambitious, collaborative science
Biodiversity Genomics 2020 continues the tradition of previous meetings of Genomes 10k, the Vertebrate Genomes Project, the Global Invertebrate Genomics Alliance, and the Earth BioGenome Project.
Year(s) Of Engagement Activity 2020
URL https://www.sanger.ac.uk/science/biodiversity-genomics-2020/
 
Description Attendance at the ELIXIR-UK All Hands 2020 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Rob Davey attended the ELIXIR-UK All Hands 2020 to take part in discussions about ongoing and possible future collaborations
Year(s) Of Engagement Activity 2020
URL https://www.earlham.ac.uk/elixir-uk-all-hands-2020
 
Description Attendance at the UK-Conference of Bioinformatics and Computational Biology 2020 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Attendance at the UK-Conference of Bioinformatics and Computational Biology 2020
The UK-CBCB conference is designed to bring together biologists, bioinformaticians, computer scientists, software engineers and data scientists across the life sciences to share innovations, applications and best practice in their fields. Rob Davey, Nicola Soranzo and Alice Minotto attended to participate in discussions and to add impact to the ongoing work in bioinformatics.
Year(s) Of Engagement Activity 2020
URL https://www.earlham.ac.uk/uk-conference-bioinformatics-and-computational-biology-2020#About-
 
Description Attended Gordon Research Conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Lightning talk and poster presentation at this conference.
Year(s) Of Engagement Activity 2023
 
Description BAMBI Diagnosing infections earlier in preterm babies with real time genomic analysis - 16/12/2019 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Website article highlighting a new method, developed in collaboration with Quadram Bioscience and NNUH, for profiling the microbiome of preterm babies that can significantly speed up the identification of infections and indicate more effective treatments. News stories highlight important updates that also have broad relevance and interest to the national and/or specialised media. This story generated 293K estimated coverage reads and 298K social shares.
Year(s) Of Engagement Activity 2019
URL https://www.earlham.ac.uk/newsroom/diagnosing-infections-earlier-preterm-babies-real-time-genomic-an...
 
Description Big Data Workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Gave talk on data management to PhD students.
Year(s) Of Engagement Activity 2023
 
Description Build, empower and amplify: bioinformatics for agri-research in Africa - 16/07/2018 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact News stories highlight important updates that also have broad relevance and interest to the national and/or specialised media. Collectively, our international reach as of March 2019 has extended across the globe, with highlight pieces in the Guardian, the BBC World Service, the Washington Post and more, as well as local TV and radio. In 2018, the estimated readership of news stories we shared was well in excess of 1 million.
Year(s) Of Engagement Activity 2018
URL https://acaciaafrica.org/build-empower-amplify-bioinformatics-agri-research-africa/
 
Description COPO training sessions for DTOL 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact We ran two training sessions on how to use COPO for Darwin Tree of Life submissions.
Year(s) Of Engagement Activity 2020
 
Description COPO training sessions for DTOL - session 2 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The session aimed to teach Darwin Tree Of Life researchers how to use COPO and how to annotate their data effectively for submission as part of the DTOL project, and to answer any questions or doubts they had.
Year(s) Of Engagement Activity 2020
 
Description CW21 + hackathon participation 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Collaborations Workshop 2021 (CW21) is a conference for research software engineers. We discussed changes in work patterns and collaborations due to the pandemic and which of them have been beneficial for our jobs. Accessibility and diversity were also discussed, with a focus on disabilities. We shared ideas about how to improve the RSE landscape and had a hackathon day to tackle the issues identified.
Year(s) Of Engagement Activity 2021
 
Description Cataloging genetic diversity in the Black-footed Ferret. Black-footed Ferret Genetics Symposium, Fort Collins, Colorado, USA, 19-20 Jan 2015. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation of EI's work on Black-footed Ferret.
Year(s) Of Engagement Activity 2015
 
Description Cheap and robust genomes using the PDABS pipeline. Advances in Genome Biology and Technology 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Advances in next-generation sequencing (NGS) technologies and the subsequent reduction in cost have enabled the research community to sequence the genomes of many non-model organisms. Genome assemblies using next generation technologies such as Illumina show high quality nucleotide level information, but are fragmented due to the inability to retain contiguity. The absence of low-cost methods for creating high quality genome sequences from a wide range of organisms currently hampers the generation of de novo genomes for comparative genomic studies, and new methodologies and data types need to be tested to achieve this goal. We have integrated NGS with nano-channel genome mapping and developed the PCR-free Discovar Assembly BioNano Scaffolding (PDABS) pipeline to assemble cheap, contiguous and robust genomes.
Year(s) Of Engagement Activity 2016
 
Description Conference talk - Air-seq: using DNA sequencing to provide early warning of airborne crop disease (REAP 2020) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Agri-TechE REAP Conference 2020: From micro-scape to landscape - Innovating at the frontier
Year(s) Of Engagement Activity 2020
URL https://www.agri-tech-e.co.uk/event/reap-conference-2020/
 
Description Conference talk: Tackling microbial diseases using MinION diagnostics and NanoOK 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Invited plenary speaker at the Bern Nanopore Technology Day.
Year(s) Of Engagement Activity 2017
 
Description Cyberinfrastructure and the Carpentries - Workshop in Bogota, Colombia 2019 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact We delivered a Carpentries instructor training workshop in Bogota, Colombia, as part of the GROW Colombia project.
Year(s) Of Engagement Activity 2018
URL https://froggleston.github.io/2019-10-22-ttt-colombia/
 
Description Darwin Tree Of Life all hands 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The meeting was held to share and discuss progress and next steps in the Darwin Tree Of Life project. Multiple working groups had the chance to meet and present their work, improving understanding of the wider process.
Year(s) Of Engagement Activity 2021
 
Description Data Carpentry 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Delivered Data Carpentry material.
Year(s) Of Engagement Activity 2022
URL https://datacarpentry.org/
 
Description Decision support systems for sustainable farming - breaking down the barriers to behaviour change - 23/07/2019 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact The aims of the workshop were threefold:
• Establish the state of play with respect to the use of data to support decision-making on farms
• Explore the barriers to more/better use of data/information/insight on farms
• Identify the scope for a pilot project involving key organisations from the Eastern Region, to gain a better understanding of on-farm decision-making and actual (as opposed to claimed or reported) behaviour with respect to the use of key inputs and their impact on performance
A total of 19 people attended the workshop (see appendix for details), with four representatives from three of the largest agricultural businesses in Norfolk, two regional agri-business consultants, one global agri-chemical manufacturer, one global farm machinery manufacturer and a regional farm machinery distributor, the Agricultural and Horticultural Development Board (AHDB), two companies working in the area of data warehousing, aggregation and visualisation, two computer scientists and a business development manager from the Earlham Institute, three senior academics (two from the School of Environment and one from Norwich Business School) and a relationship manager from the UEA.
The entire workshop was run in plenary, over three hours, split into three parts. The first re-visited the context (as summarised above), the second covered the current state of data use on farms, and the third explored the need for further research into the efficacy of data use on farms to support decision-making and improve performance.
Year(s) Of Engagement Activity 2019
 
Description Development group meeting with EI and Sanger ToL staff 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Because of COVID, this was the first time that the Sanger and EI teams had met in person. We discussed upcoming feature development, deployment and procedures relevant across this and other consortia, e.g. ERGA and BGE.
Year(s) Of Engagement Activity 2023
 
Description Dialogo industria-academia (Industry-academia dialogue) Bogota 2019 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact "Dialogo Industria-academia (Industry-academia dialogue)" was a workshop to promote partnerships between industry, universities and government, using the "triple helix model", in the areas of big data and bio-economy in Colombia. The event was held on the Universidad de Los Andes campus and run by trained facilitators from the Earlham Institute. It brought together 16 participants from industry and governmental institutions and 16 participants from universities, selected from 123 applicants. Over two days, they analysed the challenges and opportunities for the "big data" sector in Colombia and the potential role of this sector in the country's socio-economic growth. The event included opportunities for "speed networking" between industry and universities, and group activities to discuss the priorities for data-driven innovation and economic growth.

The workshop included two plenary speakers from the "Mision de Sabios", a panel of experts reporting to the President of Colombia on science and innovation policy. This allowed us, as participants and facilitators, to engage with policymakers. The main objective of the event aligns with our aim to build partnerships between industry and academia.
Year(s) Of Engagement Activity 2019
 
Description Dialogue: food and health, what matters to you? - 24/10/2019 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact "Dialogue: food and health, what matters to you?" was a new event at the Norwich Science Festival, supported by Professor Neil Hall and Dr Nicola Patron. The event invited the public to join Norwich Research Park scientists and work with them to solve the problems they see and inspire the science of the future.
Year(s) Of Engagement Activity 2019
URL https://docs.google.com/document/d/13yGwFkzh-9YDVUYMBzkhm7i_E4NnytZt74LBFrmjhNU/edit?usp=sharing
 
Description Down The Tubes! Talk at the Norwich Science Festival 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Dr Davey gave a talk on the internet and data science entitled "Down The Tubes!" at the 2018 Norwich Science Festival.
Year(s) Of Engagement Activity 2018
URL https://norwichsciencefestival.co.uk/events/down-the-tubes/
 
Description EI Innovate - 09/01/2019 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Website article highlighting the importance of knowledge exchange, reflecting on the success of the first KEC-focussed EI Innovate event, which included showcasing the CSP.
Year(s) Of Engagement Activity 2019
URL https://www.earlham.ac.uk/articles/ei-innovate-why-knowledge-exchange-important
 
Description EI Innovate event - 13/11/2019 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact The inaugural EI Innovate event showcased EI's capabilities and expertise in genomics, bioinformatics, synthetic biology, crop phenotyping and high-performance computing, including the CSP. The event aimed to help industry, academics, charities and voluntary organisations fully understand the opportunities for collaboration with EI, and how engaging with EI science and services can be of benefit.
EI Innovate: genomics data to advance bioscience, held on 13th November 2019, was a success. It was attended by 82 people: 26 EI staff who delivered talks, tours and discussion sessions, prepared posters and came to network; 31 representatives from industry (across agri-food, biotech and life sciences sectors); 2 BBSRC staff; 2 representatives from the Food Standards Agency; representatives from KTN, NALEP, National Biofilms Centre and Centre for Process Innovation; and 14 representatives from academic institutions (9 from UEA). We attracted 3 high-profile external speakers from: Natural History Museum (Director of Research); Royal Botanic Gardens Kew (Director of Science); and Oxford Nanopore Technologies (Vice President for Applications). We delivered 5 talks, prepared 11 posters and led 2 tours of Genomics Pipelines and Bio-Foundry to showcase EI expertise and capabilities.
The afternoon programme had 3 breakout sessions, aiming to explore opportunities for collaborative projects:
1. New Frontiers in Next-Generation Sequencing
2. Data Mining of the UK Tree of Life to Understand and Utilise Biodiversity of British Species
3. A guided discussion about the value of data for driving research, innovation and commercialisation
This was EI's first externally-facing KEC event, bringing together various stakeholders to learn about EI expertise and how it can add value to their projects. We received a lot of informal positive feedback and are following up on the interactions that this event enabled. A date has already been secured for November 2020.
Year(s) Of Engagement Activity 2019
URL https://www.earlham.ac.uk/innovate2019-0
 
Description EI Innovate: a platform for collaboration and new ideas 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact EI Innovate: a platform for collaboration and new ideas
Year(s) Of Engagement Activity 2020
URL https://www.earlham.ac.uk/articles/ei-innovate-platform-collaboration-and-new-ideas
 
Description EI Seminar 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Gave EI seminar
Year(s) Of Engagement Activity 2024
 
Description EI Socioeconomic impact report 2022 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact News story to publicise a socioeconomic impact report undertaken on the Institute's research. In forecasting future impact, the report was designed to demonstrate the quality and breadth of our work to our funders, stakeholders, and the wider public. Figures from the report have since been used in presentations and at other events. Feedback has been positive and demonstrates an increased awareness of EI impact and the areas of research we're engaged in.
Year(s) Of Engagement Activity 2022
URL https://www.earlham.ac.uk/news/earlham-institute-economic-impact
 
Description ELIXIR all hands 2020 - poster: the CyVerse UK cyberinfrastructure offer to support open science and FAIRness 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact We presented a poster highlighting how the CyVerse UK infrastructure, through the services it hosts (COPO, grassroots and Galaxy in particular), helps to improve the FAIRness of the data landscape. We also showed how the data is shared with collaborators through federation of iRODS instances.
Year(s) Of Engagement Activity 2020
URL https://f1000research.com/posters/9-525
 
Description ELIXIR all hands 2021 participation 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Members of ELIXIR from around Europe attended the All Hands to discuss progress in different working groups, data FAIRness, policies, etc.
Year(s) Of Engagement Activity 2021
URL https://elixir-europe.org/events/elixir-all-hands-2021
 
Description ELIXIR-UK ALL-HANDS MEETING 2017 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The ELIXIR-UK All Hands Meeting provided updates on recent activities from the ELIXIR UK Node and ELIXIR Hub, alongside discussions of future resources, events and roadmapping breakouts. Dr Davey presented the COPO project and CyVerse UK infrastructure as UK-specific resources that were being developed as national infrastructure for UK researchers. There was much interest from the participants in both projects, and conversations at this event led to the submission of a BBSRC TRDF with Gos Micklem (Cambridge), Dr Davey and Dr Shaw (EI).
Year(s) Of Engagement Activity 2017
URL https://www.elixir-europe.org/events/elixir-uk-all-hands-meeting-2017
 
Description Eagle Genomics Meeting 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact This was the inaugural meeting between EI and Eagle Genomics to discuss potential collaborations going forward.
Year(s) Of Engagement Activity 2023
 
Description Earlham Institute at the Royal Norfolk Show 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Using our work in robotics and artificial intelligence as a launch pad, we were able to engage with a large swathe of the general public in the Innovation Zone at the Royal Norfolk Show.

The aim: to promote awareness of the important work that Earlham Institute does across all of its CSPs and NCs, particularly emphasising our computational, data-driven approach.

The outcome: hundreds of members of the public, as well as certain interested stakeholders, engaged in the work that we do. The setting was perfect to have a dialogue with particular groups of interest, including breeders and farmers, while opening the discussion about the role of genomics research and associated projects in influencing and advancing UK agriculture now and in the future.
Year(s) Of Engagement Activity 2018
 
Description Earlham Institute takes part in ambitious 'Tree of Knowledge' to map the genomes of all life in UK - 08/11/2019 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Website article highlighting that Earlham Institute (EI) has been awarded over 700k from the Wellcome Trust for the Darwin Tree of Life Project, which aims to sequence the genomes of all UK biota as part of the wider Earth Biogenome Project (EBP). The study seeks to understand all living species in order to help preserve our planet's rich biodiversity and discover new biomaterials for pharmaceuticals. Some of the knowledge and expertise needed to deliver the project is based on CSP-led advances. This story generated an estimated 1.74M coverage views and 1.28K social shares.
Year(s) Of Engagement Activity 2019
URL http://www.earlham.ac.uk/newsroom/earlham-institute-tree-knowledge-genomics-biota
 
Description Elixir UK all hands - Norwich 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Engaged in networking group discussions
Year(s) Of Engagement Activity 2023
 
Description Elixir all-hands Dublin 2023 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presented a poster and had discussions about the development of RO-Crate, which led to several hackathons.
Year(s) Of Engagement Activity 2023
 
Description Emma's Antarctic Diary - 15/04/2019 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Website blog article highlighting the work of Emma Langan, a PhD student in EI's Leggett group who has been on a six-week expedition to Antarctica to analyse the effects of climate change on coccolithophores with real-time DNA sequencing algorithms. News stories highlight important updates that also have broad relevance and interest to the national and/or specialised media and showcase the work of our staff
Year(s) Of Engagement Activity 2019
URL http://www.earlham.ac.uk/articles/emma-langan-antarctic-adventure-algae-climate-change
 
Description Engagement with General Public: Royal Norfolk Show 2022 - Will Nash 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Engagement activities with the general public as part of the Royal Norfolk Show 2022, demonstrating our activities focusing on the application of new technologies to characterise biodiversity
Year(s) Of Engagement Activity 2022
 
Description European consortium launched to reverse biodiversity loss through genomics research 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Article about a new collaborative project for biodiversity genomics, jointly written with other European partners. The primary objectives were to promote the initiative and raise awareness of our involvement and expertise in data science and genomics, both of which were reflected in the development of the article and in online engagement.
Year(s) Of Engagement Activity 2022
URL https://www.earlham.ac.uk/news/european-consortium-launched-reverse-biodiversity-loss-through-genomi...
 
Description Event - DNA Detectives 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Activity at the Eden Project
Year(s) Of Engagement Activity 2019
 
Description Evolutionary History and Domestication of the Ferret (Mustela putorius furo). Genome Science 2016 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The domestic ferret (Mustela putorius furo) models biological processes that are highly relevant to human disease and health research such as influenza, cystic fibrosis and asthma. The recently completed draft of the ferret genome and associated annotation project (Peng et al 2014) allows us to extend research and examine the domestication of the ferret. Most domestic animals were domesticated ~10,000 years ago, but the history of the ferret's domestication is uncertain. It is likely that ferrets have been domesticated for at least 2,000 years, a similar length of time to the rabbit, which the ferret was domesticated to hunt. We have sequenced the genomes of a further eight domestic ferrets along with 12 European Polecats (M. putorius), 2 Steppe Polecats (M. eversmanii) and 4 Black-footed Ferrets (M. nigripes). In order to examine the genetics underpinning ferret domestication and compare the genomes of the domestic ferret and its wild ancestor, we first need to identify the ancestral species of the domestic ferret, which is not yet fully resolved.
Year(s) Of Engagement Activity 2016
 
Description Extreme environments: genome sequencing & space 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact EI website article on our fieldwork in Iceland
Year(s) Of Engagement Activity 2018
URL http://www.earlham.ac.uk/articles/extreme-environments-genome-sequencing-space
 
Description Focus on the future at EI Innovate 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Write-up of the EI Innovate event held at the Earlham Institute in November 2022, designed to provide an overview of the topics discussed for those who were unable to attend or who may want to follow up with speakers after attending. The piece received some positive engagement and comments on social media and will be a valuable tool for encouraging registration for the 2023 event.
Year(s) Of Engagement Activity 2022
URL https://www.earlham.ac.uk/articles/focus-future-ei-innovate
 
Description Genetic Diversity of the Mustelidae: Implications for understanding Evolution, Domestication and Conservation. Genome 10K 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The ferret (Mustela putorius furo) models biological processes that are highly relevant to human disease and health research such as influenza, cystic fibrosis (CF) and asthma. Since CF is a multi-organ disorder, research using CF ferrets spans a range of organs, so the impact of research in this model is broad and allows for the analysis of early stages of the disease.

The recently completed draft of the ferret genome has allowed us to look more closely into ferret domestication and examine the underpinning processes of domestication from the ancestral European Polecat (M. putorius). We have sequenced eight domestic ferret genomes and will sequence samples from European Polecats. Beneficial genetic variants increase in frequency due to positive selection, together with linked neutral variants, resulting in genomic islands of reduced heterozygosity between populations. By examining fixation indices and heterozygosity between wild and domestic ferrets we can find genomic regions that have undergone selective sweeps.

Using whole genome sequencing also allows us to leverage the ferret genome sequence to look into the genetic diversity and evolution of other species of wild Mustelidae. The current population of the endangered Black-footed Ferret (M. nigripes) from North America stems from only seven individuals. Using whole genome sequencing of Black-footed Ferrets from both before and after the population crash we will identify genetic diversity not present in the current population with a view to reintroducing it via genome editing technology.

About two million years ago, the ancestral species of modern-day Steppe Polecats (M. eversmanii) from Asia entered North America via the Bering Land Bridge and speciated into what is now the Black-footed Ferret. Using a reference-free, unbiased k-mer approach, we will examine how closely related the Steppe Polecat is to both the Black-footed Ferret and its sister species, the European Polecat.
Year(s) Of Engagement Activity 2015
 
Description Genome-resolved metagenomics bioinformatics course, 31 Oct - 4 Nov 2022, EBI (Virtual) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Acted as trainer, teaching on long-read metagenomics, including on our MARTi software.
Year(s) Of Engagement Activity 2022
URL https://www.ebi.ac.uk/training/events/metagenomics-bioinformatics-2022
 
Description George Freeman MP, Minister for Transport visit - 20/09/2019 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Policymakers/politicians
Results and Impact In September 2019, George Freeman MP, Transport Minister, visited EI and addressed NRP staff. EI and CSP researchers lobbied him on our priorities and policy asks, and staff were able to question him on the Government's Brexit policy as well as its wider science strategy.
Year(s) Of Engagement Activity 2019
URL https://www.georgefreeman.co.uk/content/earlham-institute-visit
 
Description Gordon Research Conference in Single Cell Genomics 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presented poster, lightning talk and networked. Discovered collaborators with whom I am currently preparing a grant.
Year(s) Of Engagement Activity 2023
 
Description Host-microbe interactions in oral cavity Conference: Unilever Microbiology Symposium Location: Unilever R&D Colworth Science Park, Bedfordshire Date: 4-5 July, 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Industry/Business
Results and Impact Host-microbe interactions in oral cavity Conference: Unilever Microbiology Symposium Location: Unilever R&D Colworth Science Park, Bedfordshire Date: 4-5 July,
Year(s) Of Engagement Activity 2019
 
Description I'm a scientist get me out of here, Darwin Tree of Life 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Schools
Results and Impact A class of pupils came together in an online chat to ask scientists and other professionals involved in the Darwin Tree of Life project questions about science, careers and anything else they were curious about.
Year(s) Of Engagement Activity 2021
 
Description Image Processing Carpentries 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Postgraduate students
Results and Impact Delivered training
Year(s) Of Engagement Activity 2023
 
Description Image Processing Carpentry 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Delivery of pilot material for image processing course.
Year(s) Of Engagement Activity 2022
 
Description Improving the Koala genome using the PDABS pipeline AGBT 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Advances in next-generation sequencing (NGS) technologies and the subsequent reduction in cost have enabled the research community to sequence the genomes of many non-model organisms. Genome assemblies using next generation technologies such as Illumina show high quality nucleotide level information, but are fragmented due to the inability to retain contiguity. The absence of low-cost methods for creating high quality genome sequences from a wide range of organisms currently hampers the generation of de novo genomes for comparative genomic studies, and new methodologies and data types need to be tested to achieve this goal. We have integrated NGS with nano-channel genome mapping and developed the PCR-free Discovar Assembly BioNano Scaffolding (PDABS) pipeline to assemble cheap, contiguous and robust genomes.
Year(s) Of Engagement Activity 2016
 
Description Inside EI - 21/05/2019 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Inside EI is an open day when the public are invited to learn more about the work at the Earlham Institute, meet our team and engage with our research and understand its relevance and impact. There were talks and posters based on CSP-led research.
Year(s) Of Engagement Activity 2019
URL https://earlhaminstitute.coveragebook.com/b/0ee62ba7
 
Description Inside EI: Public engagement with impact - 06/08/2019 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Website highlighting the Inside EI event for the public to explore our fascinating and diverse range of important scientific research projects including talks and posters on our CSP work and the importance and impact of public engagement in research
Year(s) Of Engagement Activity 2019
URL http://www.earlham.ac.uk/articles/inside-ei-public-engagement-science-impact
 
Description Integrating omic-based technologies for the valorisation of Peruvian crop biodiversity 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This three-day workshop aimed to understand the potential application of high-throughput "omics technologies" for the characterization and valuation of the genetic biodiversity of Peruvian crops in a more holistic and integrated approach.
The workshop convened leading UK and Peruvian researchers and early-career researchers, with the aim of outlining the best strategies to integrate omic approaches in the research of Peruvian crop biodiversity. This contributed to developing resilience in local agriculture in the context of climate change and the current demand for healthier foods and natural compounds.
Developed in collaboration with Universidad Catolica de Santa María, this workshop was funded by the Newton-Paulet fund under the program "Researcher Links - Workshop Grants - Talleres 2018-01". This grant is the result of an agreement between the British Council and the Consejo Nacional de Ciencia Tecnologia e Innovacion (CONCYTEC) from Peru.
Year(s) Of Engagement Activity 2019
 
Description Invasive species threat to native tilapia biodiversity 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Invasive species threat to native tilapia biodiversity
Year(s) Of Engagement Activity 2020
URL https://www.earlham.ac.uk/newsroom/invasive-species-threat-newly-discovered-native-tilapia-biodivers...
 
Description Invited talk - Automating sample preparation for nanopore real-time sequencing applications (Miroculus Science Simplified, 24 Mar 2022) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Richard Leggett and Darren Heavens spoke at the Miroculus Science Simplified virtual symposium.
Year(s) Of Engagement Activity 2022
URL https://miroculus.com/videos/nanopore-real-time-sequencing-applications/
 
Description Invited talk - Improving nanopore sequencing outputs using the femto pulse (Agilent User Group) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Darren Heavens was asked to present at the Agilent UK Genomics user group meeting in London.
Year(s) Of Engagement Activity 2020
URL https://twitter.com/EarlhamInst/status/1227529033539313664
 
Description Invited talk - In situ sequencing, automated sample prep, and real-time analysis using MARTi and Miro (Extreme Microbiome Group, 3 May 2022) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Richard Leggett invited to present at the Extreme Microbiome Group virtual meeting, 3rd May 2022.
Year(s) Of Engagement Activity 2022
 
Description Invited talk - Real-time gut microbiome diagnostics using nanopore sequencing (Cambridge Nanopore Day, 22 Nov 2022) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Richard Leggett invited speaker at Oxford Nanopore Technologies' Nanopore Day Cambridge, CRUK Cambridge Institute, 22 November 2022
Year(s) Of Engagement Activity 2022
URL https://nanoporetech.com/nanopore-day-cambridge
 
Description Invited talk - Real-time sequencing and analysis of microbial communities with nanopores 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Research day on microbe research attended by approx. 200.
Year(s) Of Engagement Activity 2019
URL http://microbesinnorwich.org/
 
Description Invited talk - Unbiased detection of airborne pathogens with Air-seq (Agri-TechE Farmer First Innovation Group, 18 Oct 2022) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Industry/Business
Results and Impact Invited to present Air-seq work to the Agri-TechE Farmer First Innovation Group
Year(s) Of Engagement Activity 2022
 
Description JRS Biodiversity R workshop, Nairobi 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Dr Davey travelled to Nairobi, Kenya, as part of a JRS Biodiversity funded programme to teach a day R workshop to 23 representatives from the African Conservation Centre, Kenya Wetlands Biodiversity Research Centre, National Museums of Kenya, University of Nairobi, and the Jomo Kenyatta University.
Year(s) Of Engagement Activity 2019
 
Description Keynote lecture: Assembling complex crop genomes for comparative analyses 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Bioinformatics for Plant Biology - EBI Cambridge, 6-9 November
Year(s) Of Engagement Activity 2018
URL https://www.ebi.ac.uk/training/events/2018/bioinformatics-plant-biology
 
Description Laying the Foundations; Why are Semantics in Agriculture Difficult? - PAG 2020 talk in Plant Phenotypes workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Dr Davey gave an invited talk to approx 90 attendees at the PAG 2020 workshop "Plant Phenotypes"
Year(s) Of Engagement Activity 2020
 
Description Lectures to third year UEA undergraduates on next generation sequencing and the human genome project (2022, 2023) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Undergraduate students
Results and Impact Lectures to third year UEA undergraduates on next generation sequencing and the human genome project
2 lectures in 2022, 2 in 2023
Year(s) Of Engagement Activity 2021
 
Description Leszek Wysocki (Department for International Trade ) 10/06/2019 - visit to EI 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Policymakers/politicians
Results and Impact Leszek Wysocki (Department for International Trade) visited EI and met with CSP researchers to discuss their work in relation to International Trade
Year(s) Of Engagement Activity 2019
 
Description Lettuce Have It! - 10/06/2019 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Website article highlighting real-life applications for artificial intelligence based techniques, developed at EI and through the CSP, to improve efficiency and precision on the farm. Linked to the paper "Combining computer vision and deep learning to enable ultra-scale aerial phenotyping and precision agriculture: A case study of lettuce production", published in Horticulture Research - Nature. News stories highlight important updates that also have broad relevance and interest to the national and/or specialised media; the article has estimated coverage views of 111K.
Year(s) Of Engagement Activity 2019
URL http://www.earlham.ac.uk/newsroom/lettuce-have-it-machine-learning-cr-optimisation
 
Description Meeting with Brown & Co, Agri-Business Consultancy Department 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Industry/Business
Results and Impact Charles Whitaker came to EI in September 2019 at the invitation of Saskia Harvey to discuss GROW Colombia links (they have clients in Colombia). During that meeting we identified a number of areas of interest outside Colombia that Brown & Co wished to explore further.
Year(s) Of Engagement Activity 2020
 
Description Monthly Sample Working Group Meetings 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I am a member of the Darwin Tree of Life and European Reference Genome Atlas's sample working groups. These groups are where the metadata requirements of the ToL projects are discussed and disseminated. I have been able to influence metadata collection practices in these work groups.
Year(s) Of Engagement Activity 2021,2022,2023
 
Description NanoOK of the North and South 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact EI website article about our work developing the nanopore analysis software NanoOK
Year(s) Of Engagement Activity 2019
URL http://www.earlham.ac.uk/articles/nanook-north-south-life-in-antarctic
 
Description Network biology approaches to understanding life - 29/04/2019 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Website article highlighting the work of a PhD student in the Korcsmaros group on how systems biology approaches can help us to understand the interactions between living systems, and how the information we can glean is important for improving human health and better understanding the evolution of host-specificity of bacteria, among many other things. This article also provides insight into carrying out a PhD at EI and working in the field of systems biology and Salmonella research.
Year(s) Of Engagement Activity 2019
URL http://www.earlham.ac.uk/articles/network-biology-approaches-to-understanding-life
 
Description New bioinformatics tool spots hybrid fish that threaten the survival of natural tilapia populations in aquaculture 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact Newsletter describing the development of a cost-effective genotyping platform enabling tilapia species discrimination and species identification
Year(s) Of Engagement Activity 2021
URL https://www.earlham.ac.uk/newsroom/new-bioinformatics-tool-spots-hybrid-fish-threaten-survival-natur...
 
Description New method and model address blindspot towards uncommon species in mixed samples 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Website article about paper
Year(s) Of Engagement Activity 2022
URL https://www.earlham.ac.uk/newsroom/new-method-and-model-address-blindspot-towards-uncommon-species-m...
 
Description New rapid test diagnoses pneumonia and lower respiratory tract infections - 24/06/2019 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Website article highlighting research carried out by Quadram Institute, University of East Anglia (UEA) and Earlham Institute (EI) to develop a new, rapid way of diagnosing bacterial lower respiratory tract infections in hours rather than days that could improve patient care and slow the spread of antimicrobial resistance. The MinION technology was applied using software developed at EI as part of the CSP. Article showcased collaborative work between the research institutes and NHS. News stories highlight important updates that also have broad relevance and interest to the national and/or specialised media.
Year(s) Of Engagement Activity 2019
URL https://www.earlham.ac.uk/newsroom/new-rapid-test-diagnoses-pneumonia-and-lower-respiratory-infectio...
 
Description Newsletter - Developing a frugal and medium throughput method for assessing protein-DNA binding affinity 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Newsletter for Open Plant
Year(s) Of Engagement Activity 2019
URL https://www.openplant.org/blog/2019/4/4/developing-a-frugal-and-medium-throughput-method-for-assessi...
 
Description Norwich Research Park Day at the Norwich Science Festival 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact This, a celebration of science from across the NRP, was a fantastic chance for us to showcase the work across our collective CSPs and NCs.

All in all, we had dozens of staff involved from across the institute who helped us to deliver a set of multiple activities, workshops and talks on the day. Activities included a live LEGO sequencer with BLAST analysis of genomes and a live DNA sequencing experiment. Additionally, there were talks on the Institute and how it applies big data and computational approaches to understanding life on earth, the interactions underpinning ecosystems and communities, as well as how we can better understand evolution to drive trait improvement.

The event was a spectacular success, with desired outcomes being a greater awareness of the important research undertaken by EI, as well as the value of applying computational approaches to tackling a swathe of biological questions. The specific feedback showed that this event had definitely increased understanding among participants, who numbered more than 8000, of the role that EI and its core programmes play at the cutting edge of science.

Social media coverage ensured that the activities we put on reached many thousands more people than just those who attended the event on the day.
Year(s) Of Engagement Activity 2018
URL https://www.youtube.com/watch?v=Uf0h8Q8PVxI
 
Description Norwich Science Festival - 22/10/2019 to 24/10/2019 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Norwich Science Festival is a wonderful showcase of science from Norwich and beyond for learning and excellent public engagement. This year, EI presented 11 talks and poems covering a fascinating range of science topics, from Salmonella and guts through to sequencing algae in the Antarctic, covering the whole tree of life in the process, bringing an innovative Bee Trail, LEGO sequencer, robots and more. Researchers from the CSP participated, presented their work and engaged with the public. Coverage on TV, radio and media with estimated coverage views of 698K
Year(s) Of Engagement Activity 2019
URL https://earlhaminstitute.coveragebook.com/b/02819684cc7da5de
 
Description Norwich Science Festival - Lego DNA sequencer 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact We created a working Lego Mindstorms model of a DNA sequencer, along with software to run it. We arranged activities around it during the Norwich Science Festival and engaged lots of adults and children.
Year(s) Of Engagement Activity 2018
URL https://twitter.com/brickopore/status/1055043270999465985
 
Description Norwich Science Festival - The Nedome 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact During Norwich Science Festival, we ran live demonstrations of nanopore sequencing in which we sequenced the "Ned-ome" (DNA kindly provided by Ned, a PhD student).
Year(s) Of Engagement Activity 2018
URL https://twitter.com/hashtag/nedome?lang=en
 
Description Norwich Science Festival Ops Note 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact News stories highlight important updates that also have broad relevance and interest to the national and/or specialised media. Collectively, our international reach as of March 2019 has extended across the globe, with highlight pieces in the Guardian, the BBC World Service, the Washington Post and more, as well as local TV and radio. In 2018, the estimated readership of news stories we shared was well in excess of 1 million.
Year(s) Of Engagement Activity 2018
URL http://www.earlham.ac.uk/newsroom/first-ei-lego-sequencer-human-dna-and-endangered-species-earlham-i...
 
Description Norwich Science Park Open Day 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Presented poster on work linked to Big Data Bioinformatics
Year(s) Of Engagement Activity 2016
 
Description Oceans Day at Norwich Science Festival 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Tarang Mehta gave a well-received talk on how he delves into cichlid fish diversity for the Norwich Science Festival.

He had given a similar talk the year before, and this was an update due to its roaring success the first time around.

The talk helped to promote an awareness of the importance of understanding the evolution of interesting groups of organisms such as cichlid fish, and how this can be applied to a number of interesting research projects, from better understanding human disease through to improving aquaculture practice in East Africa.
Year(s) Of Engagement Activity 2018
 
Description Open source data management: Sherlock open for all - 14/06/2019 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Website article on customisable method of data management for smaller research groups who don't have the resources to develop a modern, big data solution using software developed at EI and made open-source.
Year(s) Of Engagement Activity 2019
URL http://www.earlham.ac.uk/articles/sherlock-open-source-data-management
 
Description Oral Presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Oral presentation as part of the annual PopGroup meeting
Year(s) Of Engagement Activity 2021
 
Description Oral Presentation - Characterization of noncoding expression and splicing diversity in the human brain 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Seminar at the Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
Year(s) Of Engagement Activity 2019
 
Description Oral Presentation: Aequatus.js: a plugin to visualise gene trees in Galaxy 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact More than 300 people attended the conference, and the talk was followed by a good discussion about the way the plugin is integrated with Galaxy.
Year(s) Of Engagement Activity 2019
URL https://gcc2019.sched.com/
 
Description Oral presentation - Non-human genomes, why bother? 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Schools
Results and Impact Oral presentation at the EI Open Day
Year(s) Of Engagement Activity 2019
 
Description Oral presentation - Building gene Families and assessing gene structure 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Presentation at the Genome Annotation Workshop 2021
Year(s) Of Engagement Activity 2021
 
Description Oral presentation - Characterization of splicing diversity and gene fusions through Nanopore sequencing 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Oral presentation at the Long Read Sequencing Meeting, SciLifeLab Uppsala, Sweden
Year(s) Of Engagement Activity 2019
 
Description Oral presentation - Characterization of splicing diversity in bulk samples and single cells using long read technologies 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Seminar at the Maurice Wohl Clinical Neuroscience Institute King's College London
Year(s) Of Engagement Activity 2019
 
Description Oral presentation - Detection of differential isoform expression and usage during cellular differentiation using long read RNA sequencing 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation at the London Calling 2021 meeting
Year(s) Of Engagement Activity 2021
 
Description Oral presentation - Enter the Nanoporium 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact Oral presentation as part of the Pint of Science
Year(s) Of Engagement Activity 2019
 
Description Oral presentation - How well do you know your family tree? 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Oral presentation at the Norwich Science Festival
Year(s) Of Engagement Activity 2019
 
Description Oral presentation - Taking the long (read) view of alternative splicing during cell differentiation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Presentation at the EI Long Read Symposium, Earlham Institute (virtual) 15 - 17th June 2021
Year(s) Of Engagement Activity 2021
 
Description Oral presentation - What Can We Learn from a High Koala-ty Genome? 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Schools
Results and Impact Oral presentation as part of the EI Open Days
Year(s) Of Engagement Activity 2019
 
Description Organiser of Challenges and Opportunities in Plant Science Data Management PAG workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Co-organised the Challenges and Opportunities in Plant Science Data Management PAG workshop, at which 6 international speakers delivered presentations on various aspects of data management in the plant sciences. Approximately 50 attendees.
Year(s) Of Engagement Activity 2020
 
Description Organoid transcriptomics to study cell regulation Internal research presentation at the Earlham Institute 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Organoid transcriptomics to study cell regulation Internal research presentation at the Earlham Institute
Year(s) Of Engagement Activity 2019
 
Description PCA Core Network Participant Meeting (CPN) 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Made connections and planned for future grant submission.
Year(s) Of Engagement Activity 2023
 
Description PRESS RELEASE: How to escape big data and cancer with data integration and broccoli 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact How to escape big data and cancer with data integration and broccoli
Year(s) Of Engagement Activity 2015
 
Description PRESS RELEASE: NGS expert leads world-class Institute's cutting-edge genome analysis 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact NGS expert leads world-class Institute's cutting-edge genome analysis
Year(s) Of Engagement Activity 2015
 
Description PRESS RELEASE: NanoOK: Quality Control for portable, rapid, low-cost DNA sequencing 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact NanoOK: Quality Control for portable, rapid, low-cost DNA sequencing
Year(s) Of Engagement Activity 2015
 
Description PRESS RELEASE: New big data pushes bioinformatics training in life sciences 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact New big data pushes bioinformatics training in life sciences
Year(s) Of Engagement Activity 2015
 
Description PRESS RELEASE: Novel online bioinformatics tool significantly reduces time of multiple genome analysis 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact Novel online bioinformatics tool significantly reduces time of multiple genome analysis
Year(s) Of Engagement Activity 2015
 
Description PRESS RELEASE: TGAC's take on the first portable DNA sequencing 'laboratory' 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact TGAC's take on the first portable DNA sequencing 'laboratory'
Year(s) Of Engagement Activity 2015
 
Description Pint of Science 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact Will Nash gave a well-received talk about his work understanding the koala genome, and what we can learn from understanding it - particularly how koalas can somehow withstand a toxic diet of eucalyptus.

Pint of Science is a great occasion to share information with the public in an informal setting that stimulates and encourages debate and discussion. It therefore creates an opportunity for dialogue, which is a valuable outcome for any public engagement event.
Year(s) Of Engagement Activity 2018
 
Description Plant Health Week 2022 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact As part of Plant Health Week, EI volunteers took part in a Norwich Research Park open day where we ran a stand showcasing air sequencing technology developed at EI. The objective was to showcase local science and its impact, invite the public to ask questions, and hopefully to inspire them - perhaps to take a keener interest or, for children, to consider a STEM career. We saw professionals attend our stand and ask about potential collaborations, as well as receiving positive feedback about how our technology could have an impact in a range of applications.
Year(s) Of Engagement Activity 2022
URL https://www.tsl.ac.uk/news/celebrating-plants-of-the-future-with-norfolk-locals
 
Description Pole to Pole: breaking point for ocean's microbial biodiversity most likely to affect UK 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Article linked to a publication
Year(s) Of Engagement Activity 2021
URL https://www.earlham.ac.uk/newsroom/pole-pole-breaking-point-oceans-microbial-biodiversity-most-likel...
 
Description Poster Presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation as part of the Biodiversity Genomics conference
Year(s) Of Engagement Activity 2020
 
Description Poster Presentation - 200 Genomes EI Open Days 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Schools
Results and Impact Poster Presentation EI Open Days
Year(s) Of Engagement Activity 2019
 
Description Poster Presentation - Analysing intestinal organoids in a multi-omics, systems biology framework to investigate gut health and host-microbe interactions 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Presentation in front of the Earlham Institute Scientific Advisory Board
Year(s) Of Engagement Activity 2018
 
Description Poster Presentation - Brain Isoforms EI Open Days 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Schools
Results and Impact Poster Presentation as part of EI Open Days
Year(s) Of Engagement Activity 2019
 
Description Poster Presentation: Aequatus.js: a plugin to visualise gene trees in Galaxy 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presented a poster in addition to an oral presentation at GCC 2019; it received good feedback from the audience.
Year(s) Of Engagement Activity 2019
URL https://gcc2019.sched.com/
 
Description Poster presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation as part of the Biodiversity Genomics Conference
Year(s) Of Engagement Activity 2020
 
Description Poster presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at the annual PopGroup meeting
Year(s) Of Engagement Activity 2021
 
Description Poster presentation - Evolution of gene regulatory networks controlling traits under natural selection in cichlids 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at the Evolutionary Systems Biology Conference, Wellcome Trust Sanger Centre. The audience consisted of group leaders, university professors, university lecturers, postdoctoral researchers, and graduate and undergraduate students.
Year(s) Of Engagement Activity 2018
 
Description Poster presentation - Characterising and visualising gene families with GeneSeqToFamily and Aequatus 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at Genome Informatics 2018, Cambridge, UK. The audience consisted of group leaders, university professors, university lecturers, postdoctoral researchers, and graduate and undergraduate students.
Year(s) Of Engagement Activity 2018
 
Description Poster presentation - Examining genetic diversity and traits under selection in several aquaculture-relevant tilapia species 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Poster presentation at the Genome Science Conference, Nottingham, UK. The audience consisted of group leaders, university professors, university lecturers, postdoctoral researchers, and graduate and undergraduate students.
Year(s) Of Engagement Activity 2018
 
Description Poster presentation - Gene regulatory network evolution in East African lake cichlids 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at the Evolution 2018 conference, Montpellier. The audience consisted of group leaders, university professors, university lecturers, postdoctoral researchers, and graduate and undergraduate students.
Year(s) Of Engagement Activity 2018
 
Description Poster presentation - Identifying signalling pathways regulating antimicrobial peptide production in the gut using network biology and organoid transcriptomics 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at the Modularity of signaling proteins and networks conference, Seefeld, Austria. The audience consisted of group leaders, university professors, university lecturers, postdoctoral researchers, and graduate and undergraduate students.
Year(s) Of Engagement Activity 2018
 
Description Poster presentation - Networks to catch the difference: Construction and analysis of Regulatory Networks applied to East African Lake Cichlids 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at the Evolutionary Systems Biology conference, Wellcome Trust Sanger Centre. The audience consisted of group leaders, university professors, university lecturers, postdoctoral researchers, and graduate and undergraduate students.
Year(s) Of Engagement Activity 2018
 
Description Poster presentation - The evolution of mammalian microRNAs 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at the Biology of Genomes meeting, Cold Spring Harbor Laboratory. The audience consisted of group leaders, university professors, university lecturers, postdoctoral researchers, and graduate and undergraduate students.
Year(s) Of Engagement Activity 2018
 
Description Poster presentation - Aequatus.js: a plugin to visualise gene trees in Galaxy 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at the Galaxy Community Conference (GCC2019) - Freiburg, Germany
Year(s) Of Engagement Activity 2019
 
Description Poster presentation - Characterising and visualising gene families within Galaxy using GeneSeqToFamily and Aequatus 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at the SMBE Annual Meeting 2019, Manchester, UK 21-25 July 2019
Year(s) Of Engagement Activity 2019
 
Description Poster presentation - Gene regulatory network evolution in East African lake cichlids 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Poster presentation as part of PopGroup 52, Oxford. The audience included group leaders, university professors, university lecturers, postdoctoral researchers, and graduate and undergraduate students.
Year(s) Of Engagement Activity 2019
 
Description Poster presentation - Investigating regulation of intestinal function by Bifidobacteria using network biology and organoid approaches 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Poster presentation at the Microbes in Norwich workshop. The audience consisted of group leaders, university professors, university lecturers, postdoctoral researchers, and graduate and undergraduate students.
Year(s) Of Engagement Activity 2019
 
Description Poster presentation - Nanopore sequencing reveals the transcriptional complexity of neuropsychiatric disease genes in human brain 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at the Biology of Genomes meeting, Cold Spring Harbor Laboratory. The audience consisted of group leaders, university professors, university lecturers, postdoctoral researchers, and graduate and undergraduate students.
Year(s) Of Engagement Activity 2018
 
Description Poster presentation - Real-time metagenomic analysis with MARTi (AGBT 2021) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at AGBT 2021 conference.
Year(s) Of Engagement Activity 2021
URL https://www.agbt.org
 
Description Poster presentation - Real-time surveillance and diagnostics with NanoOK RT 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at the London Calling conference, May 2018
Year(s) Of Engagement Activity 2018
 
Description Poster presentation - The long and the short of eukaryotic metagenomics: Identification and quantification of plant species in bee pollen 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners