EBI Metagenomics Portal - Towards a better understanding of community metabolism

Lead Research Organisation: European Bioinformatics Institute
Department Name: Sequence Database Group

Abstract

Metagenomics is a rapidly expanding field, where modern sequencing technologies are applied to the DNA isolated from an environment, such as soil, seawater or animal gut. This allows us to analyse the DNA from the collection of microorganisms that inhabit the environment, many of which may not have previously been sequenced due to their fastidious nature, making it difficult to culture them in the laboratory. Substantial amounts of sequencing data are produced, which typically only reflect a small fraction of the total DNA. However, after analysis, this DNA provides clues to the taxonomic diversity and functional potential that is found in that environment.

Metagenomics is an exemplar of what is referred to as data-driven biology. The cost of DNA sequencing has fallen rapidly, such that most research groups have access to affordable institutional sequencing facilities. We now run the risk of each research group producing its own metagenomics analysis platform and storage repository. To help prevent such a fragmented situation, and to help reduce overall costs (both computational and staff), we have developed and implemented a centralised pipeline to provide a metagenomics data analysis and archiving platform.

The EBI Metagenomics Portal (EBI-MP) has been developed as a collaborative UK effort at the EMBL-European Bioinformatics Institute (EMBL-EBI). The portal was launched in 2011 and has increasingly become established as a world leader in metagenomic analysis. Users submit their high-throughput metagenomic nucleotide sequence data, accompanied by contextual data describing their samples and experiments in a controlled and consistent manner. Once analysed by the portal, results including taxonomic and functional annotations can be visualised alongside project and sample descriptions, and can be downloaded for further analysis.

In order to keep up with community needs, and to help cement the UK's position as a leader in the field of metagenomics research, the proposed project aims to further develop the portal. This will include the addition of analysis provenance, to ensure that analyses can be re-run in the future as new and updated tools and algorithms become available, without overwriting existing analysed data, which may form the basis of an existing publication. Further enhancements include extending the taxonomic and functional analyses to achieve a better picture of the microbial communities sampled, their composition and biological functions. The portal will also be developed to allow more sophisticated searches of its data, so that users can discover samples or environments based on the kind of microbe or protein function found there. We will also add the ability to perform statistically rigorous cross-sample comparisons, which will allow analysis results from different samples to be compared in a scientifically meaningful way, and provide visualisation tools for such comparisons. With modern sequencing technologies producing ever more data, better data compression is essential to speed up data transfer both into EBI-MP and internally within the resource. To this end, we will implement industry-standard compression data structures that have been developed at EMBL-EBI.

Technical Summary

EBI-MP is a global portal for the metagenomics research community. Offering data submission, archiving and sharing functions, community standards-compliant curation, and functional and taxonomic diversity analyses, the service has attracted a growing user-base of UK, European and global researchers.

We intend to improve the pipeline infrastructure to offer analysis provenance by modularising pipeline components, defining a dependency tree between modules, and module versioning. Subsequently, we will perform updates to reference databases and analysis software, and make results reanalysis with the updated pipeline actionable for our users. We will improve the range of taxonomic annotations provided by the resource, moving beyond 16S rRNA-based analyses. We will also investigate the application of the UniPept approach to taxonomic classification for metagenomic datasets. We will add pathway information to the functional annotation provided by EBI-MP, using the latest version of InterProScan to provide KEGG, MetaCyc and UniPathway links, and develop a tool to visualise the catalytic potential of a sample, highlighting reactions where there is support for the existence of constituent proteins.
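
As a rough illustration of the provenance idea described above (modular components, a dependency tree between them, and per-module versions), the following is a minimal sketch only. The module names, version strings and helper functions are hypothetical and are not taken from the EBI-MP code base; it assumes Python 3.9 or later for the standard-library graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline modules, versions and dependencies (illustration only).
MODULES = {
    "quality_control": {"version": "1.2.0", "depends_on": []},
    "rrna_detection": {"version": "2.0.1", "depends_on": ["quality_control"]},
    "taxonomic_assignment": {"version": "1.0.0", "depends_on": ["rrna_detection"]},
    "functional_annotation": {"version": "3.1.0", "depends_on": ["quality_control"]},
    "summary": {
        "version": "0.9.0",
        "depends_on": ["taxonomic_assignment", "functional_annotation"],
    },
}


def execution_order(modules):
    """Return module names so that every module follows its dependencies."""
    graph = {name: set(spec["depends_on"]) for name, spec in modules.items()}
    return list(TopologicalSorter(graph).static_order())


def provenance_record(modules):
    """List (module, version) pairs in execution order for one analysis."""
    return [(name, modules[name]["version"]) for name in execution_order(modules)]


def needs_rerun(modules, changed):
    """Return the changed module plus every module downstream of it."""
    affected = {changed}
    # Walking in dependency order means upstream modules are classified first.
    for name in execution_order(modules):
        if affected & set(modules[name]["depends_on"]):
            affected.add(name)
    return affected


if __name__ == "__main__":
    print(provenance_record(MODULES))
    print(sorted(needs_rerun(MODULES, "rrna_detection")))
```

Storing a record like the one returned by provenance_record alongside each result set is one way an older analysis could be kept intact while a newer pipeline version is applied to the same data.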

We will implement CRAM compressed sequence data formats within the system to increase the speed of upload of data to EBI-MP and to facilitate internal processing and storage. We will also design and build data discovery tools that provide a full range of search functions across the sample, contextual and analysis data, and provide these tools as web services and via the website. Finally, we will develop mathematically sound methods to estimate depth of sequencing required to capture a specific fraction of diversity, and to normalise samples so that they can be compared in statistically meaningful ways. These analyses will be provided from the website, along with visualisation tools capable of producing heatmaps and PCA plots for sample comparison.

Planned Impact

The use of metagenomics is widespread, with its application in such diverse fields as agriculture, food manufacture, the elucidation of antibiotic resistance mechanisms, bioenergy, and animal/human health. The EBI Metagenomics Portal (EBI-MP) covers data submission, archiving and sharing functions, community standards-compliant curation, and rich functional and taxonomic diversity analyses. Launched in 2011, the resource has become a world leader in metagenomics data analysis, attracting a growing userbase across the UK, European and global communities. The impact on academic research is already in effect, with the EBI-MP providing both a robust analysis platform and access to a large compute resource. Both of these features are often lacking within academia. Thus, the EBI-MP is making metagenomics analysis available to more researchers, and relieves a significant bottleneck between data and interpretation. One vital impact of the project will be continued support for archiving and analysis of metagenomic data in the face of ever increasing data volumes. The proposed work provides a number of mechanisms, including adoption of CRAM-based data compression technology and a tightly controlled way of updating analysis algorithms, by which the pipeline can be made more efficient, with higher throughput and the ability to scale. Improved sample analyses, through updated reference databases and extended taxonomic and functional analyses, are also critical, since they will increase the usefulness of EBI-MP to researchers and better meet the community's needs. These benefits will be felt in the short term, and will also persist into the longer term, as updates and improvements are made throughout the course of the project. In the medium term, these developments will allow the EBI-MP to grow with increasing demand, without significantly increasing the computational overhead. 
This will be achieved by the incorporation of more efficient algorithms, thereby increasing throughput. Updating the reference database will facilitate a more in-depth functional and taxonomic analysis, as more diverse organisms are represented in them. The infrastructural changes to the pipeline will also allow other tools to be more easily incorporated into the analysis platform, not only providing scientific exposure to the tool developer, but also enriching the analysis results. Our objective of improving data discoverability, by linking from other databases to the EBI-MP, will allow metagenomics results to reach a broad life science community, where individuals may be unaware of the data. It is important to note that, in this project, we are also establishing a new collaboration with Newcastle University that builds on our cross-scientific discipline culture. This will expose the existing EBI-MP team to novel approaches and scientific challenges. At the same time, from these collaborations, we aim to produce statistical protocols to provide additional confidence and information about the data. Cross-sample analyses will inevitably provide researchers with a significantly deeper understanding of complex communities. In the medium and longer term, the knowledge gained from understanding complex communities will have significant impacts for the UK. Impacts include the economic, from efficiency-enhanced industrial enzymes, through improved soil conditions providing greater crop yields, to healthcare solutions by comparing diseased and healthy states. One of the key areas will be the translation of metagenomics to industry. Through our industrial links at EMBL-EBI and Newcastle University, we will engage with this sector, to establish requirements. 
To ensure our users are able to utilise the new features we will provide online training material, publish in scientific and non-scientific literature, attend meetings and conferences aimed at a range of audiences and run training workshops, to maximize dissemination into the academic, industrial and 3rd-party communities.

Publications

 
Title Interview as part of the Gut Stuff series 
Description Interview with the MacTwins about big data and metagenomics. This video covers the following topics: What is Big Data and what does it mean for medical research? How is data analysed? How does it play into the bigger picture for gut health/disease? Delve deeper into this episode here: https://youtu.be/FILjgbudTWs and https://youtu.be/eXTb5MbagKY For more information such as recipes, further reading, articles and advice, head over to www.thegutstuff.com 
Type Of Art Film/Video/Animation 
Year Produced 2017 
Impact Raising the profile of the importance of the gut microbiome with a generalist, public audience. 
URL https://www.youtube.com/watch?v=s2PExZElXbc
 
Title Recording of presentation at workshop on metagenomics 
Description Online webinar formed out of participating in metagenomics workshop. 
Type Of Art Film/Video/Animation 
Year Produced 2017 
Impact 45 views on YouTube. 
URL https://www.youtube.com/watch?v=mASvV9e069o
 
Title Tara Oceans video 
Description Gives the background to the Tara Oceans project and what we have done with the data at EMBL-EBI. 
Type Of Art Film/Video/Animation 
Year Produced 2016 
Impact Over 500 views on YouTube. 
URL https://www.youtube.com/watch?v=Mi5otyfA-H8
 
Description Over the course of the project, we have seen rapid growth in uptake, both from the data provider and user communities. We have increased the throughput at which datasets are analysed by over 10-fold, processing over 300 billion nucleotide sequences, to become the world's largest provider of publicly available metagenomics data. We have re-evaluated our approach to taxonomic analysis and now perform analysis of all cellular life (not just bacteria), giving a much more complete picture of the microbial communities present in an environment. We have also developed our search interfaces and systems for providing data, allowing users to perform large-scale analysis on the results that we house.

As well as the hundreds of small projects, the resource contains the analysis of many high-profile showcase studies, such as Tara Oceans, Ocean Sampling Day, American & British Gut, MetaHIT, and METASOIL, which form important reference sets (for example, monitoring the health of the world's oceans). We have extended our interfaces to allow easier discovery of datasets, allowing the selection of facets to enable rapid filtering using a few simple selections.

As part of the project, we have begun to offer assembly of metagenomics data as a service. This allows recovery of longer length peptides or full protein sequences from metagenomics datasets. They also provide access to the tremendous wealth of genetic material, which may be mined for industrial biotechnology and medical applications. Using this approach, we have developed a peptide database of more than 330 million unique sequences from metagenomic sources. Working with an SME biotech company (BioCatalysts Ltd) as part of an InnovateUK BBSRC grant, we have used this database to identify novel enzymes that the company is now marketing.

We have updated the analysis pipeline three times over the course of this grant, enabling wider and more fine-grained annotations and greatly increased analysis throughput. In February 2015, a new version of the analysis pipeline (v2.0) was launched, with updated analysis components and reference libraries. The changes included updates to Biopython (v1.65), rRNA Selector (v1.0.1), InterProScan (v5.9-50.0), QIIME (v1.9.0) and GreenGenes (v13.8). In addition, the clustering and repeat masking steps that formed part of the QC stage in pipeline v1.0 were removed, to increase efficiency. As part of the update, the pipeline code was also substantially refactored, improving performance and increasing modularisation, ensuring that additional component upgrades could be achieved more easily in the future. Since then, we have released two more versions (v3.0 and v4.0), updating most tools and algorithms in the process. In particular, we integrated a new Gene Ontology slim for visualisation. We have also implemented a tool for the tracking of metagenomics data through the archiving process, to allow better management and troubleshooting. The QC steps of the pipeline were also extended to cover additional metrics and improved QC visualisations; these are now harmonised with MG-RAST, providing consistency and helping users transition from one site to the other. In version 4.0 the entire taxonomic assignment arm of the pipeline was replaced. We now use Rfam models to detect prokaryotic and eukaryotic SSU and LSU, analysed with MAPseq version 1.2, which offers fast and accurate classification of reads, and provides corresponding confidence scores for assignment at each taxonomic level. The GreenGenes reference database was replaced with SILVA SSU / LSU version 128. In addition to the ribosomal subunit RNAs, the pipeline also extracts other non-coding RNAs. Data analysis using the new pipeline is over 15x faster than with v1.0. 
As a result of these updates, data volumes analysed by the resource have grown considerably, with a 40-fold increase in the number of datasets analysed by the resource in the last three years. EBI Metagenomics currently houses over 100,000 datasets, making it one of the largest metagenomics repositories in the world. Finally, we have undertaken a feasibility study to assess whether assembly can be provided as a service, which concluded that the majority of datasets could be assembled without major increases in computational overheads but will be constrained by the availability of high memory machines.

CRAM has become a supported submission file format through both interactive and RESTful submissions within the Webin framework (European Nucleotide Archive, ENA), with support for both primary raw data and derived data (e.g. trimmed/filtered reads). Full syntactic validation is provided for incoming CRAM data files with errors reported to submitters. Having completed a data feed to allow all the ENA sequences to be propagated systematically into the CRAM Reference Registry, any public sequence can be used as reference by submitters in their CRAM files, and we continue to support new reference sequences not in ENA. All submitted CRAM files undergo indexing and production of a corresponding CRAI (CRAM index file) for each available CRAM. All CRAM and CRAI files are stored permanently in the ENA object store. Retrieval of CRAM-submitted data is supported in CRAM and standardised FASTQ formats. The metadata surrounding CRAM files are indexed at ENA, enabling web and RESTful discovery by attributes of studies and samples. In the case of data held confidential prior to publication, where the user authorises early access for EMG analysis, CRAM and CRAI and derived FASTQ objects can be made fully available for consumption by EMG. Having delivered a portion of the above work with support from funding outside the MGPII grant, we have been able to extend work relating to compression of metagenomics sequence data to include an assessment of opportunities in compression. Basing our assessment on a broad range of shotgun metagenomics data sets, including Tara Oceans, we have constructed a comprehensive reference sequence data set, comprising a representative assembled complete isolate genome from each sequenced prokaryotic family and all available contigs from metagenome assemblies, as available in early 2017. We ran mappings of data from different biomes as a measure of compressibility. (For low reference size, mapping rate is a good predictor of compression potential.) 
Our conclusions are that, at the current time, there is no storage or bandwidth advantage in using CRAM-compressed shotgun metagenomics over conventional gzipped FASTQ. We expect that this relates to the low saturation at this point of sequence space in our reference data sets. While we will no doubt continue to see ever growing isolate and metagenomics data and hence improved mappability, we expect that the impact of extremely large reference data will become a further factor in the power of compression. We therefore conclude this objective with the recommendation that CRAM compression is appropriate for shotgun metagenomics but does not give a compression advantage.

In order to link environments and geographical locations to biological entities and communities (objective 3), we have made use of the EMBL-EBI search infrastructure to implement more powerful browsing and searching. A search input box is present on all pages, allowing entry of free text (e.g. 'human') or colon-separated fields and values (e.g. 'experiment_type:amplicon'). Searches are subdivided into three levels: projects, samples and runs, as each level has different metadata available. The results are displayed in separate tabs and can be filtered by facets and numerical search controls, as appropriate for the data type. For example, run-level has the richest set of indexed facets that can be used for filtering, with Organism, GO-terms and InterPro annotations. The latter two can also be used as search terms, and the results can then be filtered by fields, such as temperature or depth. Using this search interface, it is possible to rapidly and easily narrow down datasets (for example, to discover all runs that contain antibiotic biosynthesis monooxygenase sequences in soil where Actinobacteria are found, determined using metatranscriptomics). To enable this type of search, we have used the GOLD database biome hierarchy (augmented where necessary) to provide an initial tree to categorise projects from which the data can be explored. Complementing this, we have developed clear and simple icons to represent the different biomes. On the EMG homepage, we present the most popular biomes, to allow quick, simple entry to the data collection. The biome information is curated by the EMG team, with the most specific biome from the hierarchy used. To provide a richer search and retrieval interface, we have developed a RESTful API, providing programmatic access to the data (https://www.ebi.ac.uk/metagenomics/api/v1/). We have utilised an interactive documentation framework (Swagger UI) to visualise and simplify interaction with the API's resources via an HTML interface. 
Detailed explanations of the purpose of all resources, along with many examples, are provided to guide end-users. Documentation on how to use the endpoints is available at https://www.ebi.ac.uk/metagenomics/api/docs/.
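
As a hedged illustration of programmatic access, the sketch below composes request URLs against the API base address quoted above and fetches a JSON response using only the Python standard library. The resource name "studies" and the "page" parameter used in the example call are assumptions for illustration; the resources and parameters actually offered should be checked against the interactive documentation.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Base URL as quoted in this report.
API_BASE = "https://www.ebi.ac.uk/metagenomics/api/v1/"


def build_url(resource, **params):
    """Compose a request URL for a resource path and optional query parameters.

    Both `resource` and `params` are supplied by the caller; nothing here
    checks that the API actually exposes them.
    """
    url = urljoin(API_BASE, resource.strip("/") + "/")
    if params:
        url += "?" + urlencode(sorted(params.items()))
    return url


def fetch_json(url, timeout=30):
    """Perform a GET request and decode the JSON body (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # "studies" and "page" are assumed names, used only to show the call shape.
    print(build_url("studies", page=2))
```

Keeping URL construction separate from the network call makes the request logic easy to check without contacting the service.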

We have added additional information regarding metagenomic community diversity estimation, and information to allow comparisons between runs and samples. For each run and sample, we produce plots to graphically illustrate the taxa abundance distribution, and use OTU counts to compute diversity indices, including several estimates of the total diversity of the community sampled. These estimates are computed using R packages for community ecology, such as sads (https://CRAN.R-project.org/package=sads) and vegan (https://CRAN.R-project.org/package=vegan). Estimates are also computed at the level of sample, based on simple pooling of OTU counts from all runs in a sample. Additionally, estimates are provided for the number of individuals that would need to be sequenced in order to see a given fraction of the total population diversity (based on the assumption of an underlying Poisson-log-normal taxa abundance distribution). These provide guidance for the sequencing effort likely to be required for a more complete characterisation of the microbial community of interest. Additional diagnostic plots have also been added, which allow for the comparison of samples and runs within a study. In addition to a PCA plot for the identification of outlying runs, these provide a robust estimate of the fold-change difference in taxonomic composition between a reference sample (or run) and all other samples (or runs) in the study. Estimates of differences are computed using the DESeq2 software via the Bioconductor package phyloseq. These estimates are most robust for studies with replication in the form of multiple runs per sample.
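
To make the idea of computing diversity indices from OTU counts concrete, here is a minimal sketch. It is not the R code used by the resource (which relies on packages such as vegan and sads); it simply shows, in plain Python, three widely used quantities: the Shannon index, the Gini-Simpson index, and the bias-corrected Chao1 richness estimate. The function names and the example counts are illustrative assumptions.

```python
import math


def _check(counts):
    """Validate OTU counts and return their total."""
    total = sum(counts)
    if total <= 0 or any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative with a positive total")
    return total


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = _check(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_index(counts):
    """Gini-Simpson index 1 - sum(p_i^2)."""
    total = _check(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs seen exactly once and exactly twice.
    """
    _check(counts)
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


if __name__ == "__main__":
    otu_counts = [5, 1, 1, 2]  # made-up counts for illustration
    print(shannon_index(otu_counts), simpson_index(otu_counts), chao1(otu_counts))
```

Estimators such as Chao1 use the rarely observed OTUs to gauge how much diversity remains unseen, which is the same intuition behind estimating the sequencing effort needed to capture a given fraction of a community.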

Our move towards assembly of metagenomics data will greatly help with the identification of genome features, such as operons and/or secondary metabolite gene clusters.
Exploitation Route This project has been transformational in terms of data outputs. The resource is increasingly being used by the community for the analysis of their metagenomics datasets, thereby alleviating them of the need to develop and execute complex analysis pipelines. As well as the volume of data, the consistency of the results allows users to compare between projects. This allows the data to be reused and contextualised with new data, as it is produced. This resource and the data it contains forms the platform for longitudinal analysis, which may help mankind to evaluate the long-term impact of global warming or pollution. The production of a large-scale metagenomics-derived sequence resource provides new opportunities for both research academics and those involved in the industrial biotechnology sector to start mining the huge unexplored genetic potential that has been unveiled through this technique.
Sectors Agriculture, Food and Drink,Chemicals,Digital/Communication/Information Technologies (including Software),Environment,Healthcare,Manufacturing, including Industrial Biotechnology,Pharmaceuticals and Medical Biotechnology

URL http://www.ebi.ac.uk/metagenomics
 
Description The outputs from MGnify (formerly EBI metagenomics portal) have been used in a number of different ways beyond the original research project. One has been to assess biodiversity in light of pollution and climate change. The proteins that have been produced have also been used in a commercial setting to identify novel enzymes for bioremediation and the food industry. For example, Biocatalysts Ltd won the 2019 Queen's Award for Enterprise and Innovation for their MetXtra platform that was based on MGnify data. The proteins have also been used by Google DeepMind, as part of their AlphaFold2 de novo prediction methodology. This is generating huge interest in different sectors, especially the pharmaceutical and industrial biotechnology sectors.
First Year Of Impact 2015
Sector Agriculture, Food and Drink,Chemicals,Digital/Communication/Information Technologies (including Software),Energy,Environment,Healthcare,Manufacturing, including Industrial Biotechnology,Culture, Heritage, Museums and Collections,Pharmaceuticals and Medical Biotechnology
Impact Types Societal,Economic,Policy & public services

 
Description BBSRC Microbiome Expert Working Group
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
 
Description Participated in Microbiology Society Expert Working Group
Geographic Reach Multiple continents/international 
Policy Influence Type Participation in a guidance/advisory committee
 
Description UK-US Microbiome Report
Geographic Reach Multiple continents/international 
Policy Influence Type Implementation circular/rapid advice/letter to e.g. Ministry of Health
 
Description US-UK Microbiome Workshop at UCSD
Geographic Reach Multiple continents/international 
Policy Influence Type Participation in a guidance/advisory committee
 
Description Finding value in complex biological data - integrated omics CR and D
Amount £256,656 (GBP)
Funding ID 102513, applicat no: 58051-433130 / VA: 75649 
Organisation Innovate UK 
Sector Public
Country United Kingdom
Start 03/2016 
End 02/2018
 
Description H2020-INFRADEV
Amount £1 (GBP)
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 09/2015 
End 08/2018
 
Title ENA 
Description The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. A typical workflow includes the isolation and preparation of material for sequencing, a run of a sequencing machine in which sequencing data are produced and a subsequent bioinformatic analysis pipeline. ENA records this information in a data model that covers input information (sample, experimental setup, machine configuration), output machine data (sequence traces, reads and quality scores) and interpreted information (assembly, mapping, functional annotation). Data arrive at ENA from a variety of sources. These include submissions of raw data, assembled sequences and annotation from small-scale sequencing efforts, data provision from the major European sequencing centres and routine and comprehensive exchange with our partners in the International Nucleotide Sequence Database Collaboration (INSDC). 
Type Of Material Database/Collection of data 
Provided To Others? Yes  
Impact ENA is the European arm of INSDC. However, ENA has specifically been extended to allow the deposition of metagenome assemblies and binned assemblies. We have also worked on ensuring that metadata associated with sequence data are appropriately captured by the development of checklists. 
URL https://www.ebi.ac.uk/ena
 
Title MGnify (formerly called EBI metagenomics) 
Description The MGnify resource is an automated pipeline for the analysis and archiving of metagenomic data that aims to provide insights into the phylogenetic diversity as well as the functional and metabolic potential of a sample. It enables users to freely browse all the public data and associated analysis results that are contained within the resource. More recently (in 2018) we have started to provide metagenomics assembly as a service to the community, which is often not performed due to the computational overheads. 
Type Of Material Database/Collection of data 
Year Produced 2012 
Provided To Others? Yes  
Impact MGnify provides access to some of the largest metagenomics projects and is a large collection of analysed metagenomic datasets. Uniquely, it enables consistent analysis between projects, enabling scientists to compare results to other datasets in the resource or to their own. 
URL https://www.ebi.ac.uk/metagenomics
 
Title MGnify (previously EBI Metagenomics Portal) 
Description MGnify, previously EBI Metagenomics, (https://www.ebi.ac.uk/metagenomics/) is a database of richly described shotgun metagenomics data sets from across sample environments. Drawing on user-submitted data, functional and taxonomic analysis pipelines provide systematic processing and analysis of data. Both input data and analysis outputs are available freely in a variety of presentations and downloadable data formats. The database combines permanent archiving functions (through connectivity with public sequence databases) and state-of-the-art analysis methods. 
Type Of Material Database/Collection of data 
Year Produced 2012 
Provided To Others? Yes  
Impact The database is core to the MGnify programme, such that general programme impacts (see elsewhere in our outcome reporting for the programme) are all relevant to the database. 
URL https://www.ebi.ac.uk/metagenomics/
 
Title Metagenomic non-redundant protein database 
Description Database of protein sequences produced from assembly of metagenomic datasets. 
Type Of Material Database/Collection of data 
Year Produced 2017 
Provided To Others? Yes  
Impact This database has supported the discovery of novel enzymes by an SME biotech company (BioCatalysts) as part of an InnovateUK BBSRC grant. 
URL https://www.ebi.ac.uk/metagenomics/sequence-search/search/phmmer
 
Description Newcastle University 
Organisation Newcastle University
Department School of Civil Engineering and Geosciences
Country United Kingdom 
Sector Academic/University 
PI Contribution Large analysed metagenomic datasets from a range of different biomes (terabytes of sequences and data analyses).
Collaborator Contribution Developed R module to download and manipulate datasets. Developed tools and algorithms to help compare metagenomic datasets.
Impact The collaboration is multi-disciplinary and brings mathematicians and statisticians together with bioinformaticians to develop tools for the analysis of vast metagenomics datasets.
Start Year 2015
 
Description 2020 Annual Research Conference of the Pontificia Universidad Católica del Perú talk "Broadening our genomic knowledge of the human microbiome" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact PI Dr Rob Finn described the role of MGnify, including the resource's gut catalogue in microbiome research. He highlighted how Latin American samples were underrepresented. Finally, he provided advice on the different career paths available for researchers in bioinformatics.
Year(s) Of Engagement Activity 2020
 
Description 2020 POGO International Virtual Conference on the use of Environmental DNA (eDNA) in Marine Environments: Opportunities and Challenges talk "MGnify: An open and scalable platform for the analysis, discovery and dissemination of molecular based biodiversity data" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact PI Dr Rob Finn gave a talk during the 2020 POGO International Virtual Conference on the use of Environmental DNA (eDNA) in Marine Environments: Opportunities and Challenges. His talk focused on MGnify during day 2 of the conference; session on Data and Information. Session description is as follows: Through systems such as the International Nucleotide Sequence Database Collaboration (INSDC) and global standards like FASTA/Q format, the eDNA/omics community have benefitted from world-class data and information resources. However, our handling of what is, from our perspective, "metadata" and participation/interoperability with data systems from other disciplines is still in need of advancement. In the marine realm, we now have new opportunities to augment our digital capacities while aligning them with global digital strategies such as those within the UN Decade of Ocean Science for Sustainable Development. This session will explore some examples of how this is already taking place, and will welcome discussion on how we can collectively mainstream sequence data (as well as the information and knowledge derived from it) in the emerging digital ocean ecosystem.
Year(s) Of Engagement Activity 2020
URL https://pogo-ocean.org/capacity-development/activity-related-workshop/environmental-dna-edna-marine-...
 
Description 5th Microbiome Movement - Drug Development Europe conference talk "Magnifying the Human Gut Microbiome" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact During the 5th Microbiome Movement - Drug Development Europe conference, PI Dr Rob Finn presented a talk on the unified human gut genome catalogue and phages, with a view to understanding the potential translational impact of human microbiome research.
Year(s) Of Engagement Activity 2021
URL https://microbiome-europe.com/?utm_source=hw-corporate&utm_medium=backlink&utm_campaign=brand-page
 
Description BiATA 2020 workshop on "Analysing metagenomic assemblies using MGnify" 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact In this two-day remote tutorial provided by PI Dr Rob Finn and his team during the BiATA2020 conference, participants explored common approaches to analysing and annotating contigs produced from a metagenomics assembly. The course was a mixture of introductory lectures, followed by hands-on practicals. Due to time constraints, participants either investigated pre-calculated examples or used a web browser to explore outputs via the MGnify website (www.ebi.ac.uk/metagenomics). By the end of the course, participants understood how to process contigs and characterise them functionally and taxonomically, and were able to generate metagenome assembled genomes from their assemblies.
Year(s) Of Engagement Activity 2020
URL http://biata2020.spbu.ru/workshop/
 
Description CSHL Biology of Genomes 2020 talk titled "Broadening our genomic knowledge of the human microbiome" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The Biology of Genomes 2020 meeting organised by the Cold Spring Harbor Laboratory addressed DNA sequence variation and its role in molecular evolution, population genetics and complex diseases, comparative genomics, large-scale studies of gene and protein expression, and genomic approaches to ecological systems. Both technologies and applications were emphasized. There was a special session on the ethical, legal and social implications (ELSI) of genome research. PI Dr Rob Finn chaired the session on Complex Traits and Microbiome and presented a talk.
Year(s) Of Engagement Activity 2020
URL https://meetings.cshl.edu/meetings.aspx?meet=GENOME&year=20
 
Description Cafe Sci talk "Gut bacteria and human health" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact This talk was presented as part of the Cafe Sci events, a public engagement initiative in Cambridge where people meet and explore the latest ideas in science and technology. Work on metagenome assembled genomes (MAGs) was presented here.
Year(s) Of Engagement Activity 2019
URL https://publicengagement.wellcomegenomecampus.org/sites/default/files/media/project/caf-sci-cambridg...
 
Description ComMet - Metagenomics community engagement 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Series of activities including engagement with industry (Microbiomes and Metagenomes), running a workshop (3 day training course) and providing support for the community.
Year(s) Of Engagement Activity 2015
URL http://metagenomics.uk/
 
Description EBI Metagenomics for functional and taxonomic analysis - Indian webinar 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Webinar with Indian university to describe the EBI Metagenomics resource and data sets.
Year(s) Of Engagement Activity 2016
 
Description EBI Metagenomics workshop & NextGenBUG meeting 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Hands on workshop and seminar given, as part of the NextGenBUG (NextGen Bioinformatics User group) meeting at Edinburgh genomics. The morning session consisted of a hands on tutorial for ~30 attendees, followed by a talk at the meeting itself with ~200 attendees.
Year(s) Of Engagement Activity 2016
URL http://nextgenbug.org/next-meeting-edinburgh-6th-december-2016
 
Description EBI metagenomics workshop 2017 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Organised and participated in the Metagenomics Bioinformatics training workshop at EBI, which involved lectures and hands-on sessions with world leaders in metagenomic data analysis. 30 students attended and got hands-on experience of data analysis and asked questions of the teaching faculty. Many reported that they will use the resources and analyses covered in their own research projects.
Year(s) Of Engagement Activity 2017
URL https://www.ebi.ac.uk/training/events/2017/metagenomics-bioinformatics-2
 
Description EMBL Course: Microbial Metagenomics: A 360° Approach 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Hands on session on EBI Metagenomics at EMBL Course: Microbial Metagenomics: A 360° Approach. The session covered the MGnify (EBI Metagenomics) resource and how to analyse metagenomics data.
Year(s) Of Engagement Activity 2018
URL https://www.embl.de/training/events/2018/MET18-01/programme/index.html
 
Description EMBL Course: Microbial Metagenomics: A 360° Approach 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Hands on session on EBI Metagenomics at EMBL Course: Microbial Metagenomics: A 360° Approach. Helped raise awareness of the resource and stimulated questions on how best to analyse metagenomic data.
Year(s) Of Engagement Activity 2017
URL https://www.embl.de/training/events/2017/MET17-01/programme/index.html
 
Description EMBL Science Education (ELLS Heidelberg) tweet "Meet my microbiome 2020" 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Schools
Results and Impact Tweet from the official Twitter account of EMBL Science Education (ELLS Heidelberg) inviting school teachers from Europe and beyond to learn about current research on the human microbiome and how to transfer this knowledge to their classrooms! Each module takes one week and is designed to fit the busy schedule of teachers! #MeetingMyMicrobiome2020
Year(s) Of Engagement Activity 2020
URL https://twitter.com/ELLS_Heidelberg/status/1314259917612736515
 
Description EMBL newsletter interview "Microbiomes take the stage at New Scientist Live" 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact PI Dr Robert Finn was interviewed for news.embl.de, the official newsletter of EMBL, in which he shared his insights on the public engagement work he has carried out. In particular, this interview focused on his featured talk at the New Scientist Live 2019 festival, where he presented his team's work on acquiring novel insights into the human gut microbiome.
Year(s) Of Engagement Activity 2019
 
Description EMBL-EBI online tutorial "Metagenomics bioinformatics - A practical introduction" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This course covered the use of publicly available resources to manage, share, analyse and interpret metagenomics data, including marker gene, whole genome shotgun (WGS) and assembly-based approaches. It makes use of recorded lectures and materials from the "Metagenomics Bioinformatics" training course that took place 17 - 20 July 2018 at EMBL-EBI. The recorded lecture material is aimed at life scientists working in the field of metagenomics who are in the early stages of their data analysis. These recordings are suitable for beginners with an undergraduate knowledge of metagenomics. The exercises included in this course are intended for an audience with experience of using bioinformatics in their research. A working knowledge of Unix command line and the R statistical package is required.
Year(s) Of Engagement Activity 2020
URL https://www.ebi.ac.uk/training/online/courses/metagenomics-bioinformatics/
 
Description EMBL-EBI training course "Metagenomics bioinformatics (virtual)" 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This course covered the metagenomics data analysis workflow from the point of newly generated sequence data. Participants explored the use of publicly available resources and tools to manage, share, analyse and interpret metagenomics data. The content included issues of data quality control and how to submit to public repositories. While sessions detailed marker-gene and whole-genome shotgun (WGS) approaches, the primary focus was on assembly-based approaches. Discussions also explored considerations when assembling genome data, the analysis that can be carried out by MGnify on such datasets, and what downstream analysis options and tools are available.
Year(s) Of Engagement Activity 2020
URL https://www.ebi.ac.uk/training/events/metagenomics-bioinformatics-virtual/
 
Description EMBL-EBI/WSI Seminar Series talk "The human microbiome beyond the gut" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact This seminar was presented as part of the EMBL-EBI and Wellcome Sanger Institute joint monthly seminar series. PI Dr Rob Finn's talk focused on recent efforts to recover MAGs from the human skin microbiome, which not only harbours a very distinct microbial composition compared to the gut, but also carries additional challenges such as low DNA yield. Approaches to overcome these challenges were presented, along with some of the insights obtained into the microbial diversity of the skin.
Year(s) Of Engagement Activity 2020
URL https://www.ebi.ac.uk/about/events/seminars/2020/ebisanger-seminar-series-rob-finn-and-phil-jones-zo...
 
Description EMOSE 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Participation in a week-long workshop analysing oceanographic metagenomic data as part of the EMOSE project, involving leading researchers from around the globe.
Year(s) Of Engagement Activity 2017
URL https://www.euromarinenetwork.eu/EMOSE
 
Description European Learning Laboratory for the Life Sciences, ELLS blog "Introducing your microbiome" 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Schools
Results and Impact The European Learning Laboratory for the Life Sciences (ELLS), EMBL's education facility, invited secondary school science teachers to participate in a virtual training course in the autumn of 2020 entitled 'Introducing your microbiome'. The course was divided into four modules, providing an overview of current human microbiome research, introducing bioinformatics as a tool in microbiome research, and exploring microbiome research in health and disease. The final module consisted of group work in small teams, in which participants developed their own educational materials. The modules were taught by EMBL scientists PI Drs Rob Finn and Michael Zimmerman. The course was organised in collaboration with the Public Engagement officer at EMBL's European Bioinformatics Institute (EMBL-EBI) and was held entirely online.
Year(s) Of Engagement Activity 2020
URL http://emblog.embl.de/ells/virtual-llab-microbiome-2020/
 
Description HBIO 2017 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Hands on session at the HBIO2017, raising the awareness of EBI Metagenomics, comparing the resource to other resources, such as MG-RAST, and discussing the best way to analyse metagenomics data.
Year(s) Of Engagement Activity 2017
 
Description Metagenomics bioinformatics training course - EBI 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact 3 day training course providing training in metagenomics data analysis aimed at early stage researchers with international speakers; largely hands-on with lots of interaction and discussion.
Year(s) Of Engagement Activity 2016
URL https://www.ebi.ac.uk/training/events/2016/metagenomics-bioinformatics-1
 
Description Metagenomics webinar 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Webinar to explain metagenomics and some of the challenges associated with the field. 24 students attended the 'live' webinar and it has had 800+ views since, suggesting wider engagement.
Year(s) Of Engagement Activity 2007
URL http://www.ebi.ac.uk/training/online/course/ebi-metagenomics-analysing-and-exploring-metagenomics-da...
 
Description Metagenomics workshop - HBio 2016 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Presentation on EBI Metagenomics given at workshop as part of HBio 2016 (Hellenic Biosciences conference)
Year(s) Of Engagement Activity 2016
 
Description Metagenomics: from bench to data analysis 2016 - Earlham institute, Norwich 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Hands on course describing how EBI Metagenomics can be used to analyse and explore data.
Year(s) Of Engagement Activity 2016
 
Description Micro B3 OSD Analysis Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Workshop that provided end users with access to their analysed data as part of the Ocean Sampling Day project. This has spawned many detailed analyses of the data, due to be published in the coming year (2016/7). The workshop also provided useful feedback on the use of the Metagenomics Portal.
Year(s) Of Engagement Activity 2015
 
Description Moredun metagenomics workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Participation in the Metagenomics: approaches, methodology, analysis and practical applications at the Moredun Institute, helping raise the profile of EBI metagenomics and discuss some of the challenges of metagenomic data analysis.
Year(s) Of Engagement Activity 2017
URL https://www.moredun.org.uk/events/metagenomics-approaches-methodology-analysis-and-practical-applica...
 
Description National Veterinary Research Institute course 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Hands on session at a workshop at the National Veterinary Research Institute, Poland, which brought together researchers interested in metagenomic data analysis. Lots of discussion on the best way to analyse such data.
Year(s) Of Engagement Activity 2017
 
Description Participated in panel at launch of Microbiology Society "Unlocking the Microbiome" policy document. 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Policymakers/politicians
Results and Impact On 15 November 2017, the Microbiology Society launched its science policy report 'Unlocking the Microbiome', at an event held at The Royal Society in London. The event connected over 60 representatives from academia, industry, research funding and government to discuss opportunities and challenges of the rapidly emerging field of microbiome science for health, agriculture and food, environment and biotechnology.
Year(s) Of Engagement Activity 2017
URL https://microbiologysociety.org/policy/microbiome-policy-project/unlocking-the-microbiome-launch-eve...
 
Description Participated in interview for the Gut Stuff - a program of video blogs talking about human gut microbiome. 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Video interview with the Mac Twins, explaining Big Data and relationship to microbiome. Released on YouTube as series of programs.
Year(s) Of Engagement Activity 2017
URL https://www.youtube.com/watch?v=NZTCWm63anQ
 
Description Presentation at ELIXIR All Hands meeting in Berlin. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presented a talk at the meeting titled "Marine Metagenomics- New resources and activities"
Year(s) Of Engagement Activity 2018
 
Description Presentation to Unilever 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Industry/Business
Results and Impact Presented an overview of sequence families team resources
Year(s) Of Engagement Activity 2018
 
Description Seminar at Edinburgh Bioinformatics Meeting 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Presented a seminar describing MAG research in human gut microbiome.
Year(s) Of Engagement Activity 2018
 
Description Seminar at Protein Spring Meeting, Prague 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Presented a seminar titled "Bioprospecting for novel proteins in metagenomics datasets- fact or fiction".
Year(s) Of Engagement Activity 2018
URL https://www.pragueproteinspring.cz/scientific-program
 
Description Simultaneous analysis of eukaryotes, prokaryotes and viruses in marine metagenomic data 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation at a European microeukaryote conference, outlining the tools we have developed to analyse metagenomic datasets.
Year(s) Of Engagement Activity 2016
 
Description Summer school in Metagenomics 2016 - Pasteur institute 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Talk given as part of metagenomics summer school, sparking discussion and questions about EBI Metagenomics.
Year(s) Of Engagement Activity 2016
URL https://www.pasteur.fr/en/summer-school-2016-metagenomics
 
Description University of Warwick Seminar series talk "Broadening our genomic knowledge of human microbiomes" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Life Sciences seminar by Dr Rob Finn. He described his team's recent publication on the Unified Human Gastrointestinal Genome (UHGG) catalogue, which is an unprecedented collection of nearly 5,000 species found in the gut microbiome, with 70% yet to be cultured. Dr Finn provided further details on the team's recent efforts to recover genomes from the human skin microbiome, which not only harbours a very distinct microbial composition compared to the gut, but also carries additional challenges such as low DNA yield. For both microbiomes, the team is currently investigating the microbiota beyond bacteria. An overview of these results was presented, assessing the challenges faced when researchers try to understand microbial community structures.
Year(s) Of Engagement Activity 2020
URL https://warwick.ac.uk/insite/events/events?calendarItem=8a17841b75f501d70175f549d9980163
 
Description WIRED article titled "A periodic table of your guts is the next step in the race to create a microbiome-feeding poop pill" 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact The group utilised MGnify to produce the Human Gastrointestinal Bacteria Culture Collection (HBC), a comprehensive set of 737 whole-genome-sequenced bacterial isolates, representing 273 species (105 novel species) from 31 families found in the human gastrointestinal microbiota. This resulted in the 2019 Nature Biotechnology publication titled "A human gut bacterial genome and culture collection for improved metagenomic analyses" [https://doi.org/10.1038/s41587-018-0009-7], which was featured by WIRED magazine.
Year(s) Of Engagement Activity 2019
URL https://www.wired.co.uk/article/gut-bacteria-microbiome-pill