The MRC Consortium for Medical Microbial Bioinformatics

Lead Research Organisation: University of Warwick
Department Name: Warwick Medical School

Abstract

The UNIVERSITY OF WARWICK and SWANSEA UNIVERSITY, in partnership with the UNIVERSITY OF BIRMINGHAM and CARDIFF UNIVERSITY, propose a programme of capital expenditure, recruitment and training to create the MRC CONSORTIUM FOR MEDICAL MICROBIAL BIOINFORMATICS, a state-of-the-art interdisciplinary facility, led by Professor Mark Pallen (an MD PhD at Warwick) and Dr Sam Sheppard in Swansea, for use by the academic, industrial and healthcare communities that will enhance regional and national capability and infrastructure in microbial bioinformatics and improve our understanding of bacteria of medical importance.

RATIONALE: Microbial pathogens still present a MAJOR EXISTENTIAL THREAT to humanity. In addition, the HUMAN MICROBIOME - the rich and dynamic community of host-associated microorganisms and their genes - is now known to play a decisive role in the balance between health and disease, even in medical conditions not usually considered as microbial in origin (e.g. obesity). Harnessing medical bioinformatics to the study of microbial genes, genomes and metagenomes thus represents a DISTINCTIVE UNMET CHALLENGE and a UNIQUE FOCUS AMONG RESPONSES TO THIS MRC CALL. Rather than taking aim at the fixed, relatively tractable target of the human genome, we focus instead on genomic information derived from HUNDREDS OF BACTERIAL PATHOGENS and THOUSANDS OF COMMENSAL SPECIES: a distributed and dynamic system of MANY MILLIONS OF GENES, at least two orders of magnitude larger than the human gene set.

Our four research-active universities are located in neighbouring regions of the UK, thus providing an initial GEOGRAPHICAL COHESION to the Consortium that will facilitate community building and the exchange of ideas, and underpin the formalities of governance. At its inception, the Consortium will benefit from a NATIONAL AND GLOBAL REACH through a dense network of collaborations and collegiality and, through an application process for new partners, will soon grow in a scalable fashion to embrace a national remit.

The Consortium will build on TRACK RECORDS OF INDIVIDUAL RESEARCH EXCELLENCE and impressive INSTITUTIONAL INVESTMENT in medical microbiology and bioinformatics with all partners making a distinctive contribution. Through this initiative we will recruit THREE HIGHLY TALENTED INDIVIDUALS into microbial bioinformatics fellowships from careers outside the discipline or the country. We have LEVERAGED SUPPORT from the host organisations to place these research fellows on a TENURE TRACK. All three fellows will contribute to the goals of the Consortium through research, training and management roles as well as pushing forward their own cutting-edge research programmes.

Building on interests in parallelisation and cloud computing, we will develop a DISTRIBUTED COMPUTING INFRASTRUCTURE in Wales and the West Midlands that will provide an agile, scalable system for the UK microbiology research community.

We will develop an ambitious and exciting TRAINING PROGRAMME that will include bootcamps, hackathons, workshops, modules and courses, suitable for a wide range of users from professional bioinformaticians to undergraduate students.

We will strengthen regional, national and international microbial bioinformatics research through COMMUNITY-BUILDING ACTIVITIES, encouraging knowledge transfer and dissemination of best practice. Meetings of different sorts and scales will be held monthly, quarterly and annually. The Annual Meeting, with an academic and management component, will benefit from participation by our external Steering Group.

We will exploit pump-priming funds together with externally funded research activities to "stress our systems", confirming that the facilities that we have created work as planned and/or priming iterative refinements to our infrastructure. We are confident that the Consortium will become self-sustaining through institutional commitments and the recruitment of additional research funding.

Technical Summary

The MRC CONSORTIUM FOR MEDICAL MICROBIAL BIOINFORMATICS (CMMB) will enhance UK capability and infrastructure in microbial bioinformatics. Led by Mark Pallen and Sam Sheppard, the CMMB will recruit THREE HIGHLY TALENTED INDIVIDUALS into UK medical microbial bioinformatics from careers outside the discipline or the country.

We will develop a DISTRIBUTED COMPUTING INFRASTRUCTURE that will operate together to provide:
1. AN AGILE, SCALABLE SYSTEM available to consortium members that can be dynamically provisioned to handle different workloads/projects as needed; this will include:
-FOUR CLUSTERS (one on each site) providing a heterogeneous mix of high-memory/low CPU and low-memory/high CPU servers suitable for both memory-intensive tasks (e.g. metagenomic assembly) and CPU-intensive activities.
-SUBSTANTIAL STORAGE (>2 petabytes).
-sufficient NETWORK CAPACITY for each site to operate at 10 gigabit/second connectivity.
2. a web-based instance of the GALAXY PLATFORM customised for medical microbial research.
3. a freely accessible DATABASE of relevant workflows, pipelines, scripts, programs, virtual machine images built on Galaxy/Github and mirrored across our sites.
4. a DATA ARCHIVE of relevant microbial (meta)genomes.
5. a computational infrastructure for linking PATIENT METADATA with microbial (meta)genomic data.

We will develop an ambitious and exciting TRAINING PROGRAMME that will include bootcamps, hackathons, workshops, modules and courses, suitable for a wide range of users from professional bioinformaticians to undergraduate students. We will strengthen national microbial bioinformatics research through COMMUNITY-BUILDING ACTIVITIES, encouraging knowledge transfer and dissemination of best practice. We will exploit pump-priming funds together with externally funded research activities to "stress our systems", confirming that the facilities that we have created work as planned and/or priming ITERATIVE REFINEMENTS TO OUR INFRASTRUCTURE.

Planned Impact

This research will benefit a range of groups outside the academic discipline of medical microbiology:

CLINICAL AND PUBLIC HEALTH MICROBIOLOGISTS, clinicians and INFECTION CONTROL TEAMS working within the HEALTH SERVICES, including regionally and locally with NHS TRUSTS in Wales and the West Midlands and further afield, and nationally with Public Health England and Public Health Wales. These users will be able to use our computational infrastructure to integrate clinical informatics systems, patient metadata, epidemiological disease patterns and microbial (meta)genomic data to elucidate modes and routes of transmission, detect outbreaks and explore the relationships between potential pathogens and disease, with IMPACTS ON HEALTH and WELL-BEING, DISEASE PREVENTION, MANAGEMENT OF INFECTION AND QUALITY OF LIFE. This initiative will also bring new opportunities for productive engagement between the healthcare sector and the academic sector, so that research findings and approaches can be more easily TRANSLATED INTO OUTCOMES that impact on patient management.

INDUSTRIAL AND PUBLIC HEALTH USERS interested in developing NEW THERAPEUTICS, VACCINES OR DIAGNOSTIC TESTS for microbial pathogens by, for example, allowing them to explore genotypic diversity when evaluating novel targets. A greater understanding of the microbiota may also deliver new approaches for dealing with conditions not normally thought of as infections, e.g. inflammatory bowel disease or obesity.

COMMERCIAL BENEFICIARIES include SEQUENCING COMPANIES, COMPUTER COMPANIES and PRIVATE LABORATORIES, who stand to benefit from increased demand for their products and opportunities for innovation and spread of best practice (NB: both Solexa and Oxford Nanopore sequencing were developed within the UK, with benefits to our economy).

POLICY MAKERS, who will benefit from grounding their PUBLIC POLICY and LEGISLATION, e.g. on food safety, on a MORE SOLID UNDERSTANDING of bacterial evolution, epidemiology, population genetics and taxonomy. NB: Applicants, particularly Mark Achtman, have been at the forefront of efforts to identify and classify foodborne bacteria using a RATIONAL AND DISCRIMINATORY SYSTEM OF CLASSIFICATION.

The WIDER PUBLIC will benefit from the positive impacts on the NATION'S HEALTH, including the CONTROL and PREVENTION OF INFECTION and the DEVELOPMENT of new interventions.

This work will also make a decisive contribution, through employment and training, to enhancing the PROFESSIONAL AND RESEARCH SKILLS BASE of the United Kingdom.

Publications

Cook R (2024) Decoding huge phage diversity: a taxonomic classification of Lak megaphages. in The Journal of general virology

 
Description Founding member and chair of the Public Health Alliance for Genomic Epidemiology Validation and QC working group
Geographic Reach Multiple continents/international 
Policy Influence Type Membership of a guideline committee
 
Description Founding member of the Public Health Alliance for Genomic Epidemiology Infrastructures working group
Geographic Reach Multiple continents/international 
Policy Influence Type Membership of a guideline committee
 
Description Member of the Genomics Partnership Wales IT Working Group
Geographic Reach National 
Policy Influence Type Membership of a guideline committee
Impact The IT working group has overseen the development of strategy for the implementation of the IT infrastructure to support genomics in healthcare in Wales. This has underpinned the development of clinical services based on genomics for the whole of Wales. In the pathogen area this has seen over 8,000 patient samples analysed using the infrastructure developed within the IT group.
 
Description Membership of Pathogen Genomics Operational Committee
Geographic Reach National 
Policy Influence Type Membership of a guideline committee
Impact This work has seen the development of 4 clinical services based on next generation sequencing. This has enabled the analysis of over 8,000 patient samples using NGS approaches, underpinned by software and infrastructure developed through UKRI funded research to Cardiff University. The systems that have been built have resulted in improvements in the speed of diagnoses/characterisation of pathogens, drops in cost of testing and increased generation of clinically actionable information from the tests undertaken.
 
Description SP3: Scalable Software for Pathogen Reads to Clinical Results using Next Generation Sequencing
Amount £1,000,000 (GBP)
Organisation Wellcome Trust 
Sector Charity/Non Profit
Country United Kingdom
Start 03/2020 
End 02/2022
 
Title Additional file 1 of Author Correction: Assembly of hundreds of novel bacterial genomes from the chicken caecum 
Description Additional file 5 A: Clustering of samples at 60% AAI to form genus clusters. Novel genera were defined as clusters of MAGs at 60% AAI which were not assigned a genus by GTDB-Tk. B: Protologues for the new Candidatus names. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/dataset/Additional_file_1_of_Author_Correction_Assembly...
 
Title Additional file 10 of A comprehensive benchmarking study of protocols and sequencing platforms for 16S rRNA community profiling 
Description Figure S10. Heatmap of archaeal species in the EM community. (ZIP 9 kb) 
Type Of Material Database/Collection of data 
Year Produced 2016 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/dataset/Additional_file_10_of_A_comprehensive_benchmark...
 
Title Additional file 2 of Assembly of hundreds of novel bacterial genomes from the chicken caecum 
Description Additional file 2: Dataset 1. Average coverage of MAGs in all samples. Coverage was calculated by mapping MAG scaffolds to the adaptor trimmed Illumina reads for each sample. The average coverage of the scaffolds from a MAG within a sample were taken as the average abundance of that MAG in the sample. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/Additional_file_2_of_Assembly_of_hundreds_of_novel_bact...
 
Title Additional file 2 of Crop management shapes the diversity and activity of DNA and RNA viruses in the rhizosphere 
Description Additional file 2: Supplementary Table S1. Sample metadata for all sample libraries. Supplementary Table S2. DNA viral population read counts. Supplementary Table S3. ssRNA viral population read counts. Supplementary Table S4. 16S rRNA OTU read counts. Supplementary Table S5. Viral gene read counts. Supplementary Table S6. PERMANOVA testing of contribution of soil compartment, crop rotation strategy and growth stage on viral and bacterial community composition. Supplementary Table S7. Mixed effect model output for linear relationship between the number of active vOTUs detected and relative host abundance. Supplementary Table S8. Mixed effect model output for linear relationship between the number of active vOTUs detected and bacterial community alpha diversity. Supplementary Table S9. Mixed effect model output for linear relationship between the number of active vOTUs detected and bacterial community alpha diversity. Supplementary Table S10. PERMANOVA testing of contribution of soil compartment and crop rotation strategy on viral community activity of viral fractions. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/dataset/Additional_file_2_of_Crop_management_shapes_the...
 
Title Additional file 2 of DESMAN: a new tool for de novo extraction of strains from metagenomes 
Description Separate text file containing variant prediction results in the complex mock. (TSV 4 kb) 
Type Of Material Database/Collection of data 
Year Produced 2017 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/dataset/Additional_file_2_of_DESMAN_a_new_tool_for_de_n...
 
Title Additional file 2 of STRONG: metagenomics strain resolution on assembly graphs 
Description Additional file 2 Genomes used in the synthetic communities (Additional file 1: Tables S1 and S2). 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/dataset/Additional_file_2_of_STRONG_metagenomics_strain...
 
Title Additional file 3 of A comprehensive benchmarking study of protocols and sequencing platforms for 16S rRNA community profiling 
Description Figure S3. Error Transition Probabilities for all platforms. (ZIP 34 kb) 
Type Of Material Database/Collection of data 
Year Produced 2016 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/dataset/Additional_file_3_of_A_comprehensive_benchmarki...
 
Title Additional file 3 of Assembly of hundreds of novel bacterial genomes from the chicken caecum 
Description Additional file 3: Dataset S2. Description of each chicken MAG (metagenome-assembled genome), including novelty of species or strain, NCBI_name, GTDB-Tk taxonomy, CheckM completeness and contamination, assembly size (mb), N50, number of contigs, the longest contig length (bp) and the GC content. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/Additional_file_3_of_Assembly_of_hundreds_of_novel_bact...
 
Title Additional file 3 of DESMAN: a new tool for de novo extraction of strains from metagenomes 
Description Separate text file containing strains used in the complex mock. (TSV 12 kb) 
Type Of Material Database/Collection of data 
Year Produced 2017 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/dataset/Additional_file_3_of_DESMAN_a_new_tool_for_de_n...
 
Title Additional file 4 of A comprehensive benchmarking study of protocols and sequencing platforms for 16S rRNA community profiling 
Description Figure S4. Impact of platform and region on entropy. (ZIP 16 kb) 
Type Of Material Database/Collection of data 
Year Produced 2016 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/dataset/Additional_file_4_of_A_comprehensive_benchmarki...
 
Title Additional file 4 of Assembly of hundreds of novel bacterial genomes from the chicken caecum 
Description Additional file 4: Dataset S3. Taxonomy assigned by MAGpy to MAGs. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/Additional_file_4_of_Assembly_of_hundreds_of_novel_bact...
 
Title Additional file 5 of Assembly of hundreds of novel bacterial genomes from the chicken caecum 
Description Additional file 5: Dataset S4. Clustering of samples at 60% AAI to form genus clusters. Novel genera were defined as clusters of MAGs at 60% AAI which were not assigned a genus by GTDB-Tk. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/Additional_file_5_of_Assembly_of_hundreds_of_novel_bact...
 
Title Additional file 6 of Assembly of hundreds of novel bacterial genomes from the chicken caecum 
Description Additional file 6: Dataset S5. MAGs which were identified as being significantly more abundant by DESeq2 between diets and lines. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/Additional_file_6_of_Assembly_of_hundreds_of_novel_bact...
 
Title Additional file 7 of Assembly of hundreds of novel bacterial genomes from the chicken caecum 
Description Additional file 7: Dataset S6. CAZymes which were identified as being significantly more abundant by DESeq2 between diets and lines. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/Additional_file_7_of_Assembly_of_hundreds_of_novel_bact...
 
Title Additional file 9 of A comprehensive benchmarking study of protocols and sequencing platforms for 16S rRNA community profiling 
Description Figure S9. Impact of PCR cycles on OTUs. (ZIP 43 kb) 
Type Of Material Database/Collection of data 
Year Produced 2016 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/dataset/Additional_file_9_of_A_comprehensive_benchmarki...
 
Title Bayesian identification of bacterial strains from sequencing data 
Description Benchmarking data for bacterial strain identification as published in Microbial Genomics in the following article: 'Bayesian identification of bacterial strains from sequencing data' [DOI: 10.1099/mgen.0.000075] 
Type Of Material Database/Collection of data 
Year Produced 2016 
Provided To Others? Yes  
URL https://microbiology.figshare.com/articles/dataset/Benchmarking_data_for_bacterial_strain_identifica...
 
Title DNA and RNA viruses in the rhizosphere 
Description This repository contains data used in Muscatt et al. 2022. Further details on the analysis can be found here: https://github.com/GeorgeMuscatt/RhizosphereVirome Data is stored in the file RhizosphereVirome.tar. The following files are stored (see the README for full details): c1.ntw.gz = vConTACT2 network output file; core_protein_concatenation_tree = ssRNA phage phylogenetic tree based on aligned core protein concatenations; CP.faa.gz = fasta amino acid file containing coat protein sequences for 11,222 near-complete ssRNA phage vOTUs; CP_ref_Leviviricetes.faa.gz = fasta amino acid file containing coat protein sequences for 1,868 reference Leviviricetes genomes; dsDNA_gene_annotations.csv.gz = annotations for 20,746 dsDNA vOTU genes; dsDNA_vOTUs.faa.gz = fasta amino acid file containing 20,267 dsDNA vOTU genes; dsDNA_vOTUs.fna.gz = fasta nucleotide file containing 1,059 dsDNA vOTUs; edges.csv.gz = edges for drawing vConTACT2 network; gene_2_genome.csv.gz = input file for vConTACT2 containing gene-to-genome index for 1,059 dsDNA vOTUs and 16,540 ssRNA phage vOTUs; MP.faa.gz = fasta amino acid file containing maturation protein sequences for 11,222 near-complete ssRNA phage vOTUs; MP_ref_Leviviricetes.faa.gz = fasta amino acid file containing maturation protein sequences for 1,868 reference Leviviricetes genomes; nodes.csv.gz = nodes for drawing vConTACT2 network; RdRp.faa.gz = fasta amino acid file containing RNA-dependent RNA polymerase sequences for 11,222 near-complete ssRNA phage vOTUs; RdRp_ref_Leviviricetes.faa.gz = fasta amino acid file containing RNA-dependent RNA polymerase sequences for 1,868 reference Leviviricetes genomes; ssRNA_vOTUs.faa.gz = fasta amino acid file containing 52,700 ssRNA phage vOTU genes; ssRNA_vOTUs.fna.gz = fasta nucleotide file containing 16,541 ssRNA phage vOTUs; viral_cluster_overview.csv = output file from vConTACT2 containing viral cluster information. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://leicester.figshare.com/articles/dataset/DNA_and_RNA_viruses_in_the_rhizosphere/19635336
 
Title Data from: Genome-wide identification of host-segregating epidemiological markers for source attribution in Campylobacter jejuni 
Description Campylobacter is among the most common worldwide causes of bacterial gastroenteritis. This organism is part of the commensal microbiota of numerous host species, including livestock, and these animals constitute potential sources of human infection. Molecular typing approaches, especially multi-locus sequence typing (MLST), have been used to attribute the source of human campylobacteriosis by quantifying the relative abundance of alleles, at 7 MLST loci, among isolates from animal reservoirs and human infection, implicating chicken as a major infection source. The increasing availability of bacterial genomes provides data on allelic variation at loci across the genome, providing the potential to improve the discriminatory power of data for source attribution. Here we present a source attribution approach based on the identification of novel epidemiological markers among a reference pan-genome list of 1810 genes identified through gene-by-gene comparison of 884 genomes of C. jejuni isolates from animal reservoirs, the environment and clinical cases. Fifteen loci, involved in metabolic activities, protein modification, signal transduction and stress response, or coding for hypothetical proteins, were selected as host-segregating markers and used to attribute the source of 42 French and 281 UK clinical C. jejuni isolates. Consistent with previous studies of British campylobacteriosis, analyses performed using STRUCTURE software, attributed 56.8% of British clinical cases to chicken, emphasizing the importance of this host reservoir as an infection source in the UK. However, among French clinical isolates, approximately equal proportions of isolates were attributed to chicken and ruminant reservoirs suggesting possible differences in the relative importance of animal host reservoirs and indicating a benefit for further national-scale attribution modelling to account for differences in production, behaviour and food consumption. 
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? Yes  
URL https://datadryad.org/stash/dataset/doi:10.5061/dryad.m86k3
 
Title Data from: The landscape of realized homologous recombination in pathogenic bacteria 
Description Recombination enhances the adaptive potential of organisms by allowing genetic variants to be tested on multiple genomic backgrounds. Its distribution in the genome can provide insight into the evolutionary forces that underlie traits such as the emergence of pathogenicity. Here we examined landscapes of realized homologous recombination of 500 genomes from ten bacterial species, and found all species have 'hot' regions with elevated rates relative to the genome average. We examined the size, gene content and chromosomal features associated with these regions and the correlations between closely related species. The recombination landscape is variable and evolves rapidly. For example in Salmonella, only short regions of around 1kb in length are hot while in the closely related species Escherichia coli, some hot regions exceed 100kb, spanning many genes. Only Streptococcus pyogenes shows evidence for the positive correlation between GC content and recombination that has been reported for several eukaryotes. Genes with function related to the cell surface/membrane are often found in recombination hot regions but E. coli is the only species where genes annotated as "virulence associated" are consistently hotter. There is also evidence that some genes with "housekeeping" functions tend to be overrepresented in cold regions. For example, ribosomal proteins showed low recombination in all of the species. Among specific genes, transferrin binding proteins are recombination hot in all three of the species in which they were found, and are subject to inter-species recombination. 
Type Of Material Database/Collection of data 
Year Produced 2016 
Provided To Others? Yes  
URL https://datadryad.org/stash/dataset/doi:10.5061/dryad.7t06c
 
Title GeneCatalog.faa 
Description proteins for chicken gene catalog 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://figshare.com/articles/dataset/GeneCatalog_faa/19582717
 
Title INPHARED_DATABASE 
Description inphared.pl (INfrastructure for a PHAge REference Database) is a Perl script that downloads and filters phage genomes from GenBank to provide the most complete phage genome database possible. Useful information, including viral taxonomy and bacterial host data, is extracted from the GenBank files and provided in a summary table. Genes are called on the genomes using Prokka, and this output is used to gather metrics that are summarised in the output files, as well as to produce input files for vConTACT2. The data provided cover all genomes up to Jan 2021. They can be downloaded so that users do not have to repeat the process of consistent gene calling on existing genomes. The folder GenomesDB contains subfolders, each containing a subfolder named after the accession number of a phage. Within each folder are the re-called genes (*.ffn, *.faa), the complete genome (*.fna) and a GenBank file without any annotation (*.gbf). See https://github.com/RyanCook94/ 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://leicester.figshare.com/articles/dataset/INPHARED_DATABASE/14242085
 
Title Local accessory gene sharing among Egyptian Campylobacter potentially promotes the spread of antimicrobial resistance 
Description Supplementary Material for 'Local accessory gene sharing among Egyptian Campylobacter potentially promotes the spread of antimicrobial resistance', as published in Microbial Genomics. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://microbiology.figshare.com/articles/dataset/Local_accessory_gene_sharing_among_Egyptian_Campy...
 
Title MOESM2 of The effect of DNA extraction methodology on gut microbiota research applications 
Description Additional file 2: Table S1. Sample raw data of read qualities and read lengths after each step of the pipeline and before OTUs construction. 
Type Of Material Database/Collection of data 
Year Produced 2016 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/dataset/MOESM2_of_The_effect_of_DNA_extraction_methodol...
 
Title Supporting data for "PIRATE: A fast and scalable pangenomics toolbox for clustering diverged orthologues in bacteria" 
Description Cataloguing the distribution of genes within natural bacterial populations is essential for understanding evolutionary processes and the genetic basis of adaptation. Here we present a pangenomics toolbox, PIRATE (Pangenome Iterative Refinement And Threshold Evaluation), which identifies and classifies orthologous gene families in bacterial pangenomes over a wide range of sequence similarity thresholds. PIRATE builds upon recent scalable software developments to allow for the rapid interrogation of thousands of isolates. PIRATE clusters genes (or other annotated features) over a wide range of amino-acid or nucleotide identity thresholds and uses the clustering information to rapidly identify paralogous gene families and putative fission/fusion events. Furthermore, PIRATE orders the pangenome using a directed graph, provides a measure of allelic variation and estimates sequence divergence for each gene family. We demonstrate that PIRATE scales linearly with both number of samples and computation resources, allowing for analysis of large genomic datasets, and compares favorably to other popular tools. PIRATE provides a robust framework for analysing bacterial pangenomes, from largely clonal to panmictic species. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
 
Title The ecology of interspecies recombination among the zoonotic bacterium Campylobacter 
Description Horizontal gene transfer (HGT) can allow traits that have evolved in one bacterial species to transfer to another. This has potential to rapidly promote new adaptive trajectories such as zoonotic transfer or antimicrobial resistance. However, for this to occur requires gaps to align in barriers to recombination within a given time frame. Chief among these barriers is the physical separation of species with distinct ecologies in separate niches. Within the genus Campylobacter there are species with divergent ecologies, from rarely isolated single host specialists to multi-host generalist species that are among the most common global causes of human bacterial gastroenteritis. Here, by characterising these contrasting ecologies, we are able to quantify HGT among sympatric and allopatric species in natural populations. Analysing recipient and donor population ancestry among genomes from 30 Campylobacter species we show that cohabitation in the same host can lead to a 6-fold increase in HGT between species. This accounts for up to 30% of all SNPs within a given species and identifies highly recombinogenic genes with functions including host adaptation and antimicrobial resistance. As described in some animal and plant species, ecological factors are a major evolutionary force for speciation in bacteria and changes to the host landscape can promote partial convergence of distinct species through HGT. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://figshare.com/articles/dataset/The_ecology_of_interspecies_recombination_among_the_zoonotic_b...
 
Description Collaboration with Public Health England 
Organisation Public Health England
Country United Kingdom 
Sector Public 
PI Contribution The CLIMB infrastructure was used by PHE for a few days to support their public health microbial genomics efforts when their own infrastructure went offline
Collaborator Contribution PHE provided input into the MRC Partnership award application, which will provide follow-on funding for this project
Impact Support for public health genomics
Start Year 2020
 
Description Collaboration with Public Health Wales 
Organisation Public Health Wales NHS Trust
Country United Kingdom 
Sector Public 
PI Contribution Provision of cloud computing capacity and software
Collaborator Contribution 500k covering staff and hardware to provide microbial bioinformatics capacity to NHS Wales for 5 years
Impact PHW gains compute capacity for performing microbial genomics analyses.
Start Year 2019
 
Description Workstream leadership in the Public Health Alliance for Genomic Epidemiology 
Organisation Bill and Melinda Gates Foundation
Country United States 
Sector Charity/Non Profit 
PI Contribution We are providing leadership around the Research Infrastructure and the Validation and QC workstreams. This involves developing outputs around best practice associated with these workstreams, as well as holding/managing meetings and contributing to the development of the overall programme. Contributions included initial input into the drafting of the founding documents of PHA4GE.
Collaborator Contribution We have helped to shape the project from the start, contributing expertise to build the alliance and we are now heavily engaged with the development of two workstreams where we are playing a leading role
Impact none as yet.
Start Year 2019
 
Description Workstream leadership in the Public Health Alliance for Genomic Epidemiology 
Organisation Cardiff University
Department School of Biosciences
Country United Kingdom 
Sector Academic/University 
PI Contribution We are providing leadership around the Research Infrastructure and the Validation and QC workstreams. This involves developing outputs around best practice associated with these workstreams, as well as holding/managing meetings and contributing to the development of the overall programme. Contributions included initial input into the drafting of the founding documents of PHA4GE.
Collaborator Contribution We have helped to shape the project from the start, contributing expertise to build the alliance and we are now heavily engaged with the development of two workstreams where we are playing a leading role
Impact none as yet.
Start Year 2019
 
Description Workstream leadership in the Public Health Alliance for Genomic Epidemiology 
Organisation Centers for Disease Control and Prevention (CDC)
Country United States 
Sector Public 
PI Contribution We are providing leadership around the Research Infrastructure and the Validation and QC workstreams. This involves developing outputs around best practice associated with these workstreams, as well as holding/managing meetings and contributing to the development of the overall programme. Contributions included initial input into the drafting of the founding documents of PHA4GE.
Collaborator Contribution We have helped to shape the project from the start, contributing expertise to build the alliance and we are now heavily engaged with the development of two workstreams where we are playing a leading role
Impact none as yet.
Start Year 2019
 
Description Workstream leadership in the Public Health Alliance for Genomic Epidemiology 
Organisation University of the Western Cape
Country South Africa 
Sector Academic/University 
PI Contribution We are providing leadership around the Research Infrastructure and the Validation and QC workstreams. This involves developing outputs around best practice associated with these workstreams, as well as holding/managing meetings and contributing to the development of the overall programme. Contributions included initial input into the drafting of the founding documents of PHA4GE.
Collaborator Contribution We have helped to shape the project from the start, contributing expertise to build the alliance and we are now heavily engaged with the development of two workstreams where we are playing a leading role
Impact none as yet.
Start Year 2019
 
Title VAPOR 
Description We built a graph-based classifier, VAPOR, for selecting mapping references, assembly validation and detection of strains of Influenza of non-human origin. Standard human reference viruses were insufficient for mapping diverse influenza samples in simulation. VAPOR was built to retrieve references for viral genomes to enable read recovery from whole genome sequencing data. In testing, VAPOR increased the proportion of mapped reads by up to 13.3% compared to other software using standard references. VAPOR has the potential to improve the robustness of bioinformatics pipelines for surveillance and could be adapted to other RNA viruses. 
Type Of Technology Software 
Year Produced 2019 
Open Source License? Yes  
Impact This work has underpinned the development of the clinical genomics pipeline for Influenza, in use in Public Health Wales now. 
URL https://github.com/connor-lab/vapor
 
Description Bacteriophage Genome Annotation Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact A workshop was held at Warwick for a mixed audience of postgraduate students, postdocs and PIs, using the CLIMB infrastructure and know-how to teach the fundamentals of genome annotation. More people signed up for a CLIMB account after the workshop.
Year(s) Of Engagement Activity 2017
 
Description Balti and Bioinformatics 2016 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact Each year one of the CLIMB fellows runs a bioinformatics course at Birmingham University with the aim of building a community forum where bioinformaticians can talk openly about the problems they are encountering and benefit from discussion with a diverse group of people. This has led to connections that extend beyond the meetings.
Year(s) Of Engagement Activity 2015,2016
 
Description Basic Microbial Bioinformatics Workshop in Vietnam 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Undergraduate students
Results and Impact Elizabeth Batty and Phil Ashton ran a one-week workshop in Ho Chi Minh City, Vietnam, on March 25-29, with participants from the Oxford University units across south-east Asia.
They aimed to teach the basics of microbial bioinformatics to complete beginners to bioinformatics, and everything ran on CLIMB.
They had 32 participants from six countries.
Year(s) Of Engagement Activity 2019
 
Description Bath University HPC Symposium keynote 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact A talk about the CLIMB project: how the cloud was built and the service that the CLIMB project provides for the academic community.
Year(s) Of Engagement Activity 2016
 
Description Bioinformatics workshop in Norwich 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact A bioinformatics workshop was run in Norwich using the MRC-CLIMB infrastructure. Attendees' enthusiasm for the platform resulted in more users and more access to CLIMB from Norwich research institutions.
Year(s) Of Engagement Activity 2017
 
Description CLIMB AT THE SMBE SATELLITE MEETING IN ASSAM, INDIA 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact The MRC CLIMB project was presented at the SMBE Satellite meeting "Evolution of microbes in natural and experimental populations". The meeting took place in Assam (India) on 14th-16th December 2017, and was hosted by Dr. Siddartha Satapathy and Prof. Suvendra Ray (Tezpur University, Assam, India). During the MRC CLIMB session, Sam Sheppard and Sion Bayliss (both from University of Bath) introduced the MRC CLIMB project, described the infrastructure and detailed the Virtual Machine provisioning model. Sion Bayliss gave an introduction to methodologies used in the analysis of whole genome sequence data for microbial genomics. Dr. Harry Thorpe (University of Bath) provided a video walk-through of a data analysis project, from raw data to phylogenetic tree, by way of a user testimonial for CLIMB. The MRC-CLIMB project was well received, as researchers could recognize the power of cloud computing in bioinformatics analysis, the reduced cost it can represent for institutions, and the easy-to-use solutions that CLIMB implemented to make the work of microbiologists easier. Expressions of interest in how to build a CLIMB-like infrastructure were one of the best results of this activity.
Year(s) Of Engagement Activity 2017
 
Description CLIMB WORKSHOP AT THE OXFORD UNIVERSITY CLINICAL RESEARCH UNIT, HO CHI MINH 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact A successful MRC CLIMB workshop in Vietnam introduced attendees to CLIMB and to working with their own virtual machine. Participants enthusiastically used CLIMB and its tools to assemble bacterial genomes, and the discussions after the meeting highlighted the importance of a CLIMB-like infrastructure in empowering scientific research. New contacts and collaborations were created.
Year(s) Of Engagement Activity 2017
 
Description CLIMB at the RCUK-WG in London 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact On January 8, 2018, CLIMB was introduced at the third RCUK Cloud Working Group at the Francis Crick Institute in London. CLIMB adopted cloud solutions for microbial research, and the audience received the project's approach well. The CLIMB approach saves researchers time, money and effort, making scientific research faster and providing researchers with tools they would not have access to on their own. CLIMB supports almost 300 research groups in the UK. Despite the power of the cloud, many biologists find that using cloud resources for their analyses is quite complex (e.g. installing software stacks). MRC-CLIMB designed and created an infrastructure that allows researchers to rapidly and easily access preconfigured instances, designed around their needs, on demand.
Year(s) Of Engagement Activity 2018
 
Description CLIMB launch event 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact In July 2016 the CLIMB team held a launch event to introduce CLIMB (the project) and the bioinformatics service offered to UK academics. The audience ranged from principal investigators to postgraduate students. There was also participation from industrial partners and public health organisations.
This two-day event included a demonstration of how to sign up to and use CLIMB, and talks from researchers who have used CLIMB about how it has helped their research.
After the event, the number of groups signed up to use CLIMB increased significantly, with over 200 groups from institutes across the UK now signed up. Individuals regularly communicate on the CLIMB forum with advice and support on how to use CLIMB.
The service that CLIMB offers has enabled researchers in the UK to perform bioinformatics work that they were previously unable to do due to lack of compute resource and know-how. It has also been a useful aid to the teaching of undergraduate students. Below is a quote from one of the CLIMB users:

'Yesterday's practical went wonderfully. We started the day with a big blob of raw data and ended up with 12 fully assembled Staphylococcus genomes and nearly 90 happy (we hope) undergrad students. We didn't have a single issue with any of the VMs.

Massive thank you to you for setting this up! It is amazing you guys can offer this service to the community for free.'
Year(s) Of Engagement Activity 2016
URL http://www.climb.ac.uk/climb-launch/
 
Description CLIMB presented in Norway 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact A mixed academic audience attended the National Consortium for Microbial Genomics Meeting (Norwegian Institute of Public Health, Lovisenberggata 8, Oslo, Norway) on December 7, 2017. The MRC-CLIMB infrastructure and the applications of the cloud in population genomics of bacterial pathogens received a warm welcome from the audience. The attendees reported great interest in the infrastructure and recognized the value of the cloud system for microbial bioinformatics.
Year(s) Of Engagement Activity 2017
 
Description DEVELOPING PIPELINES FOR BACTERIAL EVOLUTIONARY GENOMICS 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact A workshop about the latest techniques for analysis of populations of bacterial genomes and how the CLIMB project can be used for this.
Year(s) Of Engagement Activity 2017
 
Description Dell-Intel understanding the challenges of HPC 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact A talk to a mixed audience about the CLIMB team's experience in building a cloud system.
Year(s) Of Engagement Activity 2016
 
Description EBAME seminar on "Emerging Bioinformatics Applications for Microbial Ecogenomics" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact A talk to a mixed audience on how bioinformatics and the CLIMB infrastructure can be used for studying Microbial Ecogenomics.
Year(s) Of Engagement Activity 2016
 
Description EVELIEN ADRIAENSSENS USED CLIMB FOR A WORKSHOP IN THE USA 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact CLIMB supports training activities organized by its users. In August 2019, CLIMB provided computational capacity for a bioinformatics workshop in Olympia, WA, USA, organized by Evelien Adriaenssens, from the Quadram Institute Bioscience (Norwich, UK) and Alejandro Reyes Muñoz from the Universidad de los Andes (Bogotá, Colombia).

Here is Evelien's description of the workshop:

"We used CLIMB for a bioinformatics workshop which was part of the Evergreen International Phage Biology Meeting at the Evergreen State College in Olympia, WA, USA (http://blogs.evergreen.edu/phage/about/meetings/2019-2/). The workshop was 4 hours long, repeated on the 3rd and 4th of August 2019.

The topic of the workshop was "Viromics" or viral metagenomics, co-organised by Alejandro Reyes Muñoz from the Universidad de los Andes, Bogotá, Colombia, and supported as teaching assistants by three of his graduate students.

We covered quality control, assembly, read mapping and exploration of the viral content of the metagenome using a mock published dataset, using state-of-the-art free software tools.

Because of support by CLIMB, we were able to give all participants a hands-on experience and they were able to go through the whole workflow themselves. Feedback has been very positive and we are looking to organise this workshop again in two years at the next Evergreen Phage Meeting.

Our guide document can be found here: https://tinyurl.com/evergreen-viromics".
Year(s) Of Engagement Activity 2019
URL https://www.climb.ac.uk/evelien-adriaenssens-used-climb-for-a-workshop-in-the-usa/
 
Description Engagement with Genomics Partnership Wales meeting with Bahraini delegation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Introduced pathogen genomics to a delegation from Bahrain who are planning to set up a genomics programme.
Year(s) Of Engagement Activity 2019
 
Description Genome Science talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact A talk about the CLIMB project: how the cloud was built and the service that the CLIMB project provides for the academic community.
Year(s) Of Engagement Activity 2015,2016
 
Description Hackathon: CLIMB, Big Data and Public Health Microbiology 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact A hackathon was held at Warwick University with representatives from academia and Public Health England. The objective was to investigate how analysis of Big Data can help inform public health microbiology. The CLIMB infrastructure was used for the hackathon and the CLIMB fellows were involved in teaching. Since the hackathon there has been further collaboration between CLIMB and PHE.
Year(s) Of Engagement Activity 2015
URL http://www.climb.ac.uk/hackathon-climb-big-data-and-public-health-microbiology/
 
Description Hackathon: Common Bacterial Genome Analyses 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact A workshop to develop the bioinformatics community and to look at Common Bacterial Genome Analyses using the CLIMB infrastructure. Led to ongoing connections and more people signing up to use the CLIMB service.
Year(s) Of Engagement Activity 2016
 
Description Impact showcase talk at Supercomputing, Denver 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact This talk introduced how HPC/Cloud computing is being used to derive insight from genomics to track and treat pathogens. The venue for the talk was the Supercomputing conference, one of the largest conferences for computer science in the world, with many of the audience being from industry and other areas. Further engagement with industry followed the talk.
Year(s) Of Engagement Activity 2019
 
Description Introduction to CLIMB for microbial bioinformatics - Leicester 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact The seminar was run by Andy Millard on February 16, 2018, at the University of Leicester. The primary goal was to introduce the system and to get the microbiologists at Leicester using the MRC-CLIMB infrastructure. The easy-to-use CLIMB system, together with the software it hosts, was well received, and access to CLIMB from Leicester has increased since then. We strive to provide a tool that researchers can use for free, that gives them all the assistance they need in running analyses and that satisfies the needs of microbiologists. Bioinformatics is a bottleneck for microbial research, because microbiologists often lack the tools and expertise to analyze and interpret their data. The MRC-CLIMB project aims to provide them with this expertise, and more microbiologists are using CLIMB after the seminar. Moreover, having more people using CLIMB helps us improve the system, as we want to meet the day-to-day needs of researchers.
Year(s) Of Engagement Activity 2018
 
Description Introduction to Microbial Bioinformatics Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Undergraduate students
Results and Impact The Heriot-Watt University and MRC-CLIMB Project organized the workshop "Introduction to Microbial Bioinformatics", which featured talks from Peter C. Morris, Leena Kerr, Mark J. Pallen and Sion Bayliss.
Hosts: Leena Kerr and Jennifer Pratscher - Heriot-Watt University, Edinburgh
Event date and time: May 13, 2019 - from 1pm to 5pm
Venue: Heriot-Watt University

Topics:
-Metagenomics for diagnosis and discovery
-Use of CLIMB
-Microbial bioinformatics on CLIMB
Year(s) Of Engagement Activity 2019
URL https://www.climb.ac.uk/introduction-to-microbial-bioinformatics-workshop/
 
Description Introductory Workshop on Microbial Community Bioinformatics 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact A workshop was held at Warwick by some of the CLIMB team to teach Microbial bioinformatics to postgraduate students.
Year(s) Of Engagement Activity 2016
 
Description Invited Talk at University of Notre Dame about CLIMB 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A talk about the CLIMB project: how the cloud was built and the service that the CLIMB project provides for the academic community.
Year(s) Of Engagement Activity 2016
 
Description MRC CLIMB workshop at the MRC Unit, Gambia 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Some of the CLIMB team ran an introductory bioinformatics course using CLIMB at the MRC Gambia Unit. This builds on a long-standing collaboration between researchers at Warwick and the MRC unit in the Gambia. The audience was a mix of students, postdocs and PIs, who had a greater understanding of bioinformatics and its importance after the course.
Year(s) Of Engagement Activity 2017
 
Description MRC-CLIMB workshop in The Gambia, 2018 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact A CLIMB delegation ran a bioinformatics workshop in The Gambia, in collaboration with the MRC Unit in Fajara (The Gambia), which hosted the course. One day (January 22) was dedicated to setting up the course (computer performance test, virtual machine, workshop simulation), then 21 students attended the three-day workshop (January 23-25) and were introduced to bioinformatics, cloud computing, Illumina and Nanopore sequencing, microbial genomics and metagenomics, phylogenetics and phylogenetic trees. The aims of the course were defined with Gambian tutors, and the activities were planned accordingly.
The success of the workshop is well documented in a survey completed by participants at the end of the course: the quality of the presentations, the assistance during the workshop and the relevance to their research were rated very good or excellent. The MRC-CLIMB Unit in The Gambia is interested in making it an annual event.
Year(s) Of Engagement Activity 2018
 
Description Microbial Data Analysis Hands-on Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Undergraduate students
Results and Impact This Bioinformatics Hands-on Workshop on the MRC Cloud Infrastructure for Microbial Bioinformatics was hosted by Leena Kerr and Jennifer Pratscher from Heriot-Watt University (Edinburgh).

Target: all those who wanted to gain insight into the Linux command line, the basics of genome assembly, variant calling and tree building. The course consisted of theory and hands-on sessions.

When and where: May 14th, 2019 at William Arrol Building (Heriot-Watt University, Edinburgh)
Year(s) Of Engagement Activity 2019
URL https://www.climb.ac.uk/microbial-data-analysis-workshop/
 
Description PHW Antimicrobial Stewardship meeting 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact Participation in a workshop about antimicrobials. Discussed how genome analysis using the CLIMB system could contribute to their development. This led to ongoing connections and increased interest in the CLIMB service.
Year(s) Of Engagement Activity 2016
 
Description Presentation at OpenStack Summit 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact One of the CLIMB investigators presented at a compute infrastructure meeting in Texas. The audience were keen to hear and learn about the experience the CLIMB team had putting together a cloud using OpenStack (cloud computing software). Ongoing connections were made at the meeting.
Year(s) Of Engagement Activity 2016
 
Description Presentation to Welsh Assembly Committee
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact This talk introduced pathogen genomics, and how it is enabled, to a Welsh Assembly Committee, stimulating increased interest in this subject area and informing the committee on the uses of computational resources for the processing of pathogen genomic data.
Year(s) Of Engagement Activity 2019
 
Description Press release about our project infrastructure 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact A press release in a publication read by both academia and industry about the CLIMB team's experience of building the CLIMB infrastructure and the service that it will provide to UK academia.
Year(s) Of Engagement Activity 2015
URL http://edtechnology.co.uk/Article/he-collaborates-for-cloud-hpc-system
 
Description Press release about our project infrastructure in an IT journal 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact A press release in a publication read by both academia and industry about the CLIMB team's experience of building the CLIMB infrastructure and the service that it will provide to UK academia.
Year(s) Of Engagement Activity 2015
URL http://primeurmagazine.com/weekly/AE-PR-11-15-25.html
 
Description Public Health Wales Genomic Epidemiology workshop
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact A workshop bringing together people from across Public Health Wales, including epidemiology and lab-based researchers, to discuss the implications and limitations of genomics for health protection.
Year(s) Of Engagement Activity 2016
 
Description Public Health Wales Genomics awareness sessions 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Public Health Wales Genomics awareness sessions including coverage of the software/computational challenges and a brief introduction to CLIMB. Led to further collaborations and more people signing up to use CLIMB.
Year(s) Of Engagement Activity 2016
 
Description RedHat White Paper on CLIMB 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact A talk to a mixed audience about the CLIMB team's experience in building a cloud system.
Year(s) Of Engagement Activity 2016
 
Description Strategies and Techniques for Analyzing Microbial Population Structure 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact A talk for a mixed academic audience looking at how bioinformatics and the CLIMB infrastructure can be used to analyse microbial population structure.
Year(s) Of Engagement Activity 2016
 
Description Talk at BMFZ meeting in Düsseldorf: Genomics and metagenomics in medical microbiology: opportunities and challenges 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact A talk to a mixed academic audience about how the CLIMB infrastructure can be used for analysing genomics and metagenomics in medical microbiology. This led to requests for a CLIMB account.
Year(s) Of Engagement Activity 2015
URL https://www.youtube.com/watch?v=aHY-LKvI0xs
 
Description Talk at BioData World Congress 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact A talk to a mixed audience (academia and industry) on the CLIMB team's experience of building the cloud infrastructure.
Year(s) Of Engagement Activity 2015
 
Description Talk at Doherty Institute, Melbourne, Australia 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact A talk about the CLIMB project: how the cloud was built and the service that the CLIMB project provides for the academic community.
Year(s) Of Engagement Activity 2016
 
Description Talk at NHS meeting on pathogen genomics 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A talk introducing bioinformatics for the analysis of pathogen genomic sequences in healthcare.
Year(s) Of Engagement Activity 2019
 
Description Talk at STFC Computing Insight UK 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact A talk to a mixed audience (academia and industry) on the CLIMB team's experience of building the cloud infrastructure.
Year(s) Of Engagement Activity 2015
URL http://www.stfc.ac.uk/news-events-and-publications/events/computing-insight-uk-2015/
 
Description Talk at the Universities and Colleges Information Systems Association (UCISA) Infrastructure Group meeting in Oxford 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact A talk and discussion about the CLIMB infrastructure.
Year(s) Of Engagement Activity 2015
 
Description Talk for Dell: Accelerating Understanding in HPC
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact A talk to a mixed audience about the CLIMB team's experience in building a cloud system.
Year(s) Of Engagement Activity 2016
 
Description Talk: Applied Bioinformatics and Public Health 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact A talk and discussion about how CLIMB can be used for bioinformatics, and how public health can use big data analysis to inform new policies.
Year(s) Of Engagement Activity 2015
 
Description Talk: RedHat Storage Seminar 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact A talk to a mixed audience about the CLIMB team's experience in building a cloud system.
Year(s) Of Engagement Activity 2017
 
Description Teaching Bacterial Genomics as part of MRes Data Handling and Statistics module 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Undergraduate students
Results and Impact CLIMB was used to teach undergraduate students in Cardiff about bacterial genomics.
Year(s) Of Engagement Activity 2015,2016
 
Description Teaching Introduction to Bioinformatics as part of integrated MBiol 3rd year project module 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Undergraduate students
Results and Impact CLIMB was used to teach undergraduate students in Cardiff about bioinformatics
Year(s) Of Engagement Activity 2016
 
Description Work with Public Health Wales to develop a genomics service based upon virtualisation approaches 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Work with Public Health Wales to develop a genomics service based upon virtualisation approaches using the CLIMB infrastructure
Year(s) Of Engagement Activity 2016,2017
 
Description Workshop "ESTABLISHING MICROBIAL SEQUENCING AND BIOINFORMATICS CAPACITY IN CHALLENGING ENVIRONMENTS" 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact WHEN AND WHERE
MRC Unit in The Gambia at LSHTM - Fajara
January 14-17, 2020

Hosts and Organisers
This workshop was coordinated by Mark Pallen (Quadram Institute Bioscience and MRC-CLIMB) and Martin Antonio (MRC Unit in The Gambia and LSHTM - The London School of Hygiene & Tropical Medicine). It was supported by the MRC Unit in The Gambia, the Quadram Institute Bioscience, the University of Surrey and the MRC-CLIMB Project.

Rationale
High-throughput sequencing is transforming microbiology across the world in academic and clinical settings. However, the capacity to perform microbial sequencing and bioinformatics is distributed unevenly, with many groups working in challenging environments finding it hard to get started or facing tough questions as to what to do locally and what to outsource.
This workshop brought together microbiologists and bioinformaticians from across Europe, Asia and Africa to address these problems, while also exchanging practical know-how and building collaborative networks.
Year(s) Of Engagement Activity 2020
URL https://www.climb.ac.uk/report-on-establishing-microbial-sequencing-and-bioinformatics-capacity-in-c...
 
Description Workshop at the Society for Applied Microbiology 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Postgraduate students
Results and Impact A bioinformatics workshop was run at the Society for Applied Microbiology during the 6th ECS research symposium (University of Westminster, 19 April 2017). After a CLIMB demo, attendees used the infrastructure to perform fast analysis. CLIMB gained new users during the symposium, and more new accounts were created after the workshop.
Year(s) Of Engagement Activity 2017
 
Description Workshop on WGS in healthcare 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I spoke at, and engaged with, a workshop organised at RIVM in the Netherlands to introduce the pathogen genomics work that is going on in Wales. The engagement provoked detailed discussions following the event, with both researchers and clinical staff.
Year(s) Of Engagement Activity 2020
 
Description Training school on (un)targeted metagenome analysis
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact A workshop for a mixed academic audience looking at how bioinformatics and the CLIMB infrastructure can be used for studying metagenomics
Year(s) Of Engagement Activity 2016