BBSRC Institute Strategic Programme: Decoding Biodiversity (DECODE) - Partner Grant

Lead Research Organisation: UK CENTRE FOR ECOLOGY & HYDROLOGY
Department Name: Soils and Land Use (Wallingford)

Abstract

The start of the 21st century saw the landmark publication of the human genome, which changed the way we do biology and had a huge impact on medicine. This heralded a new era of genomics that was initially dominated by the generation and analysis of genomes of model organisms and more economically important species. Concurrently, genome technologies have enabled advances in microbiology, such as disentangling complex communities or, as seen during the pandemic, identifying newly emerging variants of SARS-CoV-2. These rapid advances have been driven by innovation in high-throughput sequencing technologies and in software to assemble and analyse genomes. Recently, step changes in these areas have enabled the generation of high-quality genomes at scale, making feasible ambitious projects like the Earth BioGenome Project, whose goal is to generate genomes for all eukaryotic life. Furthermore, rather than being limited to a single genome per species, it is now possible to generate multiple genomes, helping to capture the diversity of the species. However, the scale and complexity of these genomic data present an analytical challenge, and there is a pressing need across the public and private sectors (our stakeholders) for tools, expertise and capacity to translate genomes and long-read technologies into discoveries. The outputs of the Decoding Biodiversity (DECODE) research programme will address this need, and will contribute to the BBSRC Transformative Technologies theme and to the government prioritisation of investment and innovation in genomics and bioinformatics (UK Innovation Strategy).

DECODE brings together expertise in computational biology, mathematics and genomics. It builds on innovations from our previous core strategic programme "Genomics for Food Security", the cross Institute Strategic Programme (ISP) "Designing Future Wheat", and the Quadram ISP "Gut Microbes and Health". In addition, it draws on the experience and networks gained through the research capacity-building programme "Grow Colombia", and as a partner in the Darwin Tree of Life (DToL) consortium. DECODE is delivered through three interconnected work packages:
Work package 1 will develop tools and techniques to investigate biodiversity. Specifically, this includes developing methods for: comparing multiple genomes within and across species to identify structural changes; using multiple genomes to improve annotation of coding and regulatory regions in the genome; resolving the complexity of bacterial communities and the biological roles within those communities; and deploying sequences as real-time sensors of environmental communities. With our partners IBM and Eagle Genomics, we will ensure that the software and workflows developed are robust, deployable and scalable.
Work package 2 will use the tools developed in WP1 to investigate biodiversity in publicly available genomes. We will use multiple analytical approaches to: assign function to genomic "dark matter" (genes of currently unknown function); investigate mechanisms underpinning chemical diversity in plants; and identify mechanisms driving genetic diversity in key agricultural crops and aquaculture species.
Work package 3 will use long-read sequencing technologies and the tools developed in WP1 to uncover and explore biodiversity. Specifically, it will investigate how community structures change over time in increasingly complex systems (the gut, anaerobic digesters and soil). Furthermore, by quantifying changes in gene content, WP3 will aim to identify how biological functions change in a community and link these changes to community health.

To deliver this programme, we have established four key strategic partnerships: RBG Kew will provide expertise in plant metabolism, pangenomics and crop wild relatives; IBERS brings expertise in UK orphan crops; the UK Centre for Ecology & Hydrology will provide soil samples and access to contextual datasets; and IBM Research will support deployment and scalability of tools.

Technical Summary

This project represents UKCEH's contribution to the delivery of the following Institute Strategic Programme Grant: Decoding Biodiversity (DECODE), BB/X011089/1.

Soil is probably the most diverse microbiome on the planet. It has been estimated that a single gram of soil may harbour up to 50,000 different prokaryotic and eukaryotic species, many of which will be present as multiple strains (representing currently undescribed genetic diversity and functionality). Resolving genomes associated with individual species and strains from metagenomes at this scale has hitherto proved impossible. We will address this challenge, which is vital given the critical role these microbes play in making nutrients, such as nitrogen and phosphorus, available to plants. Without a representative collection of soil genomes, we cannot understand how these metabolic processes are partitioned within the community and, hence, the functional relevance of community change following land management or climate changes.

We will link with ongoing soil health and greenhouse gas flux work at UKCEH to provide a comprehensive soil Metagenome Assembled Genome (MAG) collection, reveal the microbial genomic diversity of soil for the first time, identify coupled taxonomic and functional indicators of land management, and inform researchers as to the best methods for profiling soil microbial communities to inform measurements of soil health. This work will improve the ability to determine soil health, allowing farmers to make better decisions on land management and policymakers to better assess management practices. To ensure these outcomes, we will coordinate closely with our partner institution UKCEH through meetings, placements, shared supervision and workshops. In particular, we will coordinate a series of soil metagenome bioinformatics workshops between EI, Rothamsted Research, QIB and UKCEH to develop best practices in methodology, algorithms and analysis.
