Microbial succession from ice to vegetated soils in response to glacial retreat

Lead Research Organisation: University of Bristol
Department Name: Geographical Sciences


When glaciers retreat, their forefields present a unique opportunity to investigate the initial phases of soil formation and microbial succession. As the ice retreats, leaving space for microbial and plant colonisation, some studies show evidence of an increase in a variety of microbial proxies, such as nitrogen fixation, microbial enzymatic activity and diversity, in relation to years of exposure until a certain soil stability is reached. Surprisingly, very little is known regarding the genetic and functional diversity of microbes in Arctic habitats. The composition and the metabolic potential of the entire microbial population can be explored by isolating and characterising genetic material recovered directly from the environment using a metagenomic approach. Each sample of a soil habitat analysed represents a snapshot of a complex mixture of different microbial types, some of which will be much more abundant than others. For instance, we predict in this project that genes associated with phototrophic C and N fixation and aerobic C metabolism will be predominant at the initial stages of succession in soil after glacial retreat, while deeper soil samples will provide conditions for anaerobic C and N metabolism to develop, including the production and consumption of methane, which is a very powerful greenhouse gas. The metagenomic approach can be further linked to rates of metabolism and geochemical characteristics of soils; many of those factors have strong feedbacks with each other. There have been few integrated studies which link microbial diversity to ecosystem function and the biogeochemical cycling of key elements (C, N, Fe). This proposal aims to employ such an integrated approach to generate new and unique datasets of genetic and functional diversity of representative terrestrial Arctic habitats.
The project will instigate a step jump in our understanding of metabolic pathways of terrestrial Arctic habitats to improve biogeochemical models and quantification of the full metabolic package during successional events in soils after glacial retreat. The forefields of two glaciers (one in Svalbard and one in Greenland, which represent one small polar system and one major ice sheet, respectively) will be chosen for this project because they provide a range of forefield habitats of different sizes, locations, vegetation and availability of water surrounding the system. Samples for the metagenomic analyses will be taken from representative soils of different ages of exposure after glacial retreat. We aim to generate several orders of magnitude more primary sequence data than existing metagenome pipelines were originally designed to deal with. This sampling strategy will give us a high-resolution picture of the microbial genetic and metabolic diversity associated with key elements (e.g., C, N, Fe) of glacial forefield habitats, also allowing us to predict changes in metabolic pathways and biogeochemical cycles in response to glacial retreat. The project will instigate a step jump in our understanding of the biodiversity of glacial Arctic terrestrial habitats and provide a database that may be used to interpret data recovered during future. This will ultimately give us valuable insights into the potential for life on other icy planets and moons and during the so-called Snowball Earth. Data generated in this proposal can be incorporated into models of carbon, nitrogen, iron and sulphur cycling.

Planned Impact

Our main objectives are 1) to create a unique database of Arctic soil metagenomes that will be publicly available towards the end of the project and 2) to produce a short documentary on Arctic research aimed at primary school pupils.

The database will provide a user-friendly environment in which to explore the sequences obtained in this project, with links to similar projects. In addition to submitting sequence data to the SRA at EMBL, we will add further contextual value by developing customized on-line data interrogation tools to enable researchers to fully mine the rich dataset developed. Further, we will also create a section in the database that links the genetic and metabolic data with modelling of successional events within newly exposed soils after glacial retreat. A further section will contain microscopic images of the autotrophic composition of our samples, and another will be addressed to school-age children (secondary school). This section will contain stories and descriptions of the main microbial processes in Arctic soils and their significance for climate and the biogeochemistry of these habitats. The Co-PI in the project, Gary Barker, has a strong track record of developing such databases.

We will also create an educational video that engages and entertains primary school pupils while revealing a great deal about how mankind is affecting our environment. We will take advantage of a wealth of footage from our previous project, which was aimed at A-level pupils, to add value to the video produced by this project. A geography/science school consultant will also participate in this project to ensure the new film is in line with the national curriculum so that it gets as wide a viewing as possible.


Description Glaciers are receding rapidly all over the Northern Hemisphere, creating new terrestrial habitats in the form of freshly exposed soils. Those habitats have a long history of being used as model systems to study initial stages of succession, particularly in the Alps, where plant cover develops just a few years after deglaciation. On the other hand, in the High Arctic forefields, plant life establishes itself very slowly and a visible dense plant cover does not develop until after 1000 years of
Exploitation Route This knowledge will be useful for areas such as agriculture (soil formation), astrobiology and climate change.
Sectors Aerospace, Defence and Marine; Agriculture, Food and Drink; Creative Economy; Education; Environment

Description We demonstrated the principles of microbial colonisation of ice and soils at the Science Museum Lates event in London in June 2015.
First Year Of Impact 2015
Sector Education
Impact Types Cultural

Description H2020-MSCA-ITN-2015
Amount € 3,897,006 (EUR)
Funding ID 675546 
Organisation European Commission H2020 
Sector Public
Country Belgium
Start 04/2016 
End 03/2020
Title Bioinformatics pipeline 
Description The aim of the bioinformatics pipeline is to provide a reliable tool that assigns taxonomy and gene abundance, at both community and taxon level, to metagenomics data. The approach relies on assembling all the fastq file reads (the short DNA sequences output by the sequencing technology) to create longer sequences (contigs) on which all DNA sequences can be mapped and quantified. We have written several scripts that end users will be able to download from GitHub. Users will be able to run all the scripts on their own server. Whereas steps 1 and 2 need to be executed on the user's server, steps 3, 4, 5 and 6 have been optimized to be used on the Amazon server, as they involve interactive visualisation and screening of taxonomic and functional profiles at community and taxon level. Therefore, the user can upload the output files of steps 1 and 2 and visualise the data on the web interface, programmed with Shiny and JavaScript. Below is a description of the scripts that will be available on GitHub (1, 2, 3, 4, 5 and 6) and running on the Amazon server (3, 4, 5 and 6):
1) script1.pl uses the bwa algorithm to map the fastq file reads from all the samples to the contigs. In the script all the parameters are already set and optimized for this kind of data, but the user can change some of them.
2) script2.pl uses a Diamond blastx search to align all the contigs against a non-redundant protein database (UniRef100). The output file reports only proteins that aligned to a contig with a bit score above, and an e-value below, a threshold. All the parameters are pre-set but the user can change them.
3) script3.pl assigns protein coding regions to each contig. It parses the output of script2.pl and looks for non-overlapping proteins on each contig. Whenever more than one protein matches the same contig region, only the protein with the highest bit score is kept.
4) script4.pl assigns taxonomy to each contig using the Lowest Common Ancestor (LCA) method. The LCA is determined by looking at the taxon associated with each protein that has been assigned to the contig. The LCA assignment is also weighted by the bit score of each alignment.
5) Once taxon (script4.pl) and coding regions (script3.pl) have been assigned to each contig, script5.pl uses this information and the output files of script1.pl to attribute taxon and gene relative abundances to all the samples. The relative abundances are calculated in reads per kilobase (RPK) and are weighted by the proportion of mapped reads in each sample.
6) script6.pl detects SNPs (Single Nucleotide Polymorphisms) in different samples. The output files report synonymous and non-synonymous SNPs, amino acid and codon substitutions, deletions and insertions.
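The logic of steps 3, 4 and 5 above can be illustrated with a minimal Python sketch. This is not the project's Perl code: the function names, the tuple layouts, the greedy overlap rule and the `min_fraction` support threshold are all assumptions made for illustration, since the description does not say how overlaps are resolved or how the bit-score weighting of the LCA is applied. The per-sample weighting of RPK by the proportion of mapped reads is not described in enough detail to reproduce, so only the plain RPK calculation is shown.

```python
from collections import defaultdict


def select_non_overlapping(hits):
    """Step 3 (assumed greedy rule): visit hits from highest to lowest bit
    score and keep a hit only if it overlaps no hit already kept.
    Each hit is (start, end, bit_score, protein_id), inclusive coordinates."""
    kept = []
    for hit in sorted(hits, key=lambda h: h[2], reverse=True):
        start, end = hit[0], hit[1]
        if all(end < k[0] or start > k[1] for k in kept):
            kept.append(hit)
    return sorted(kept, key=lambda h: h[0])  # report in contig order


def weighted_lca(lineages, min_fraction=0.5):
    """Step 4 (assumed weighting): return the deepest lineage prefix backed
    by more than min_fraction of the summed bit score.
    lineages is a list of (root_to_leaf_taxa, bit_score); min_fraction is a
    hypothetical parameter and should lie in [0.5, 1) so the answer is unique."""
    total = sum(score for _, score in lineages)
    if total <= 0:
        return []
    support = defaultdict(float)
    for taxa, score in lineages:
        for depth in range(1, len(taxa) + 1):
            support[tuple(taxa[:depth])] += score
    best = ()
    for prefix, score in support.items():
        if score / total > min_fraction and len(prefix) > len(best):
            best = prefix
    return list(best)


def rpk(read_count, length_bp):
    """Step 5: reads per kilobase of a gene or contig region."""
    return read_count / (length_bp / 1000.0)


if __name__ == "__main__":
    # Made-up example values, not project data.
    hits = [(1, 300, 250.0, "A"), (200, 500, 180.0, "B"), (600, 900, 90.0, "C")]
    print(select_non_overlapping(hits))
    lineages = [
        (["Bacteria", "Cyanobacteria", "Nostoc"], 200.0),
        (["Bacteria", "Cyanobacteria", "Anabaena"], 150.0),
        (["Bacteria", "Proteobacteria"], 50.0),
    ]
    print(weighted_lca(lineages))
    print(rpk(150, 1500))
```

In the made-up example, hit B is dropped because it overlaps the higher-scoring hit A, and the contig is assigned to Cyanobacteria because neither genus alone carries more than half of the total bit score.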
Type Of Material Biological samples 
Year Produced 2018 
Provided To Others? No  
Impact As soon as the paper is published, this will become available to the community and its impact will then be evaluated.
Title Glacial forefield metagenomes 
Description The metagenomes produced by this project will soon become available to the public. They are currently stored on the Joint Genome Institute server and MG-RAST.
Type Of Material Database/Collection of data 
Year Produced 2017 
Provided To Others? No  
Impact As soon as the papers are published, this will become available to others and its impact can then be evaluated.
Description Bioinformatics 
Organisation U.S. Department of Energy
Department Joint Genome Institute
Country United States 
Sector Public 
PI Contribution Metagenomic data.
Collaborator Contribution Bioinformatics service. Assembly of metagenomic data.
Impact Ongoing manuscript preparation.
Start Year 2016