Clone wars in niche space: exploring the evolutionary and genetic basis for bacterial species.

Lead Research Organisation: University of Exeter
Department Name: Biosciences

Abstract

Bacteria reproduce clonally and lack the homogenizing influence of eukaryotic sex. Nevertheless, bacterial genetic variation is not haphazardly distributed. Bacteria commonly form coherent clades with overlapping ecological niches in the same manner as sexual species (Figure 1). Therefore, although key adaptive traits (secondary metabolites, virulence factors, etc.) are often associated with the accessory genome, genetic variation in essential metabolic genes produces ecologically coherent clades. This leads to questions including: what drives the formation of these distinct genetic clusters? Why does genetic variation in core or essential genes (which are not obviously related to niche) predict ecological association? In this project we will evaluate the competing hypotheses that (a) genetic divergence follows neutral accumulation of alleles after selective sweeps and/or (b) core genomic variation is shaped by adaptation, either because different allelic variants are favoured in different habitats or because epistatic interactions within the core and accessory genome favour particular allele or gene combinations. In other words, key adaptations to exploit new niches may determine which mutations are favoured in the core genome.
Research will focus on the Bacillus cereus group, which contains bacteria with significant importance for human and animal health (B. cereus, B. anthracis) and bacteria with significant economic importance in pest management (Bacillus thuringiensis). Clades in this group are well characterized, ecologically distinct and can have characteristic thermal biology. Importantly, distinct clades have similar synteny and very similar core genomes. This project will involve bioinformatics, field ecology, phenotype characterization, the application of experimental evolution and genomics approaches, as well as CRISPR-Cas9 genome editing to investigate how distinct niches shape core and accessory genetic variation.
We will directly compare the relative importance of the neutral and adaptive models. In addition to furthering the ecological and population genomic characterization of this group, we will use experimental evolution to test adaptive hypotheses directly. We will take replicate isolate pairs from two of the best ecologically characterized clades (clade 2, the mesotolerant insect pathogenic clade, and clade 3, the cold-adapted rhizosphere clade) and evolve these isolates in conditions simulating their own niche and an atypical niche (plant root derived media or insect hosts) at cold (12°C) or warm (32°C) temperatures. Under our adaptive hypothesis, strains in novel habitats are predicted to acquire mutations in essential genes that may provide advantages in their new habitats, and that may correspond to clade-specific alleles in the core genome of the isolates associated with the new niche.

Publications

 
Description In the work funded through this award, we aimed to determine whether genetic divergence is due to neutral evolutionary processes following selective sweeps, or whether adaptation drove divergence by acting on variation in bacterial core genomes. To answer this, we looked at patterns of functional enrichment within the core and accessory genomes of three distinct "clades" within the Bacillus cereus sensu lato bacterial group. We focussed on genes that were likely to have undergone selection (i.e. genes that were very diverse or very conserved compared to the core genome average). The clades were clearly differentiated from one another, with large genetic distances between them and distinct ecological niches. Because different biological functions were enriched within genes under selection when comparing i) different clades and ii) the core and accessory genomes, we interpreted these results to mean that different selective pressures were acting on both the core and accessory genomes of each clade, leading to ecological divergence in both cases, and that adaptation, both ancient and ongoing, drove the divergence of these clades from a common ancestor.

Having found evidence suggesting adaptation drove divergence by acting on bacterial core genomes, we then aimed to determine if i) this was because different allelic variants are favoured in different habitats or ii) because epistatic interactions within the core and accessory genome favour particular allele or gene combinations. To do this, we had to select a phenotypic trait that varied between different habitats and was linked to a consistent genetic signature. The trait we chose was thermal niche, which has been found to be closely associated with clade within the Bacillus cereus sensu lato group. We experimentally confirmed, through competition experiments, that strains from two different clades differed in their thermal niche: strains from one clade grew well at low temperatures (but not at high temperatures), while strains from the other clade performed best at 30°C but poorly at 15°C. We hereafter refer to these clades as the "psychrotolerant" and "mesophilic" clades respectively.

We used two different approaches to determine whether epistatic interactions favoured particular gene or allele combinations. First, we tested the ability of "psychrotolerant" and "mesophilic" strains to adapt to temperature conditions in which they showed high and low relative fitness compared to the ancestor by subjecting replicate lineages to selection at 15°C and 30°C for over 700 generations. By competing evolved lineages against their ancestors, we expected to find that strains adapted most rapidly to novel thermal conditions; however, we found that strains adapted most rapidly to conditions in which they already performed well, with "psychrotolerant" strains showing greater fitness gains at 15°C than at 30°C following selection and "mesophilic" strains showing greater fitness gains at 30°C than at 15°C following selection. This suggested to us that adaptation to novel environmental conditions may be heavily influenced by genomic background through epistatic interactions between "thermal niche" genes and the strain genome.

Second, we aim to confirm the importance of genomic background in determining the ability to adapt to new temperature conditions by conducting a gene "knockout" and "knock-in". So far, we have deactivated a gene linked to thermal niche in a mesophilic strain to create a "knockout mutant", and complemented it with a working copy of the same gene from a psychrotolerant strain to create a "knock-in mutant". Interestingly, inactivating the gene in mesophilic strains seems to improve fitness at 30°C and reduce fitness at 15°C, contrary to expectations. This seems to be further evidence that the effect of the thermal niche gene is heavily influenced by interactions with the genomic background. The final experiments to explore this possibility will involve repeating the experimental evolution described above with the knockout and knock-in mutants, and determining the fitness changes of the strains involved.
Exploitation Route The outcomes of this funding can be used in two key ways. First, they will contribute to the rich literature on defining a species concept in bacteria. Our bioinformatics analysis showed signatures of functional enrichment amongst genes experiencing selection in the Bacillus cereus sensu lato group, and suggests patterns of enrichment that might characterise clades. Classification can be based on differences in the numbers of genes from the different GO categories across the genomes, rather than on the presence of a single gene. By using signatures of functional enrichment to define species, microbiologists and public health workers may find it easier to define useful bacterial species and place strains within them.

Second, the results of the competition and selection experiments provide insight into the ease with which previously harmless strains may become harmful. If pathogenicity is determined not only by the presence of a given gene within a strain, but also by the genome in which the gene is found, it suggests that pathogenicity may not be easily acquired or retained, despite horizontal gene transfer. This work contributes to efforts to assess the danger posed by Bt strains used as biopesticide sources, and may help resolve the conflict about the pathogenic potential of these strains.
Sectors Agriculture, Food and Drink; Environment; Healthcare

URL https://academic.oup.com/femsec/article-abstract/97/1/fiaa228/5974271
 
Description My project involved a bioinformatics section, in which I developed methods of detecting horizontal gene transfer events and genes under selection within the pan-genomes of large numbers of strains. This experience and methodology were applied during an internship that I conducted as a condition of the funding for this award. I worked for Deep Branch Biotechnology, a start-up biotech company founded by alumni of the University of Nottingham. The company utilises carbon dioxide as a carbon source to grow bacteria; these bacteria are then used as protein-rich feed for livestock and fisheries. As a technical intern, I was involved with the research and development department of Deep Branch Biotechnology. The primary objective of this internship was to provide a pangenomic analysis of a bacterial genus. The secondary objective was to use data from the initial analysis to correlate microbial physiology with genetic characteristics, and to determine the extent and mode of horizontal gene transfer between the species in the genus and the degree of horizontal gene transfer into the genus. Using my findings and experience from my research, I developed a pipeline that compares a large number of bacterial genomes and identifies key genetic differences between the strains. The pipeline allowed creation of group pan-genomes, and allowed a user to examine the presence and absence of proteins across strains and link these to phenotypic variation in each strain. The extent of horizontal gene transfer between strains could also be inferred using this pipeline. This work contributes to ongoing strain improvement by Deep Branch Biotechnology. Because these bacteria are used as protein-rich feed for livestock and fisheries, and sequester carbon that would otherwise be released into the atmosphere, my work will contribute to improving food yields, as well as reducing atmospheric carbon to help combat climate change.
First Year Of Impact 2021
Sector Agriculture, Food and Drink
Impact Types Economic

 
Title Methodology for identifying genes under selection within bacterial clade pan-genomes 
Description Genes under selection show different levels of allelic diversity compared to the pan-genome average. Genes under purifying selection have low allelic diversity compared to the average, while genes under diversifying selection have high allelic diversity compared to the average. Using the output from PIRATE, we used the standard deviation of allelic diversity as a cut-off value to identify genes under significant selection within the pan-genome of a bacterial group. This method requires only a spreadsheet application (such as Microsoft Excel) and the output from PIRATE, unlike other methods of determining selection strength, which require more extensive datasets and programming. 
Type Of Material Biological samples 
Year Produced 2022 
Provided To Others? Yes  
Impact This genomic analysis can help us quickly identify the selection pressures that drive the formation of bacterial groups, and may help the ongoing discussion surrounding how we define bacterial species. 
 
Title Analysis of selection within a pan-genome 
Description We used allelic diversity data from the Bacillus cereus pan-genome dataset previously described to determine the strength of selection on different genes by comparing the allelic diversity of individual genes to the genome average. Standard deviation was used to identify genes under selection (as about 95% of gene values are expected to lie within 2 standard deviations of the mean under a Gaussian distribution). Genes with an allelic diversity more than two standard deviations below the average were considered to be under purifying selection, while genes with an allelic diversity more than two standard deviations above the average were considered to be under diversifying selection. 
Type Of Material Data analysis technique 
Year Produced 2019 
Provided To Others? No  
Impact See pan-genome info 
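
The two-standard-deviation cut-off described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the spreadsheet workflow actually used in the project: the function name classify_genes, the labels it returns, the choice of the sample standard deviation, and the example values are all hypothetical, and the input is assumed to be a per-gene allelic diversity figure that has already been extracted from the PIRATE output.

```python
from statistics import mean, stdev


def classify_genes(diversity, n_sd=2.0):
    """Label each gene by how far its allelic diversity sits from the average.

    diversity: dict mapping a gene name to its allelic diversity value.
    n_sd: number of standard deviations used as the cut-off (2 in the text).
    """
    values = list(diversity.values())
    average = mean(values)
    spread = stdev(values)  # sample standard deviation (an assumption)
    lower = average - n_sd * spread
    upper = average + n_sd * spread

    labels = {}
    for gene, value in diversity.items():
        if value < lower:
            labels[gene] = "purifying"      # unusually conserved
        elif value > upper:
            labels[gene] = "diversifying"   # unusually diverse
        else:
            labels[gene] = "no signal"
    return labels


# Hypothetical allelic diversity values per gene (not real data).
example = {f"gene_{i:02d}": v for i, v in enumerate([9, 10, 11] * 6)}
example["gene_low"] = 1
example["gene_high"] = 19

labels = classify_genes(example)
for gene in ("gene_low", "gene_high", "gene_00"):
    print(gene, labels[gene])
```

Because the cut-off is relative to the spread of the whole dataset, a few extreme genes widen the thresholds for every other gene, so the number of genes flagged depends on which set of genes (for example core or accessory) is supplied as the reference.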
 
Title Bacillus cereus pan-genome 
Description Pan-genome information of 328 Bacillus cereus strains, classified according to MLST clade, including allelic diversity information. 
Type Of Material Data analysis technique 
Year Produced 2019 
Provided To Others? No  
Impact A paper has been submitted to Molecular Ecology which uses this pan-genome information 
 
Title Fitness data of Bacillus strains at different temperatures 
Description Data concerning the growth dynamics of two Bacillus strains at 15°C and 30°C respectively. Additionally, fitness data of two Bacillus strains, and rifampicin-resistant mutants of the same, compared to an ancestral strain at two different temperatures. Additionally, fitness data of twenty evolved lines of Bacillus strains, wildtype and rifampicin-resistant. 
Type Of Material Data analysis technique 
Year Produced 2022 
Provided To Others? No  
Impact Too early to say 
 
Title Whole genome assembly of Bacillus mycoides ST353 
Description A whole genome assembly of Bacillus mycoides ST353 (the first of its kind that the author knows of). Assembled by combining short-read Illumina data with long-read Nanopore data, and quality assessed using QUAST and BUSCO. 
Type Of Material Data analysis technique 
Year Produced 2022 
Provided To Others? No  
Impact Too early to say 
 
Description Collaboration with Micalis Institute to produce cold-shock gene knockout mutants 
Organisation Micalis Institute
Country France 
Sector Public 
PI Contribution As part of my work, I aimed to knock out (deactivate) a cold-shock gene in the Bacillus thuringiensis strain 4D7. We also aimed to knock in (add) a different variant of the same gene back into the knockout mutant. My PI, Professor Ben Raymond, and Dr Leyla Slamti of the Micalis Institute have cooperated previously, and we worked together to produce these knockouts. Our contribution to this project was as follows. We acquired a knockout construct that we would use to replace the cold-shock gene in Bt 4D7. Using the plasmid pMAD provided by the Micalis Institute, we created a knockout plasmid (pMAD-AphAIII) that we inserted into a methylating and then a non-methylating E. coli in preparation for insertion into Bt. We followed a similar procedure to insert a new cold-shock gene variant (CspA) into the plasmid pHT304, producing the plasmid pHT304-CspA, which was then introduced into a methylating and then a non-methylating E. coli. We also carried out the final procedure on the Bt strain containing the pMAD-AphAIII plasmid to create a knockout mutant while removing the plasmid.
Collaborator Contribution The Micalis Institute provided the necessary shuttle vectors (pMAD and pHT304). They also conducted the final transformation stage, to move the pMAD-AphAIII plasmid and pHT304-CspA plasmid into the Bt 4D7 strain.
Impact
- Production of Bacillus thuringiensis mutants with a deactivated cold-shock gene, and complemented with a novel variant of that gene from a cold-tolerant strain (please see key findings).
- Fitness data relating to deactivation of the cold-shock gene (please see key findings).
Start Year 2021
 
Description Collaboration with the University of Bath 
Organisation University of Bath
Department Department of Biology and Biochemistry
Country United Kingdom 
Sector Academic/University 
PI Contribution We aimed to explore how selection works on different parts of bacterial pan-genomes to drive and maintain the emergence of clades. To do this, I worked at the Milner Centre at the University of Bath with Professor Samuel Sheppard's lab. With help from lab members, I conceptualised the method, performed the research on the derived pan-genome, conducted the analyses and was the primary writer for the resulting manuscript.
Collaborator Contribution Dr Sion Bayliss of the Sheppard Lab ran the PIRATE pipeline required to produce a strain pan-genome for me to work on. Professor Sam Sheppard helped conceive the methodology, and allowed me access to the vast quantity of genome data from the Sheppard Lab Multispecies Bacterial Isolate Genome Sequence (BIGS) database needed to run the analysis.
Impact
- Production of a group pan-genome for 328 strains (see Key findings)
- A new methodology for identifying genes under selection within bacterial groups (see datasets)
- Paper in process of editorial review in Molecular Ecology
Start Year 2019