Towards predictive biology: using stress responses in a bacterial pathogen to link molecular state to phenotype

Lead Research Organisation: University of Liverpool
Department Name: Institute of Integrative Biology


A "Holy Grail" in biology is to deduce how an organism will behave under different conditions (its phenotype) from knowledge of its genetic make-up and how its genes are expressed. This is not yet possible, but this proposal will move us towards this goal, using bacteria as a model system.

There are several reasons why we want to be able to do this. First, we want to understand disease-causing bacteria better, so as to protect both ourselves and our food against their harmful effects better than we can do at the moment. Second, we use bacteria a lot in industry and our ability to do this will improve if we can predict in detail how they will behave under industrial conditions. Third, as biology moves towards a more synthetic approach where organisms are engineered to have specific functions, we need to understand how they will survive and thrive in different conditions. This project focusses on bacteria that cause disease, but the methods that we will develop will be applicable in many other situations.

Animals, including humans, have many barriers against bacterial infection, but bacteria are resilient and adaptable and can evade some or all of these, and go on to cause disease. To understand how they are able to do this, we need to understand in much more detail the underlying biology of these organisms under the conditions that exist in our gut. Fortunately, novel methods have been devised that allow us to do this, and this proposal will apply these. For some years, we have been able to make mutations which prevent particular genes from working and use bacteria carrying these mutations to study which genes are needed for survival when bacteria are exposed to stress. We've also known how to study the way in which a particular gene is turned up or down as the external conditions change. But now, it is possible to take a very large mixture of bacteria, containing hundreds of thousands of different mutations, expose all these bacteria to many different stresses, and see how well each mutant survives each stress. This can be done in just a few experiments. We can also study how every single gene in the bacterium is responding to the stress over time, again in a few experiments. Furthermore, we can use this information to construct computer models of how all the genes which respond to the different stresses in the bacteria are connected together. This is like going from a list of addresses in a phone book to a complete map of the streets and houses in a town. The first maps that we construct using this method may not be completely correct, but we can use experiments to check the maps in detail, refining each region until it truly represents what goes on inside the bacterial cell. This is what we will do in this project. We will use the models constructed to make predictions about how bacteria will survive under different conditions, like in a particular part of the gut, and which genes will be important in helping them do this. 
We will specifically test our ability to make accurate predictions as part of this project. Ultimately, this should help us to predict the vulnerabilities of any pathogenic bacterium, and to use this knowledge to devise novel strategies to protect us from their potentially lethal effects.

Technical Summary

Predicting phenotype from genotype is a long-term goal in biology, and we will use a systems biology approach to do this in a pathogenic strain of E. coli. This proposal will identify key networks needed for E. coli to survive the stresses which it encounters in the gut. Our approach has been validated by our work on acid stress, which found new aspects of this process in E. coli. The unique, powerful feature of this proposal is the use of network-inference strategies on a combination of both gene expression and gene fitness measurements. It addresses several key BBSRC strategic priorities including Animal Health, Healthy and Safe Food, and Systems Approaches to the Biosciences. We will use TraDIS (transposon-directed insertion-site sequencing), which relies on a very high-density transposon library. Such libraries can be used to estimate the relative fitness of all the mutants following exposure to different growth regimes, using high-throughput sequencing (HTS) to measure the abundance of each mutant before and after growth. This provides a fitness index for each gene under each condition, which, combined with expression data, will enable the modelling of networks based on functional associations. We will apply different stresses, relevant to gut passage, to a library provided by our industrial collaborators, and then use network inference to identify critical networks responsive to these stresses. Modules, gene hubs and other topological features will be identified in the model. Mutations in key pathways will be constructed and analysed further. Data from these studies will be used to refine the networks and to enable predictions of phenotype based on gene expression data. Predictions will be tested, and the models iteratively made more robust, by analysis of single gene knockouts and by experiments in an artificial gut system. This approach will be generalisable to any pathogen, and to industrial micro-organisms and organisms produced using synthetic biology methods.
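As an illustration only, a per-gene fitness index of the kind described above could be computed as a log2 ratio of depth-normalised insertion read counts after versus before growth. The summary does not state the exact formula used, so the function name, the pseudocount and the counts-per-million normalisation below are all assumptions, and the gene names and counts are hypothetical.

```python
import math


def fitness_index(before, after, pseudocount=1.0):
    """Log2 ratio of normalised read counts per gene (after / before growth).

    ASSUMPTION: this is a generic log-ratio score, not the project's own
    definition of the fitness index.
    """
    total_before = sum(before.values())
    total_after = sum(after.values())
    scores = {}
    for gene, count_before in before.items():
        count_after = after.get(gene, 0)
        # Scale to counts per million so sequencing depth does not drive the ratio.
        cpm_before = (count_before + pseudocount) / total_before * 1e6
        cpm_after = (count_after + pseudocount) / total_after * 1e6
        scores[gene] = math.log2(cpm_after / cpm_before)
    return scores


# Hypothetical read counts summed over all insertions in each gene.
before = {"geneA": 500, "geneB": 500, "geneC": 1000}
after = {"geneA": 100, "geneB": 900, "geneC": 1000}
scores = fitness_index(before, after)
```

A negative score suggests that mutants in that gene were depleted during growth under the stress (the gene may be needed for survival), while a positive score suggests that they were enriched.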

Planned Impact

Our ultimate goal is a truly predictive biology, where the fitness of an organism in a particular environment can be accurately predicted from a detailed knowledge of its molecular state. The ability to do this is relevant to many areas of BBSRC funded research.

The current proposal will move us towards this goal by developing novel computational models, reflecting the structure of the underlying biological networks, and predictive of the phenotypic responses to a range of stresses relevant to the survival of a food-borne pathogen. These models will simulate molecular adaptation and predict fitness in different environments both in the laboratory and in an artificial gut model. All predictions will be experimentally tested.

The methods developed will be highly generalisable, for example to bacteria growing in an industrial fermentation. Our models will allow considerable advances in the understanding of bacterial adaptation to stress, particularly by identifying regulatory circuits that allow survival in different stress conditions. The ability to predictively link pathways, fitness, and phenotype will also be essential in the application of synthetic biology, for example in the construction of organisms with improved ability to perform in unstable conditions.

Our model organism will be E. coli. A known pathogenic strain (UO399, an important multi-drug resistant isolate) will be used, so that the data are representative of a pathogen and not a laboratory-adapted strain. We will use proven computational methods to infer the regulatory networks that enable E. coli to respond to a variety of stresses, including some that it normally encounters in the mammalian gut. The data used for this analysis will be generated under a range of physiologically relevant stress conditions, using approaches based on high throughput sequencing, and will consist of measures of relative fitness for all non-essential genes and gene expression data. This will enable the identification of key regulatory networks and pathways underlying stress survival. The expression profiles that result will be analysed so that specific phenotypic outcomes become predictable from expression data. The hypotheses generated by these models will be tested in the laboratory using a combination of genetics and molecular methods. In addition, the relevance of the system in a complex scenario will be tested by using a validated artificial gut model.

These objectives are relevant to BBSRC strategic priorities in Animal Health, Healthy and Safe Food, and (in particular) Synthetic Biology and Systems Approaches to the Biosciences.

The specific experimental objectives are as follows.

1) Determination of a gene fitness index for each non-essential gene, and acquisition of expression data, for all the genes of UO399. This will be done under a range of conditions representative of stresses that are encountered during passage through the mammalian gut. Stresses will be applied under both aerobic and anaerobic growth conditions.

2) Inference of networks that are critical for survival of individual or combined stresses, by combining models generated from the gene fitness measures and gene expression data, plus integration of data which is already available, and prediction of key genes and pathways in each stress.

3) Testing of predictions from inferred network models, by generation of specific mutants and measurement of fitness by competition experiments under laboratory conditions.

4) Use of the models to make and test predictions about gene fitness in an artificial gut model, based on gene expression data obtained in that model.

5) Testing the extent to which the models and predictions can be generalised across different E. coli strains.

Objectives (2) to (5) will be met through an iterative process of data generation, modelling, prediction, and testing.
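The competition experiments in objective (3) need some way of scoring a mutant against a reference strain. The proposal does not say how this will be done; the sketch below uses one common definition, the ratio of the two strains' log growth over the competition. The function name and the example densities are hypothetical.

```python
import math


def relative_fitness(mutant_start, mutant_end, reference_start, reference_end):
    """Ratio of log growth of the mutant to that of the reference strain.

    ASSUMPTION: inputs are cell densities (e.g. CFU/ml) at the start and end
    of a co-culture; this definition is illustrative, not the project's own.
    """
    mutant_growth = math.log(mutant_end / mutant_start)
    reference_growth = math.log(reference_end / reference_start)
    if reference_growth == 0:
        raise ValueError("reference strain density did not change")
    return mutant_growth / reference_growth


# Hypothetical densities: the mutant grows 10-fold, the reference 100-fold.
w = relative_fitness(1e6, 1e7, 1e6, 1e8)
```

A value below 1 indicates that the mutant grew less than the reference. Under a lethal stress, where both strains decline, the ratio is harder to interpret, and a difference of log changes may be a better choice.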


Description The Liverpool team has generated a very large dataset of RNAseq and traDIS data under a wide range of stress conditions for the uropathogenic E. coli strain ST131 (sequencing was done in the Centre for Genome Research at the IIB in Liverpool). These conditions include altered pH, different organic acids, bile salts, osmotic stress, oxidative stress and altered carbon sources, all tested under both aerobic and anaerobic conditions. This is, to the best of our knowledge, the largest compendium of traDIS datasets for E. coli.
We then built a network model of the transcriptomics and fitness data and showed that the early transcriptional response to stress is predictive of later fitness. These models have led to the identification of several putative targets that the Birmingham group (associated grant) is still testing. Unfortunately, the Birmingham team has encountered considerable difficulties in generating the mutant strains, and this has slowed the progress of the project.
However, the Liverpool team has refined the computational models and focussed on the analysis of the transcriptional response in relation to fitness. More precisely, we have asked whether genes that are modulated in the first 15 minutes of exposure to the stressor do indeed contribute to survival. We have addressed this question by directly comparing gene expression with the traDIS fitness data.
We have discovered that the transcriptional response of E. coli ST131 to the stressor can be subdivided into a pro-survival response and an anti-survival response, the latter associated with biological pathways that lead to an adverse survival outcome. We are currently preparing a publication outlining these results that we expect to publish in PLOS Computational Biology.
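One way to picture the comparison of early expression changes with traDIS fitness data is a simple sign-based rule, sketched below. This is an assumption for illustration and not the team's published method. Here a gene counts as part of a pro-survival response if it is induced and its disruption lowers fitness, or if it is repressed and its disruption raises fitness; the opposite combinations are labelled anti-survival. The thresholds, function name and gene identifiers are hypothetical.

```python
def classify_response(expression_lfc, fitness, expr_cutoff=1.0, fit_cutoff=1.0):
    """Label genes by comparing early expression change with mutant fitness.

    ASSUMPTION: expression_lfc holds log2 fold changes shortly after stress
    exposure; fitness holds per-gene fitness scores where a negative value
    means that mutants in the gene are depleted under the stress.
    """
    labels = {}
    for gene, lfc in expression_lfc.items():
        fit = fitness.get(gene)
        if fit is None or abs(lfc) < expr_cutoff or abs(fit) < fit_cutoff:
            labels[gene] = "unclassified"
        elif (lfc > 0) == (fit < 0):
            # Induced and needed, or repressed and harmful.
            labels[gene] = "pro-survival"
        else:
            # Induced but harmful, or repressed but needed.
            labels[gene] = "anti-survival"
    return labels


# Hypothetical values for four genes.
expression_lfc = {"g1": 2.0, "g2": 2.0, "g3": 0.2, "g4": -2.0}
fitness = {"g1": -2.0, "g2": 2.0, "g3": -3.0, "g4": 2.0}
labels = classify_response(expression_lfc, fitness)
```

In practice the cut-offs would need statistical support, such as adjusted p-values from replicate experiments, rather than fixed thresholds.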
Exploitation Route Our findings represent an important step forward. We expect that academic groups will take up some of the candidates we have identified and investigate in more detail their role in the stress response, both in E. coli and in other strains.
We also expect industry to use some of the candidates we have identified as a starting point for antibacterial targets.
Sectors Environment,Healthcare

Description So far, the computational approach that we have developed has provided the know-how underpinning an industry project that we are discussing with Auspherix, a UK company based in the Stevenage science park. While the discussion is at an early stage, we believe that the potential to develop an industry-relevant application is very good. In addition, Agri/food/drink, Healthcare and Manufacturing are sectors that we believe might benefit from our discoveries.
First Year Of Impact 2017
Sector Agriculture, Food and Drink,Environment,Healthcare,Pharmaceuticals and Medical Biotechnology
Description Auspherix 
Organisation Stevenage Bioscience Catalyst
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution The methodology developed in this project (reverse engineering of bacterial regulatory networks in relation to stress) has allowed Auspherix to identify the potential mechanism of action of one of their leading compounds. This is an organo-gold compound with powerful antibacterial properties but an unknown mechanism of action. We generated an expression profiling dataset representing a dose and time response to the Auspherix antibiotic and used a dynamical modelling technique to identify the hierarchy of events linked to the killing effect of the antibiotic. Our results suggest that the earliest detected transcriptional changes affect a set of genes involved in the oxidative stress response. This is consistent with the observation that E. coli mutants in genes involved in this process show higher sensitivity to the antibiotic than their wild-type counterparts.
Collaborator Contribution Auspherix commissioned research from our spin-off company Omic Analytics (see the specific section of this report). The contribution from Auspherix took the form of a research contract with the company.
Impact The collaboration finished in November 2018. The outcome is a working model of the antibiotic's mechanism of action. Further research using traDIS methods with different concentrations of the compound has been performed by our Liverpool collaborators (Pete Lund). The work was done by a BBSRC-funded student whose project arose directly from the initial BBSRC award. We are now finalising a publication, which we expect to submit together with the main paper describing the results of the research done within this grant.
Start Year 2017
Description Collaboration with University of Liverpool 
Organisation University of Liverpool
Department Institute of Integrative Biology
Country United Kingdom 
Sector Academic/University 
PI Contribution Provision of experimental samples
Collaborator Contribution Generation of sequence reads for both RNAseq and traDIS, analysis of reads
Impact Collaboration between us (experimental biologists), University of Liverpool sequencing service (high throughput RNA and DNA sequencing), and bioinformatics team (Institute of Integrative Biology).
Start Year 2014
Company Name OmicAnalytics Ltd 
Description The company provides powerful capabilities for complete omics analytical pipelines, including next-generation sequencing and mass spectrometry for biomolecular analysis, through project design, data generation, analytics and knowledge extraction. We also specialise in the design and construction of omics software for data analysis and knowledge management.
Year Established 2016 
Impact The company has developed software supporting mass spectrometry instrumentation. This is currently licensed to the world-leading company Waters and supports their instrumentation. The company provides contract research support to several other companies. The know-how developed in the associated grants has allowed the company to deliver research contracts in the area of systems biology.