Developing methods for inferring regulatory mechanisms from intact systems: a Neisseria case study.

Lead Research Organisation: Brunel University London
Department Name: Sch of Health Sciences and Social Care

Abstract

The behaviour of biological systems is controlled and coordinated through a network of 'regulators' and other intracellular interactions that control the expression of the genes within the cell. In bacteria, the production of messenger RNA and the production of proteins are closely linked, and much of the control of a cell's behaviour is exerted at the level of transcription. Transcription can be measured for all of the genes in a cell simultaneously using microarrays, which gives a relatively direct read-out of how many aspects of the cell's behaviour are being controlled. Each measurement gives a 'snapshot' of the transcribed genes, and when the observations from many snapshots are combined, the way in which the cell controls its functions can be progressively pieced together, much as the meaning of a movie can be pieced together from its individual 'frames'. Finding ways to use this information to build testable models of the central processes that control biological systems is a critical component of systems biology, and of understanding how biological systems work at a fundamental, 'whole cell' level. If causal relationships and key interactions controlling a cell's behaviour can be determined from this type of 'observational' information, then these systems can be studied without the (frequently impossible, impractical, or unaffordable) need to address each gene individually. Many genes are required for a cell to survive. Other genes are not required for life, but when one is removed the resulting cell does not function 'normally' in several ways, and it is very difficult to tell which effects are directly and which indirectly due to the gene or its product. To understand how cellular systems work, we propose that we need ways to analyze and use the information from 'intact / unbroken' biological systems.
In this proposal we will make use of one of the largest collections of 'transcript' data, and augment this with information specifically designed to assist modeling the ways in which the cell is controlled. The effectiveness of this modeling will be tested, and the models will be augmented and refined by making mutants of the key genes and testing to what extent they behave according to the model predictions. In this way, we will develop a generally applicable approach that can be used in the future without the need for expensive, time-consuming, and potentially misleading mutant generation.

Technical Summary

We have a unique collection of existing microarray data from which to address the regulatory networks of a bacterial system. This collection, which already holds more than 300 channels of high-quality, semi-quantitative data that are not biased by dye incorporation and that we have validated as both dual-channel and single-channel data, will form the basis for initial modeling. In addition, a model of the interactions between about 30% of the main neisserial regulators has already been developed from the analysis of classical direct comparisons of mutant and wild-type strains; this will be used to test model predictions and to form a basis for building a model of regulatory networks. The mathematical analysis will involve two complementary approaches, which are applicable to de novo and hypothesis/model-driven analysis of transcriptional regulation, respectively. We will use graphical models, building on existing work in graphical Gaussian models and (dynamical) Bayesian Network models, to infer which interactions are likely to occur. This work will be supplemented by analysis of mechanistic models. Both approaches will yield testable predictions for regulatory relationships. Model predictions will be tested by using deletion mutants to address key genes at 'causal nodes' and those with reciprocal relationships suggestive of negative feedback. These mutants will be generated using context-appropriate cassettes and expression profiled using the same high-quality data-generating methods, both to test and validate the modeling approaches and to extend the datasets used for the predictive modeling. Ultimately, the modeling will be built upon over 1000 channels of expression data, and will provide a real and challenging test of this experimental and analytical approach to the analysis of biological systems.
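
As an illustration only, the graphical Gaussian model idea mentioned above can be sketched in a few lines of Python. This is not the project's analysis code: the function names, the shrinkage value, and the edge threshold are all assumptions made for the example. The sketch estimates partial correlations between genes from expression measurements (one list of values per gene, one value per channel) and reports gene pairs that remain associated after the other genes are accounted for, which is how such a model separates candidate direct interactions from indirect ones.

```python
import math

def correlation_matrix(columns):
    """Pearson correlations between equal-length lists of expression values.

    Assumes every gene varies across channels (a constant column would
    give a zero norm and a division error).
    """
    n = len(columns[0])
    centred = []
    for col in columns:
        mean = sum(col) / n
        dev = [x - mean for x in col]
        norm = math.sqrt(sum(d * d for d in dev))
        centred.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in centred]
            for ci in centred]

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def partial_correlations(columns, shrinkage=0.05):
    """Partial correlation of each gene pair given all other genes.

    The correlation matrix is shrunk towards the identity (an assumed,
    simple regulariser) so that it stays invertible when genes outnumber
    channels.
    """
    corr = correlation_matrix(columns)
    n = len(corr)
    shrunk = [[(1 - shrinkage) * corr[i][j] + (shrinkage if i == j else 0.0)
               for j in range(n)] for i in range(n)]
    prec = invert(shrunk)
    return [[1.0 if i == j
             else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
             for j in range(n)] for i in range(n)]

def candidate_edges(pcor, names, threshold=0.3):
    """Gene pairs whose absolute partial correlation reaches the threshold."""
    return [(names[i], names[j], pcor[i][j])
            for i in range(len(names)) for j in range(i + 1, len(names))
            if abs(pcor[i][j]) >= threshold]
```

For a simple chain in which gene A influences B and B influences C, A and C are strongly correlated overall, but their partial correlation given B is close to zero, so only the A-B and B-C pairs would be reported as candidate edges. In practice the threshold and the amount of shrinkage would need to be chosen from the data rather than fixed as here.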

Publications

 
Description A number of fundamental discoveries were made that have re-defined the way in which we now pursue behavioural determinants. The extent to which behaviours result from repeated changes, exchanged genetic components, and convergent functional evolution has taught us how it is possible to mine genetic information as genome sequences and also link this to expression profiles and other measurable behaviours.
This underpins subsequent research which has now been further developed and translated.
Most of these discoveries were from work informed by the FAILURE of the initial research strategy. Originally this strategy was to use Bayesian modelling on large existing and specifically added expression datasets; the modelling results were supposed to be provided through the course of the research and be used to direct other experiments. This analysis (from a collaborator) was not provided until near the end of the project, and when it was provided it was not 'biologically sensible'. Because of this, alternative experimental strategies were developed and applied to achieve the original objective, which was to find ways to identify behavioural determinants (specifically regulatory differences in these studies) working from naturally occurring intact systems rather than 'mutants' (what we called 'unbroken systems biology'). This was achieved, and has formed the foundation of subsequent productive and useful research (though currently unpublished, due to its commercial nature).
Exploitation Route These methods form the basis of generic methods to identify behavioural determinants in bacteria directly from naturally occurring strains, genome sequences, and identifiable behaviours. They have been further developed subsequently (through EU and University funding) through translationally-focussed collaborative projects addressing exploitable bioPart discovery for synthetic biology.
This work is in the process of being made available through a specific program of collaborative tool development - applying the methods to exploitable bacterial species and those important for antibiotic resistance. This is through the development of SynbiSTRAIN collections of bacteria and associated exploitation informatics, through SynbiCITE and with a range of academic and industrial partners.
Sectors Energy,Environment,Healthcare,Pharmaceuticals and Medical Biotechnology

 
Description The work done in this project was fundamental and has formed the basis of new methodologies and concepts that are being used in industrial collaborative projects with a number of partners. These include partners in EU programs, including CARTIF in Spain, Vogelbusch in Austria, and the Technical University of Munich and others through the Valor Plus and SupraBio programs. In addition, there are collaborations with other companies in the UK and one in the US (currently covered by non-disclosure agreements).
First Year Of Impact 2014
Sector Chemicals,Energy,Environment
Impact Types Economic

 
Description EU-FP7 Valor Plus
Amount € 860,000 (EUR)
Funding ID 613802 
Organisation European Research Council (ERC) 
Sector Public
Country Belgium
Start 12/2014 
End 02/2017
 
Title Comparative Behavioural Genomics (CBG) 
Description Following critical insights into the ability to link genome sequence data to expression behaviours, their stability, and their repeated and convergent properties, a GWAS-like method was established, which has subsequently been developed (over the following 2 years) to address genes and gene variants. We are now able to mine evolved natural diversity for behavioural determinants without the experimental challenges and costs of making mutants, and without the associated system-level disruptions. This ability to interrogate 'unbroken' systems using functional genomics was the underpinning objective of the work that was funded. A key component of this was the insight that TWO types of data, genomic and behavioural (in the original project, transcriptomic), need to be integrated, and that statistical and other approaches are inadequate when dealing with only one level of system information.
Type Of Material Improvements to research infrastructure 
Year Produced 2014 
Provided To Others? Yes  
Impact This work is ongoing; the projects are either at preliminary stages or covered by partner agreements that to date prevent publication. Discoveries so far include: new antibiotic resistance genes and markers; identification of feedstock-inhibitor and product tolerance determinants in industrial strain development programs; and identification of carbon source and waste product feedstock use in industrial strain development. The discoveries in this project underpinned the development of the data analysis and bioPart discovery aspects of a national resource of characterized strains of species of exploitation relevance to synthetic biology, through SynbiCITE, which is being contributed to and will be available to around 20 UK universities and 25 UK SMEs in this area (as well as larger partners). It is anticipated that this will be made openly available by the end of 2016.
 
Company Name SYNGENIOUS LIMITED 
Description This company makes use of whole genome sequences, comparative genomics, evolutionary biology, and unique strain collections developed for the purpose of providing parts, chassis strains, and design information for translational microbiology using predominantly bacteria for primarily synthetic biology translational applications. 
Year Established 2017 
Impact This is an early start-up; its principal impacts currently are the creation of the strain collections and analytical resources and making them available to academic and industrial communities. Currently the company is in the process of refining resources and (through the academic routes) producing publications and materials to make the wider communities aware of its existence and to promote the resources and their capabilities.