Modelling gene networks by non-linear analysis of microarray data.

Lead Research Organisation: University College London
Department Name: Institute of Child Health

Abstract

The biochemical systems that build cells and organisms and keep them working depend on genetic control. The genome is the full set of controlling genes. Traditionally, scientists study individual genes in isolation because it has been too complicated to study them all simultaneously, but genes actually work together as part of interacting networks. Gene activity is controlled by transcription factors, proteins that switch genes on and off. The correct activation of the appropriate genes at the right time is essential for cells to work properly and to combat environmentally induced problems like damaged DNA. Failure to activate the correct combination of genes after DNA damage can lead to cancer. It is now possible to simultaneously measure how active all the genes in an organism are at a particular time using microarrays. Connecting these snapshots of gene activity gives a dynamic picture of how genes respond to cell stress and other signals. Intervention into gene networks could offer the opportunity to modify the response to bring about a better outcome, for example using drugs to change gene activity to enhance cancer cell sensitivity to a chemotherapy agent. However, at present we are unable to predict how a network will respond to intervention. To make predictions about gene network activity it is not enough to simply know how much of each gene is there. It is also necessary to calculate the activity of the controlling transcription factor, to consider the sensitivity of a target gene to the transcription factor, and to estimate how much of the target gene's transcript was there to start with and how fast it degrades. We have developed a simple mathematical procedure (called HVDM) that combines these parameters to successfully model the network controlled by a single transcription factor (p53).
In this new project we wish to develop a new mathematical tool that can identify all the main activities controlling transcription in the DNA damage response using only microarray data, and then model how these factors interact to produce the outcome observed. The finished product will be a computer tool which accurately predicts what will happen to the gene network in any number of possible scenarios. For example, predictions will be made about how the network would respond to a drug which affects one or more transcription factors, and what the effect of this is likely to be on the cells. The advantage of this approach is that it is more widely applicable and more accurate than other ways of analyzing microarrays. It can predict network activity using small datasets where experimental methods would require an impractical number of observations. As a result, researchers will be able to predict gene network activity much more accurately and efficiently than before. This will make more efficient use of very expensive research resources, and lead to better informed decisions when evaluating potential drug targets for clinical application.

Technical Summary

Time course microarray data contains hidden information about transcription factor activity and the sensitivity of target genes. Extraction of this information would allow the quantitative reconstruction of the network in silico, permitting predictions to be made about network behaviour. In this project, we will develop tools based on dynamical equation models which will extract the main activating forces in a complex gene response network. Models will be based on experimental data obtained after activating the DNA damage response network in MOLT4 and B-CLL-B human cell lines using irradiation. Previously we developed a simple linear model, Hidden Variable Dynamic Modelling, which was able to accurately predict targets of a single transcription factor, p53. We now plan to develop new models based on non-linear terms that will describe promoter saturation and threshold effects, co-regulation of transcripts by multiple transcription factors and will take account of enhancer effects, additive regulation and synergism. The new models will be ordinary differential equations with increasingly complex production terms. We will combine these non-linear models with a refined procedure for extracting multiple transcriptional activities from array data to simultaneously predict the behaviour of multiple transcription factors within the system. Each modelling stage will be followed by predictions of system behaviour at either different doses of irradiation, or with key regulators targeted. Predictions will be tested by independent experiments using gene knock-down techniques in MOLT4 and B-CLL-B lines. Discrepancies between model and data will drive model improvement. The end product will be a set of complex but user-friendly tools designed to efficiently and quantitatively analyse time course microarray data and to dissect out the relative contributions of different transcription factors to a biological response.
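The contrast between the earlier linear model and the planned non-linear production terms can be illustrated with a minimal sketch. It assumes the linear form commonly given for HVDM, dx/dt = B + S*f(t) - D*x, where x is the transcript level, f(t) the transcription factor activity, B the basal rate, S the sensitivity and D the degradation rate. The saturating term (a Michaelis-Menten form with hypothetical parameters v_max and k_half), the activity pulse, and all numerical values are illustrative assumptions, not the project's fitted model.

```python
import math


def tf_activity(t):
    """Hypothetical transcription factor activity: a transient pulse."""
    return t * math.exp(-t)


def linear_production(f, sensitivity):
    """Linear production term: proportional to activity f."""
    return sensitivity * f


def saturating_production(f, v_max, k_half):
    """Assumed Michaelis-Menten term: approaches v_max as f grows (promoter saturation)."""
    return v_max * f / (k_half + f)


def simulate(production, activity, basal, decay, x0, t_end, dt=0.01):
    """Forward-Euler integration of dx/dt = basal + production(activity(t)) - decay * x."""
    steps = int(round(t_end / dt))
    x, t = x0, 0.0
    trajectory = [x]
    for _ in range(steps):
        x += dt * (basal + production(activity(t)) - decay * x)
        t += dt
        trajectory.append(x)
    return trajectory


# Same target gene under the two production terms (illustrative values only).
linear_run = simulate(lambda f: linear_production(f, 2.0),
                      tf_activity, basal=0.1, decay=0.8, x0=0.125, t_end=10.0)
saturating_run = simulate(lambda f: saturating_production(f, 2.0, 0.2),
                          tf_activity, basal=0.1, decay=0.8, x0=0.125, t_end=10.0)
print("peak (linear):", max(linear_run))
print("peak (saturating):", max(saturating_run))
```

Forward Euler is used only to keep the sketch dependency-free; a production implementation would more likely use an adaptive ODE solver and fit B, S and D (and the activity profile) to the time course data.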

 
Description The most significant achievement of this grant was the creation of a predictive and quantitative mathematical model of a complex transcriptional response. The model accurately dissected the response into its component parts, identifying the controlling transcription factors, and identifying which genes were targets of which factors. This was achieved without a preconceived network architecture, using data alone to reveal the controlling activities, and validated experimentally using knockdown technology. This means that the model can be applied in virtually any system where time course measurements of gene expression can be made. It has the potential to allow "in silico" manipulation of the identified parameters to focus experiments, increase efficiency and reduce animal testing. The approach has many potential applications in biology, pharmacology and medicine.
A secondary achievement was the creation of a freely available computational toolkit in a standard, easily applicable format (R/Bioconductor).
Exploitation Route In this project we developed new analysis packages for application to complex transcriptional systems. They have been used in that context and, more widely, as a resource for developing more sophisticated tools that have themselves been applied. This is evident in the citation lists for the papers published from the project, which show use in systems biology and modeling, particularly of gene regulatory networks. The work has predominantly contributed to the development of better methods in this computational biology field, but has also highlighted the importance of RNA degradation in studying transcriptional responses to stimuli. The software is already in use for both HVDM and GWTM. This is fundamental research, and it impacts indirectly on the sectors outlined below. The outputs are very much in line with our anticipated beneficiaries list (the application pre-dates a mandatory pathways to impact statement).
Sectors Education, Healthcare, Pharmaceuticals and Medical Biotechnology

 
Description In addition to the academic output listed in the summary section, our work contributed to a greater understanding of the value of mathematics to biology. This was communicated in courses at UCL and to visitors and students in the lab, and we also featured in an online magazine, PlusMaths, which focuses on the importance of mathematics in different research areas (http://plus.maths.org/content/os/issue55/features/barhu/index). The UK desperately needs numerate and mathematically literate graduates to contribute to national productivity and wellbeing. By showing maths applications beyond the expected employment areas, we feel we have helped make maths a more attractive subject to a broader range of potential students.
First Year Of Impact 2009
Sector Education, Healthcare, Pharmaceuticals and Medical Biotechnology
Impact Types Societal

 
Description MSc module Applied Genomics
Geographic Reach Local/Municipal/Regional 
Policy Influence Type Influenced training of practitioners or researchers
Impact The impact is limited to the specific students who have successfully completed the course and now contribute to healthcare decisions with a greater understanding of genomics. This is a growing area, with the prominence of Genomics England and an increasing dependency on complex genetic analytical approaches in healthcare. Education is therefore a fundamental part of the story. This course contributes to that, and will be the forerunner of wider reaching education schemes targeted at undergraduates and at qualified healthcare professionals and decision makers.
URL http://www.ucl.ac.uk/cell-gene-therapy/Modules/applied-genomics
 
Title GWTM 
Description A method for predicting what controls transcriptional mechanisms without the need to conduct large numbers of experiments. 
Type Of Material Model of mechanisms or symptoms - in vitro 
Year Produced 2010 
Provided To Others? Yes  
Impact New methodology development. 
 
Title rHVDM 
Description Hidden Variable Dynamic Modeling. Bioconductor version: Release (3.0). An R implementation of HVDM (Genome Biol 2006, V7(3) R25). Author: Martino Barenco. Maintainer: Martino Barenco. Citation (from within R, enter citation("rHVDM")): Barenco M (2009). rHVDM: Hidden Variable Dynamic Modeling. R package version 1.32.0.
Type Of Technology Software 
Year Produced 2009 
Open Source License? Yes  
Impact Frequently used by the gene regulation modeling community (Systems Biology).
URL http://www.bioconductor.org/packages/release/bioc/html/rHVDM.html
 
Description Media Interest 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Undergraduate students
Results and Impact Some encouraging feedback from readers of the article and from the editor, suggesting that it connected with at least some readers.

Hard to measure. I hope that students realised that mathematics can be applied in many different disciplines post university and considered biological mathematical careers.
Year(s) Of Engagement Activity 2010
URL http://plus.maths.org/content/os/issue55/features/barhu/index