A computer array approach to accelerating the functional prediction of biological systems
Lead Research Organisation:
University of Aberdeen
Department Name: School of Medical Sciences
Technical Summary
Predicting the systems responsible for controlling biological processes is now possible thanks to the widespread availability of multiple genome sequences, the increased speed and accuracy of proteomic and microarray analyses, and the development of powerful new computer-based algorithms. We have pioneered novel bioinformatic approaches that allow the prediction of components of the biological systems that contribute to human and animal health and to microbial pathogenesis. This bioinformatic expertise has generated a number of novel algorithms that allow the simultaneous analysis of massive genomic and microarray-derived data sets for the prediction of enhancer-gene linkage (Starkey, MacKenzie), yeast transcriptional profiling (Brown), the prediction of replication origins (Donaldson, Starkey), the prediction of protein-protein interactions (Ritchie) and the predictive modelling of translation termination and elongation efficiencies (Stansfield, Starkey). All of the applicants have the expertise to test these predictions in the lab. Because of the nature of the algorithms we are developing, and the number and size of the genomic and microarray-derived data sets to be analysed, the conventional desktop computers currently available to us lack sufficient processing power. To carry out these analyses successfully, we are requesting funds to purchase and maintain a 32 node Dual AMD Opteron Cluster System computer array that will run our unique algorithms to analyse massive data sets quickly and thus speed up the prediction of biological system components by at least two orders of magnitude. Access to the AMD Opteron Cluster System will greatly accelerate our ability to predict the function of a variety of biological system components and will increase our knowledge of these systems and of how they may be involved in increasing animal disease susceptibility and microbial pathogenesis.
Publications
Davidson S
(2011)
Differential activity by polymorphic variants of a remote enhancer that supports galanin expression in the hypothalamus and amygdala: implications for obesity, depression and alcoholism.
in Neuropsychopharmacology : official publication of the American College of Neuropsychopharmacology
Fowler PA
(2014)
In utero exposure to cigarette smoke dysregulates human fetal ovarian developmental signalling.
in Human reproduction (Oxford, England)
Hay CW
(2014)
Functional effects of polymorphisms on glucocorticoid receptor modulation of human anxiogenic substance-P gene promoter activity in primary amygdala neurones.
in Psychoneuroendocrinology
Hay CW
(2014)
Negative regulation of the androgen receptor gene through a primate-specific androgen response element present in the 5' UTR.
in Hormones & cancer
MacKenzie A
(2013)
Exploring the effects of polymorphisms on cis-regulatory signal transduction response.
in Trends in molecular medicine
Nicoll G
(2012)
Allele-specific differences in activity of a novel cannabinoid receptor 1 (CNR1) gene intronic enhancer in hypothalamus, dorsal root ganglia, and hippocampus.
in The Journal of biological chemistry
Description | This Research Equipment Initiative scheme provided half the funding (£46k) required to purchase a High Performance Computer (HPC) array that allowed high speed whole genome analyses. Using this HPC we combined whole genome comparative genomics with the existing dbSNP database to examine SNP densities across the whole conserved human genome. In this way we were able to demonstrate the following: (1) The vast majority of conservation in the human genome is non-coding. (2) SNP densities were reduced in the most conserved regions of the human genome. (3) SNP densities were lowest in the intronic and exonic components of the conserved human genome. (4) By comparison, SNP densities were significantly higher in the conserved intergenic genome despite its being conserved to an identical degree. From these results we concluded that (1) the majority of conserved and, by extrapolation, functional information is contained in the non-coding genome; (2) the exonic and intronic components of the conserved genome are under identical levels of high purifying selective pressure; and (3) the conserved intergenic component provides the plasticity required for adaptive evolution and may also be the major reservoir of disease-causing polymorphisms. These observations have since been validated by GWAS showing that 88% of disease-causing SNPs occur in non-coding regions of the genome. The results of this study were published in BMC Genomics [1]. In addition to this study, the HPC has been used to host an online web site called RegSNP (http://viis.abdn.ac.uk/regsnp/Home.aspx) that allows researchers to predict the effects of different alleles of SNPs on transcription factor binding sites.
This facility also allows the prediction of SNPs in linkage disequilibrium (LD) and is currently being updated to allow the detection of LD between GWAS-associated SNPs and SNPs in highly conserved regions. In addition, Starkey has used the HPC for data mining of a multi-parameter model landscape using Monte Carlo methods allied to statistical techniques, focused on the global nitrogen model [3,4]. [1] Davidson, S., Starkey, A. & MacKenzie, A. (2009). 'Evidence of uneven selective pressure on different subsets of the conserved human genome: implications for the significance of intronic and intergenic DNA'. BMC Genomics, 10:614; [2] Davidson, S., Starkey, A. & MacKenzie, A. 'RegSNP - Predicting Allele Specific Differences in Transcription Factor - DNA binding'. http://viis.abdn.ac.uk/regsnp/Home.aspx; [3] Starkey, A. & Robinson, D. 'Monte Carlo simulation of the global nitrogen cycle'. CSC'09; [4] Robinson, D., Burgoyne, C. & Starkey, A. (2012). 'Nitrogen fluxes in a steady-state global nitrogen cycle' (submitted for publication). |
Exploitation Route | Many people either gain little therapeutic benefit from currently available drug therapies or suffer serious side effects. Identifying drug response stratification loci will greatly accelerate the delivery of novel drugs to market by allowing more selective recruitment of patient cohorts for drug testing during development and, once a drug reaches the market, by providing an avenue to more focussed prescribing to the patients who would benefit, thus delivering on many of the promises of personalised medicine. This ability will prove hugely profitable to the drugs industry whilst greatly improving patient care. We are currently exploring ways of using the techniques developed on our high performance computer array to identify the polymorphisms that contribute to drug response stratification in the human genome. |
Sectors | Healthcare, Pharmaceuticals and Medical Biotechnology |
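The SNP density comparison summarised in the Description above can be illustrated with a minimal sketch. This is not the project's code: the function name, the half-open coordinate convention and the toy intervals and SNP positions are all assumptions made for illustration, and a real analysis would draw conserved regions from a comparative genomics pipeline and SNP positions from dbSNP.

```python
from bisect import bisect_left
from collections import defaultdict


def snp_density_by_class(regions, snp_positions):
    """Return SNPs per kilobase for each region class.

    regions: iterable of (start, end, region_class), half-open [start, end)
             (hypothetical input format, assumed for this sketch)
    snp_positions: iterable of integer SNP coordinates on the same sequence
    """
    snps = sorted(snp_positions)
    total_bases = defaultdict(int)
    total_snps = defaultdict(int)
    for start, end, region_class in regions:
        total_bases[region_class] += end - start
        # Count SNPs with start <= position < end by binary search.
        total_snps[region_class] += bisect_left(snps, end) - bisect_left(snps, start)
    return {
        c: 1000.0 * total_snps[c] / total_bases[c]
        for c in total_bases
        if total_bases[c] > 0
    }


# Toy data only: three conserved regions of different classes.
toy_regions = [
    (0, 1000, "exonic"),
    (2000, 4000, "intronic"),
    (5000, 6000, "intergenic"),
]
toy_snps = [100, 2500, 3500, 5100, 5200, 5300, 5999, 6000]
print(snp_density_by_class(toy_regions, toy_snps))
```

Comparing the per-class densities in this way only describes the data; judging whether a difference is significant would need a statistical test that this sketch does not attempt.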
Title | RegSNP |
Description | RegSNP is a new computer algorithm for predicting the effects of SNPs on transcription factor binding to DNA. The algorithm was released online in the form of an easy-to-use web site. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2009 |
Provided To Others? | Yes |
Impact | The use of RegSNP allowed us to predict the effects of SNPs identified within the regulatory regions under study. We are also aware of the use of this web site by other researchers overseas. |
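This record does not describe how RegSNP scores alleles internally. As a generic illustration of the kind of prediction involved, the sketch below uses a standard position weight matrix (PWM) log-odds score to compare the reference and alternative alleles of a SNP within a candidate binding site. The matrix, the uniform background frequency and all function names are assumptions for illustration, not RegSNP's method.

```python
import math

BACKGROUND = 0.25  # assumed uniform background base frequency


def log_odds_score(pwm, site):
    """Sum log2(p / background) over the positions of a candidate site.

    pwm: list of dicts mapping base -> probability (all probabilities > 0)
    site: DNA string of the same length as the matrix
    """
    if len(site) != len(pwm):
        raise ValueError("site length must match matrix length")
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(pwm, site))


def allele_score_difference(pwm, site, snp_offset, ref, alt):
    """Return score(alt allele) - score(ref allele) for a SNP inside the site."""
    if site[snp_offset] != ref:
        raise ValueError("reference allele does not match the site sequence")
    alt_site = site[:snp_offset] + alt + site[snp_offset + 1:]
    return log_odds_score(pwm, alt_site) - log_odds_score(pwm, site)


# Hypothetical three-position matrix preferring the site "AGC".
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
# A negative difference means the alternative allele weakens the predicted site.
print(allele_score_difference(toy_pwm, "AGC", 1, "G", "T"))
```

A log-odds score is used here because the contribution of each position is additive, so the effect of a single-base change reduces to the difference at one matrix column.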
Title | Reporter gene transgenic lines |
Description | The material consists of a number of transgenic lines that contain reporter gene constructs made with enhancers identified during the time frame of the current project. |
Type Of Material | Model of mechanisms or symptoms - mammalian in vivo |
Year Produced | 2008 |
Provided To Others? | Yes |
Impact | The material provided allows the assessment, in vivo, of the tissue-specific and inducible properties of enhancers identified by comparative genomics. These novel models allow the production of persuasive in vivo data on the properties of novel enhancers. The regulatory regions currently being modelled in these lines include enhancers and promoters for Galanin, TAC1, CGRP, NPY, BDNF and CNR1. |
Title | Transgenic model |
Description | Transgenic model of ECR2-TAC1prom-LacZ. Currently stored as frozen embryos. |
Type Of Material | Model of mechanisms or symptoms - non-mammalian in vivo |
Year Produced | 2008 |
Provided To Others? | Yes |
Impact | These transgenic lines are now frozen in N2 but have contributed critical data to 4 different research articles in good journals. They are available to the research community as frozen embryos. |
Title | Transgenic mouse |
Description | Mouse transgenic for the GAL5.1-LacZ construct |
Type Of Material | Model of mechanisms or symptoms - mammalian in vivo |
Year Produced | 2010 |
Provided To Others? | Yes |
Impact | Publication of the LacZ expression patterns produced in the brain of this mouse has generated tremendous interest in the GAL5.1 enhancer sequence. |
Description | Development of a novel algorithm to define SNP effects on DNA binding sites |
Organisation | University of Aberdeen |
Department | School of Engineering |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | We have provided intellectual input and have guided the project from the biological perspective. |
Collaborator Contribution | Dr Starkey has brought expertise in computer software engineering. |
Impact | We have one paper in press (BMC Genomics) and one to be submitted very soon to BMC computational biology that describes the development of a novel web site that allows researchers to predict the effects of non-coding polymorphisms on the binding of transcription factors. |
Start Year | 2006 |
Description | GWA Studies, Gene Regulatory Variation and Disease |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Postgraduate students |
Results and Impact | Seminar/workshop. No actual impacts realised to date. |
Year(s) Of Engagement Activity | 2011 |
Description | Gene regulation, SNPs and disease |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Public/other audiences |
Results and Impact | Many interesting questions were asked and views shared. I hope to have stimulated young people in the audience to pursue science as a career. |
Year(s) Of Engagement Activity | 2011,2012 |
Description | Gene regulatory mechanisms and Chronic Pain |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Type Of Presentation | Keynote/Invited Speaker |
Geographic Reach | International |
Primary Audience | Participants in your research or patient groups |
Results and Impact | Seminar/workshop. Invited seminar at the University of Strathclyde, Glasgow. No actual impacts realised to date. |
Year(s) Of Engagement Activity | 2008 |
Description | Gene regulatory mechanisms and Inflammatory Pain |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Type Of Presentation | Keynote/Invited Speaker |
Geographic Reach | Local |
Primary Audience | Participants in your research or patient groups |
Results and Impact | Seminar/workshop. Invited seminar at the University of Liverpool. No actual impacts realised to date. |
Year(s) Of Engagement Activity | 2009 |
Description | Gene regulatory variation in health and disease |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Postgraduate students |
Results and Impact | Seminar/workshop. The talk stimulated a great deal of interest from the audience. |
Year(s) Of Engagement Activity | 2011 |
Description | RegSNP - Predicting Allele Specific Differences in Transcription Factor - DNA binding |
Form Of Engagement Activity | A magazine, newsletter or online publication |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This website permits non-experts in biotechnology to identify the transcription factor binding sites most affected by specific SNPs. The website also displays LD data, information on disease data, and whether a particular SNP is GWAS-associated or alters susceptibility to epigenetic modification. Our web site has informed the research of many other researchers. |
Year(s) Of Engagement Activity | 2009,2010,2011,2012,2013,2014 |
URL | http://viis.abdn.ac.uk/regsnp/Home.aspx |