COMPUTATIONAL GENOMICS ANALYSIS AND TRAINING (CGAT) AT THE MRC FUNCTIONAL GENOMICS UNIT

Lead Research Organisation: MRC Functional Genetics Unit

Abstract

CGAT's mission will be to reveal the significance of high-throughput sequence in health and disease, whilst providing training opportunities for pre- and post-doctoral scientists. The Programme will provide access to genome analysis for those UK biologists and clinicians who are less used to managing, manipulating and analysing extremely large sequence data sets. CGAT will take advantage of the analytical expertise readily available both from within the MRC Functional Genomics Unit, and from strong interactions with others in the UK genomics research community. The Strategic Programme will aim both to fulfil the promise of high-throughput genome data in answering the most important questions in genetics and human disease, and to train a new generation of genomics researchers.

Technical Summary

CGAT will provide expertise and analytical capacity to UK-based research groups planning to use or already using next-generation sequencing data. At the same time CGAT will train post-doctoral researchers in the analysis and biological interpretation of genome-scale data sets. Each project will be highly collaborative, and focused on gleaning important biological conclusions from large data sets.

Recent years have seen accelerating growth in sequencing capacity, whilst sequencing costs have plummeted. As a result, powerful techniques and assays based on DNA sequencing (ChIP-Seq, RNA-Seq, genome resequencing etc.) are being developed that add greatly to the experimental toolkit of biologists and clinicians. However, while these new methods are becoming increasingly affordable, they require substantial computational infrastructure and expertise that are not currently available in many experimental research groups. Indeed, UK-wide there remains a considerable shortage of bioinformaticians able to process these large data sets. CGAT aims to help meet these needs.

For each collaboration, CGAT provides expertise in computational genomics, the computational infrastructure and a dedicated post-doctoral researcher. Collaborators contribute sequence data sets and their unique knowledge of their specific research area. Taken together, the result will be high-quality research culminating in publications in high-profile journals under a joint-authorship model.

CGAT builds on the experience of the Ponting group in comparative genomics and next-generation sequencing analysis. The group has a strong background in computational sequence analysis and evolutionary biology, with a proven track record of translating large-scale sequencing data into diverse biological contexts. Post-doctoral trainees in the CGAT group will benefit from in-house training and from the biological insights of collaborators. Working on a variety of projects, trainees will accumulate expertise in genome sequencing-related fields, as well as high-profile publications.

Publications

 
Description MRC Molecular Cellular Medicine Board - supplementary funding (CGAT)
Amount £614,100 (GBP)
Funding ID MC_EX_G1000902 
Organisation Medical Research Council (MRC) 
Sector Academic/University
Country United Kingdom
Start 07/2015 
End 03/2016
 
Title Computational genomics pipelines 
Description Software for the analysis of next generation sequencing data. 
Type Of Material Improvements to research infrastructure 
Year Produced 2010 
Provided To Others? Yes  
Impact CGAT members contribute to the development of open source software provided to the bioinformatics community at large for the analysis of next-generation sequencing data. The code is publicly available from GitHub: https://github.com/CGATOxford 
URL https://github.com/CGATOxford
 
Description Ian Deary analysis 
Organisation University of Edinburgh
Department Psychology
Country United Kingdom 
Sector Academic/University 
PI Contribution Genome-wide association analysis using 549 692 single nucleotide polymorphisms (SNPs) for an investigation of the genetic contribution to individual differences in non-pathological cognitive ageing.
Collaborator Contribution Expertise in cognitive ageing and decline; provision of above data for analysis.
Impact Paper in Molecular Psychiatry (see Publications section)
Start Year 2012
 
Description Proj 001 (Lindsay) 
Organisation University of Bath
Department Department of Pharmacy and Pharmacology
Country United Kingdom 
Sector Academic/University 
PI Contribution Analysis of microarray and ChIP-Seq data. Research question focuses on: lincRNAs in human macrophages.
Collaborator Contribution Expertise in the role of non-coding RNAs in the innate immune response; provision of above data for analysis
Impact Paper: Long non-coding RNAs and enhancer RNAs regulate the lipopolysaccharide-induced inflammatory response in human monocytes. Ilott NE, Heward JA, Roux B, Tsitsiou E, Fenwick PS, Lenzi L, Goodhead I, Hertz-Fowler C, Heger A, Hall N, Donnelly LE, Sims D, Lindsay MA. Nat Commun. 2014 Jun 9.
Start Year 2011
 
Description Proj 004 (Leung) 
Organisation Beatson Institute for Cancer Research
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution Genome-wide analyses of RNA-Seq datasets. Research question focuses on: Identification of expression changes caused by androgen ablation therapy in prostate cancer by next generation sequencing (mRNA-Seq).
Collaborator Contribution Expertise in basic science and translational research in prostate cancer; provision of above data for analysis
Impact Paper: Next-generation Sequencing of Advanced Prostate Cancer Treated with Androgen-deprivation Therapy. Rajan P, Sudbery IM, Villasevil ME, Mui E, Fleming J, Davis M, Ahmad I, Edwards J, Sansom OJ, Sims D, Ponting CP, Heger A, McMenemin RM, Pedley ID, Leung HY. Eur Urol. 2013 Aug 14.
Start Year 2012
 
Description Proj 004 (Leung) 
Organisation Newcastle upon Tyne Hospitals NHS Foundation Trust
Country United Kingdom 
Sector Public 
PI Contribution Genome-wide analyses of RNA-Seq datasets. Research question focuses on: Identification of expression changes caused by androgen ablation therapy in prostate cancer by next generation sequencing (mRNA-Seq).
Collaborator Contribution Expertise in basic science and translational research in prostate cancer; provision of above data for analysis
Impact Paper: Next-generation Sequencing of Advanced Prostate Cancer Treated with Androgen-deprivation Therapy. Rajan P, Sudbery IM, Villasevil ME, Mui E, Fleming J, Davis M, Ahmad I, Edwards J, Sansom OJ, Sims D, Ponting CP, Heger A, McMenemin RM, Pedley ID, Leung HY. Eur Urol. 2013 Aug 14.
Start Year 2012
 
Description Proj 006 (Ramagopalan) 
Organisation University of Oxford
Department Wellcome Trust Centre for Human Genetics
Country United Kingdom 
Sector Academic/University 
PI Contribution Analysis of MeDIP-seq and RNA-seq data. Research question focuses on: whole-genome identification and characterization of tissue-specific differentially methylated regions (tDMRs) in two different human immune effector cells.
Collaborator Contribution Expertise in autoimmune diseases, disease genetics, DNA methylation and transcriptional regulation; provision of above data for analysis.
Impact Project closed due to scientific infeasibility
Start Year 2011
 
Description Proj 007 (Klose) 
Organisation University of Oxford
Department Department of Biochemistry
Country United Kingdom 
Sector Academic/University 
PI Contribution Employing our computational expertise and previous experience of transcriptional data to properly and efficiently analyze the vast amount of data generated in this study (research question of study = 'Is the CpG island system conserved in non-mammalian vertebrates?').
Collaborator Contribution In-depth understanding of the CpG island function; sequencing and preliminary analysis of transcriptional data.
Impact PMID: 23467541
Start Year 2011
 
Description Proj 008 (Udalova) 
Organisation Imperial College London
Department Faculty of Medicine
Country United Kingdom 
Sector Academic/University 
PI Contribution We are conducting integrative meta-analysis of different types of data (gene expression data, ChIP-Seq, RNA-Seq). Research question focuses on: co-regulation of macrophage inflammatory phenotype by IRF5 and RelA.
Collaborator Contribution Knowledge of IRF transcription factors in establishing phenotype of immune cells; provision of above data for analysis
Impact Paper: IRF5:RelA Interaction Targets Inflammatory Genes in Macrophages. Saliba DG, Heger A, Eames HL, Oikonomopoulos S, Teixeira A, Blazek K, Androulidaki A, Wong D, Goh FG, Weiss M, Byrne A, Pasparakis M, Ragoussis J, Udalova IA. Cell Rep. 2014 Sep 11.
Start Year 2011
 
Description Proj 010 (Tybulewicz) 
Organisation Medical Research Council (MRC)
Department MRC National Institute for Medical Research (NIMR)
Country United Kingdom 
Sector Public 
PI Contribution Design and analysis of RNA-seq experiments; mining of existing data relating to lincRNAs and ChIP-seq data on B cell transcription factors; interpretation of data leading to design of follow-up studies. Research question focuses on: the role that lincRNAs play in the development and activation of B cells.
Collaborator Contribution Extensive experience in the analysis of how mutations perturb B cell development and activation; provision of above data for analysis; experience of performing RNAseq experiments.
Impact Paper: Brazao, T.*, Johnson, J.*, Ponting, C., Heger, A., Tybulewicz, V. "Long non-coding RNAs expressed in naïve and activated B cells"
Start Year 2012
 
Description Proj 011 (Hammond) 
Organisation University of Bristol
Department School of Physiology, Pharmacology and Neuroscience
Country United Kingdom 
Sector Academic/University 
PI Contribution Advising on collection of appropriate sequence data; analysing the data collected. Research question focuses on: mapping genes that cause osteoarthritis in zebrafish using NGS
Collaborator Contribution Unique knowledge of the zebrafish model in studying the genetics of osteoarthritis.
Impact Ongoing; awaiting collaborator validation.
Start Year 2012
 
Description Proj 012 (Schofield) 
Organisation University of Oxford
Department Department of Chemistry
Country United Kingdom 
Sector Academic/University 
PI Contribution Analysis of, and answering specific questions from, large data sets. Questions concern rank order, HIF target genes, and genes targeted by different types of HIF hydroxylase, and whether there is evidence for non-HIF hydroxylase involvement of oxygenases in oxygen dependent HIF regulation. Research question focuses on: analysing the suitability of small molecule hypoxia mimics and the role of oxygenases in the hypoxic response
Collaborator Contribution Expertise on the molecular mechanisms of the hypoxia response in animals; provision of RNA-seq data, HIF1a, HIF2a and HIF1b ChIP-seq data, and HIF siRNA microarray data for analysis.
Impact Paper currently being drafted.
Start Year 2012
 
Description Proj 013 (Drake) 
Organisation University of Edinburgh
Department Centre for Cardiovascular Science
Country United Kingdom 
Sector Academic/University 
PI Contribution Storage, processing and interpretation of large amounts of data from enrichment of 5mC.
Collaborator Contribution Project title: The role of DNA methylation and hydroxymethylation in the programming of neurodevelopment. Contributions from collaborators include: dissected brain tissue available from their well-characterised rat model of prenatal glucocorticoid overexposure; enrichment of 5mC and 5hmC performed using affinity-based methods.
Impact No findings
Start Year 2013
 
Description Proj 014 (Karadimitris) 
Organisation Imperial College London
Department Centre for Haematology
Country United Kingdom 
Sector Academic/University 
PI Contribution Computational and statistical analysis and visual representation of the data resulting from sequencing of eighteen e4C-Seq and ChIP-e4C-Seq libraries. Research question focuses on: the impact of a disease-causing mutation on the genomic interactome.
Collaborator Contribution Expertise in the epigenetic basis of disease; extensive experience with ChIP assays, chromosome conformation capture-based techniques such as 3C and enhanced 4C-Seq (e4C-Seq), lentiviral overexpression and knock-down systems, advanced flow cytometry and sorting, protein biochemistry, cell culture and xenograft models of multiple myeloma.
Impact Awaiting new data
Start Year 2012
 
Description Proj 015 (Minczuk) 
Organisation Medical Research Council (MRC)
Department MRC Mitochondrial Biology Unit
Country United Kingdom 
Sector Public 
PI Contribution Overall: the analysis of raw sequencing data, involving mapping and alignment of mt-RNA-derived reads. Specifically: designing a deep RNA-Seq approach that allows precise analysis of mt-RNA; performing the next-generation sequencing; analysis and interpretation of large volumes of sequence data.
Collaborator Contribution Project title: RNA polyadenylation and the maintenance and expression of the mitochondrial genome. Establishing how poly(A) tails regulate mt-RNA abundance and mitochondrial protein synthesis. Identifying novel proteins that play a role in mitochondrial poly(A) tail metabolism. Determining whether poly(U) extensions play a role in RNA surveillance and/or turnover in human mitochondria. Elucidating the mechanism of mtDNA replication via sequencing of the RNAs associated with mitochondrial replication intermediates.
Impact No outcome
Start Year 2013
 
Description Proj 016 (Houlden) 
Organisation University College London
Department Institute of Neurology
Country United Kingdom 
Sector Academic/University 
PI Contribution Analysis of exome data -- focusing on channel genes and determining if channel variants are causing episodic ataxia. Specifically: Exome analysis in families, in one or more affected individuals and in unaffected cases. Exome analysis of sporadic individuals for de-novo mutations. Investigating if variants in more than one channel gene can cause sporadic episodic ataxia or if these variants modify the clinical picture, age at onset and severity.
Collaborator Contribution Project title: Collaborative project on next generation sequencing in brain channelopathies. Episodic ataxia is an inherited disorder characterized by the core features of ataxia, migraine headaches, vertigo and seizures. The main genes known to cause this disorder are CACNA1A and KCNA1. Our partners have identified a number of small kindreds that have at least one affected member with episodic ataxia and that are negative for the known channel genes, CACNA1A and KCNA1. They also have a number of sporadic cases with typical early-onset episodic ataxia; again, these are negative for the known genes, and the partners have examined and imaged the parents. Our partners wish to perform next generation exome sequencing to identify the disease gene(s).
Impact Currently writing manuscript.
Start Year 2013
 
Description Proj 018 (Bundy) 
Organisation Imperial College London
Department Faculty of Medicine
Country United Kingdom 
Sector Academic/University 
PI Contribution Data analysis including assembly, annotation, identification of SNPs and indels, and mapping them to genomic features and classification into genic/intergenic and synonymous/non-synonymous. Also exploring ways to mine datasets.
Collaborator Contribution Project title: Understanding adaptation of Pseudomonas aeruginosa to the cystic fibrosis lung environment. Our partners have contributed 96 clinical isolates taken from clonal P. aeruginosa infections of individual CF patients over many years. They have extracted genomic DNA and performed quality control.
Impact Results will be published, but we are currently at the analysis stage.
Start Year 2013
 
Description Proj 020 (Pickard) 
Organisation University of Strathclyde
Department Strathclyde Institute of Pharmacy & Biomedical Sciences
Country United Kingdom 
Sector Academic/University 
PI Contribution Aiding in experimental design; data analysis with regards to normalization and statistical significance of adjusted fusion transcript abundances; discussion of biological interpretation of data. Research question focuses on: a cell-based gene trap to reveal the mechanisms of lithium, a treatment for bipolar disorder.
Collaborator Contribution Expertise in the genetics and biology underlying risk of psychiatric disorders; provision of data for analysis
Impact Project is at the experimental validation stage.
Start Year 2012
 
Description Proj 022 (Brockdorff) 
Organisation University of Oxford
Department Department of Biochemistry
Country United Kingdom 
Sector Academic/University 
PI Contribution Analysis of H3K27me3 ChIP in Dnmt mutant cells. Analysis of ES cells grown in 2i conditions. Correlation of whole-genome bisulfite data and H3K27me3 ChIP-seq data.
Collaborator Contribution Project title: Polycomb recruitment and DNA methylation. PcG ChIP-seq experiments using transcriptional inhibitors and/or knockdown of factors required for establishment of H3K4me3 and H3K36me3, both in wild-type and in DNA methylation-depleted ES cells. The aim is to assess the role of transcription-linked histone modifications in defining PcG redistribution patterns in the presence and absence of DNA methylation.
Impact Paper: Targeting Polycomb to Pericentric Heterochromatin in Embryonic Stem Cells Reveals a Role for H2AK119u1 in PRC2 Recruitment. Cooper S, Dienstbier M, Hassan R, Schermelleh L, Sharif J, Blackledge NP, De Marco V, Elderkin S, Koseki H, Klose R, Heger A, Brockdorff N. Cell Rep. 2014 May 21.
Start Year 2013
 
Description Proj 023 (Rehwinkel) 
Organisation University of Oxford
Department Weatherall Institute of Molecular Medicine (WIMM)
Country United Kingdom 
Sector Public 
PI Contribution Bioinformatic analysis of retroelement sequences.
Collaborator Contribution Project title: Retroelements and Aicardi-Goutières syndrome. Generation of Samhd1 null mice. Preliminary data showing that IFN is spontaneously produced in Samhd1 -/- mouse embryonic fibroblasts (MEFs), macrophages, and in spleen tissue. Provision of Trex1 null mice as well as human cells from AGS patients harbouring mutations within these genes.
Impact Data generated unsuccessful
Start Year 2013
 
Description Proj 024 (Fry & Pilz) 
Organisation Cardiff University
Department School of Medicine
Country United Kingdom 
Sector Academic/University 
PI Contribution Analysis of exome data to include: Integration of different data sources: CNVs, exome data from trios (focused on DNMs) and small families (looking at rare inherited variation). Detection of mosaicism. Pathway analysis. Comparison with other brain phenotypes.
Collaborator Contribution Project title: What is the genetic basis of polymicrogyria? Our partners have contributed: Genomic DNA from whole blood (and/or saliva). Extensive phenotype data. cMRI brain scans. NHS gene tests and screening by SNP-array or array-CGH for copy number variants (CNVs).
Impact Manuscript submitted
Start Year 2013
 
Description Proj 025 (Chapman) 
Organisation University of Edinburgh
Department Centre for Cardiovascular Science
Country United Kingdom 
Sector Academic/University 
PI Contribution Library preparation, sequencing and computational/bioinformatic analysis of nucleic acid from gut contents
Collaborator Contribution Project title: GLUCOCORTICOIDS AND GUT IMMUNITY: DOES DEFICIENCY IN 11β-HSD1 ALTER THE GUT MICROBIOME? Contributions from partners: gut microbiome of 11β-HSD1-deficient and control C57BL/6J mice fed standard chow or western diet (0.21% cholesterol, 39% fat; Research Diets, D12079B) for 2 weeks (4 groups, n=6/group). The aim is to establish whether the gut microbiota of 11β-HSD1-deficient mice differs from that of controls. Combined caecal and faecal contents will be collected from mice (24 in total) and DNA extracted using a QIAamp DNA stool mini kit (Qiagen).
Impact Manuscript submitted
Start Year 2013
 
Description Proj 026 (Hemsley) 
Organisation University of Exeter
Department College of Life and Environmental Sciences
Country United Kingdom 
Sector Academic/University 
PI Contribution Analysis of gene expression data. Pathway analyses. Motif analysis.
Collaborator Contribution Project title: The glycine riboswitch in Burkholderia pseudomallei and its role in virulence. Provision of four different conditions, each with two biological replicates, for analysis: A) B. pseudomallei strain K96243 wild type grown in absence of glycine; B) K96243 wild type grown in presence of glycine; C) K96243 ΔgcvP (hypervirulent mutant) grown in absence of glycine; and D) K96243 ΔgcvP grown in presence of glycine.
Impact Awaiting data validation
Start Year 2013
 
Description Proj 027 (Pennings) 
Organisation University of Edinburgh
Department Centre for Cardiovascular Science
Country United Kingdom 
Sector Academic/University 
PI Contribution Interrogation of NGS mapping datasets in a linear way for micro patterns as well as in a more contextual mode for mega pattern analysis of large chromosomal regions.
Collaborator Contribution Project title: Epigenomic analysis of gene silencing in cardiac development and reprogramming
Impact Awaiting validation
Start Year 2013
 
Description Proj 028 (Wilson) 
Organisation University of Sheffield
Department Department of Molecular Biology and Biotechnology
Country United Kingdom 
Sector Academic/University 
PI Contribution Define the mRNA and non-coding RNA sequences bound by Nxf1, Chtop and Alyref. Examine those sequences for a) consensus motifs b) proximity to exonic splice enhancer sequences recognised by SR proteins and possibly other mRNP binding proteins c) RNA secondary structures.
Collaborator Contribution Project title: RNA export factors and their association with coding and non-coding RNAs. Our partners will provide RIP (RNA immunoprecipitation)-Chip datasets for several mRNA export factors using the Agilent mRNA array platform, and total and cytoplasmic RNA transcriptome data generated using the Agilent human mRNA array platform for cells where individual TREX subunits have been depleted.
Impact Manuscript in preparation
Start Year 2013
 
Description Proj 031 (Vance) 
Organisation University of Oxford
Country United Kingdom 
Sector Academic/University 
PI Contribution Data analysis
Collaborator Contribution Data
Impact Paper: The long non-coding RNA Paupar regulates the expression of both local and distal genes. Vance KW, Sansom SN, Lee S, Chalei V, Kong L, Cooper SE, Oliver PL, Ponting CP. EMBO J. 2014 Feb 18
Start Year 2013
 
Description Proj 032 (Marques) 
Organisation University of Oxford
Department Department of Physiology, Anatomy and Genetics
Country United Kingdom 
Sector Academic/University 
PI Contribution Analysis of the next-generation sequencing data, including contextualizing that analysis and putting forward candidate transcripts or mechanisms that will then be validated experimentally
Collaborator Contribution Project title: Contribution of miRNAs to tissue-specific pathology in Spinal Motor Atrophy. Partner's contributions: transcriptome-wide dataset containing small RNA and polyA-selected RNA-seq data for a well-established mouse model of SMA (smn+/-;SMN2+/-) and littermate controls (3 biological replicates in each condition).
Impact None -- hypothesis not borne out by data.
Start Year 2013
 
Description Projs 009 & 021 (Hollaender) 
Organisation University of Basel
Department Biozentrum Basel
Country Switzerland 
Sector Academic/University 
PI Contribution Integration of multiple data sources from different biological materials in order to define TEC sub-populations and characterize Aire-independent transcription. Research question focuses on: analysis of the complexity of promiscuous gene expression in thymic epithelial cells.
Collaborator Contribution Expertise in TEC biology; provision of data for analysis (RNA-Seq, miRNA, DNA methylation).
Impact Paper: Population and single cell genomics reveal the Aire-dependency, relief from Polycomb silencing and distribution of self-antigen expression in thymic epithelia. Sansom SN, Shikama N, Zhanybekova S, Nusspaumer G, Macaulay IC, Deadman ME, Heger A, Ponting CP, Holländer GA. 2014 Sep 15.
Start Year 2011