Revealing the hidden coding potential and dynamics of the mammalian transcriptome

Lead Research Organisation: Babraham Institute
Department Name: Immunology

Abstract

In recent years, transcriptome sequencing (RNA Seq) has become a widely used technique in both fundamental and applied biological research. The Turner Lab at the world-class Babraham Institute studies the molecular processes that control the development and function of lymphocytes. The lab uses the latest RNA Seq-based methods to explore transcriptional regulation. This work is an important step towards a systems-level understanding of immunity. The Turner Lab has partnered with Babraham-based Eagle Genomics Ltd., a leading bioinformatics cloud consultancy business, to offer an exciting PhD training opportunity due to begin in October 2014 (or potentially later in the 2014/15 academic year).

The student will develop computational approaches to extract meaning from quantitatively and qualitatively different data sets relevant to gene expression. Additionally, the student will examine sequences of ribosome-associated mRNAs for evidence of novel transcripts translated in lymphocytes. For the most part these data will be derived from RNA Seq, but other data, such as publicly available measurements of proteins (proteomics) and metabolites (metabolomics), may also be used to provide supporting evidence. The project will involve the generation of new data by the student, as well as the use of pre-existing data (or data to be generated in parallel projects in the Turner Lab or by collaborators) and publicly available data. The student will therefore receive practical training in the growth and stimulation of primary lymphocytes and in the generation of RNA Seq libraries, which provide information on RNA dynamics.
The student will also participate in the iterative process of developing data analysis pipelines. In particular, they will take approaches to modelling experiments that integrate heterogeneous 'omics data, a particular interest of Eagle Genomics. Effective communication of results and analytical methods will be key to the success of this project, and we propose to make use of publication formats that link standard manuscript publication with a database hosting all associated data, data analysis tools and cloud-computing resources.
Applicants with a background in computer science and with an interest in how gene expression is regulated are particularly encouraged to apply.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
BB/M017141/1                                   01/10/2015  30/09/2019
1642216            Studentship   BB/M017141/1  01/10/2015  30/09/2019
 
Description: Identification of the coding elements in the genome is a fundamental step in understanding the building blocks of living systems. Micropeptides of 100 amino acid residues or fewer have emerged as important regulators of development and physiology. Small open reading frames (smORFs) of 100 codons or fewer are potentially translatable into micropeptides. We have built a computational pipeline that uses sequencing data to identify translated smORFs in lymphocytes. Thousands of smORFs were predicted and are being analysed.
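
To make the 100-codon size limit concrete, the following is a minimal illustrative sketch of how candidate smORFs could be enumerated in a single transcript sequence. It is not the project's pipeline, which identifies translated smORFs from sequencing data and is not described in detail here. The function name find_smorfs, the forward-strand-only scan, the requirement for an ATG start codon, and the choice to count codons from the start codon up to but excluding the stop codon are all assumptions made for illustration.

```python
# Illustrative sketch only: enumerate candidate smORFs in one sequence.
# Assumptions (not from the project): ATG start codons only, forward
# strand only, standard stop codons, stop codon excluded from the count.

STOP_CODONS = {"TAA", "TAG", "TGA"}
MAX_CODONS = 100  # smORF size limit stated in the description


def find_smorfs(seq, max_codons=MAX_CODONS):
    """Return sorted (start, end, n_codons) tuples for candidate smORFs.

    start and end are 0-based, end-exclusive nucleotide coordinates that
    include the stop codon; n_codons excludes the stop codon. For each
    stop codon, only the most upstream in-frame ATG is reported.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    hits = []
    for frame in range(3):
        start = None  # position of the first ATG of the currently open ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon == "ATG" and start is None:
                start = i
            elif codon in STOP_CODONS and start is not None:
                n_codons = (i - start) // 3
                if n_codons <= max_codons:
                    hits.append((start, i + 3, n_codons))
                start = None  # close the ORF whether or not it was kept
    return sorted(hits)


if __name__ == "__main__":
    # A two-codon ORF (ATG GCT) followed by a TGA stop, in frame 2.
    print(find_smorfs("CCATGGCTTGACC"))  # [(2, 11, 2)]
```

A scan like this only lists ORFs that could in principle be translated; evidence of actual translation, such as ribosome association, has to come from the experimental data.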

We have so far predicted 5574 actively translated smORFs in lymphocytes. A subset of these is predicted to encode signal peptides, so the corresponding micropeptides have the potential to be secreted.
Exploitation Route: 1. Our computational pipeline will be released as open source so that the bioinformatics community can use it to analyse sequencing data. 2. The predicted micropeptides may have therapeutic applications.
Sectors: Digital/Communication/Information Technologies (including Software), Pharmaceuticals and Medical Biotechnology