Understanding biological functions of repeat-containing noncoding RNAs

Lead Research Organisation: King's College London
Department Name: Developmental Neurobiology

Abstract

DNA has traditionally been thought to store information about the development and function of our body by encoding numerous proteins with distinct enzymatic, regulatory and structural functions. Yet, only ~2% of the human genome is actually used for this purpose by producing RNA messengers subsequently translated into proteins. A large fraction of the remaining 98% - often referred to as the genome's "dark matter" - has been recently shown to give rise to relatively long non-protein-coding RNAs (lncRNAs) with poorly understood functions.

Notably, this part of our genome is also enriched in repeated DNA sequences including so-called short tandem repeats (STRs). Expression of a few aberrantly long STR arrays expanded as a result of genetic mutations is known to lead to devastating degenerative diseases by recruiting multiple copies of important RNA-binding proteins (RBPs) to the repeated sequence units. On the other hand, the expression status and possible biological functions of most STRs encoded in healthy human genomes have not been investigated systematically.

We propose to test the hypothesis that STR-containing lncRNAs function as pervasive regulators of cellular RNA metabolism by forming multiple contacts with corresponding RBPs and controlling their activity and cellular localization. We will develop our program by pursuing two interrelated objectives.

In the first objective, we will estimate the number of STR-lncRNAs encoded in the genome and ask how these molecules can contribute to RBP regulation. For this purpose, we will identify STR-lncRNAs expressed in biomedically important samples - including cancer cells and developing neurons - by using an innovative combination of computational and experimental tools. The most interesting examples of the newly discovered STR-lncRNAs will be examined for their RBP interaction properties, localization, and possible role in shaping cellular gene expression.

In the second objective, we will focus on a specific STR-lncRNA called PNCTR that we uncovered in our preliminary work. We already know that PNCTR is over-produced in cancer cells where it focuses a critical RBP, polypyrimidine tract-binding protein (PTBP1), in a dot-like pattern characteristic of high-grade and metastatic tumors. Moreover, PNCTR might be required for cell survival and proliferation. We will continue this line of research by elucidating the mechanisms controlling PNCTR expression and underlying its functional contributions to cancer biology.

All in all, these studies should substantially improve our understanding of the "dark matter" of the human genome, provide fundamental insights into regulation of RNA metabolism, and ultimately open up new avenues in disease diagnosis and treatment.

Technical Summary

Short tandem repeats (STRs) consist of 2-12 nt sequence units concatenated in a head-to-tail manner. Although STRs are widespread in a healthy human genome, their expression status and possible functions of the resultant RNAs have not been investigated systematically. Notably, transcription of aberrantly expanded STR arrays is known to participate in pathogenesis of several degenerative diseases by recruiting numerous copies of RNA-binding proteins (RBPs) with matching interaction site preferences. Here we propose to test the hypothesis that STR-containing long noncoding RNAs (STR-lncRNAs) play a substantially wider role in controlling RBP activity and cellular localization than currently thought. We will pursue two interrelated objectives. First, we will estimate the number of STR-lncRNAs encoded in mammalian genomes and ask how these transcripts contribute to regulation of cellular RNA metabolism. We will identify new STR-lncRNAs using an innovative bioinformatics approach followed by rigorous validation experiments. This will allow us to assemble an STR-lncRNA database and address molecular functions of a subset of these transcripts differentially expressed between normal and cancer cells or regulated in developing neurons. In the second objective, we will focus on the STR-lncRNA PNCTR uncovered in our preliminary work as a multivalent ligand of an important regulator of cellular RNA metabolism, polypyrimidine tract-binding protein (PTBP1). PNCTR is over-expressed in transformed cells where it recruits PTBP1 to the cancer-enriched perinucleolar compartment (PNC). We will continue this line of research by elucidating the mechanisms that control PNCTR expression and understanding functions of this STR-lncRNA in PTBP1 regulation, PNC assembly and cancer biology. Taken together, these studies will delineate biological functions of an emerging class of lncRNAs and provide new insights into regulation of RNA metabolism in biomedically important contexts.

Planned Impact

Training workforce for the UK economy
The proposed research program will provide a framework for professional training of two postdoctoral researchers. They will master standard biology techniques as well as computational and high-throughput experimental approaches - a highly desirable set of skills for modern biomedical research. Both postdocs will additionally acquire advanced communication and managerial skills, as well as experience in supervising students. This comprehensive training will maximize their value as skilled employees capable of making important contributions to UK academia and industry on completion of their stints in the PI's lab. Of note, former members of our lab work in a range of sectors from academia and clinics to patent law and regulatory affairs.

Education and public engagement
The program will also allow us to make a lasting impact in the education sector. In addition to supervising KCL students, we will host summer projects of two students from the Judd School, Kent and a sixth-form student from a London state school. The first-hand experience in modern biomedical research will raise students' awareness of different science professions and enhance their educational opportunities. We will also collaborate with the Science Gallery London during their 2019 Dark Matter exhibition season. Together with an emerging artist and young people from the local community we will co-design and produce a small-scale installation exploring the "dark matter" of the human genome. The installation will raise people's awareness that a large part of our genome (e.g. the repeated elements that we will examine in our studies) is still underexplored, and that further research is needed to understand its functions. The Dark Matter season is expected to attract over 100,000 visitors - an excellent platform to enhance the reach and impact of our research. We will also come to the Gallery in our free time to help the visitors interact with the installation. This will provide opportunities for conversation, questions and deeper engagement with the artwork and the scientific problems it represents. All in all, these activities will contribute to scientific education in the UK and make functional genomics a part of a wide cultural landscape.

Medicine and forensics
Our research focus on short tandem repeats (STRs) should additionally deliver long-term impacts in medicine and forensics. Several degenerative diseases associated with considerable morbidity and mortality and a major economic burden (e.g., Amyotrophic Lateral Sclerosis, Myotonic Dystrophy and Fragile X-associated Tremor/Ataxia Syndrome) have been linked with transcription of expanded STRs. Interestingly, STRs also constitute a highly variable fraction of healthy human genomes that is extensively used in forensics for DNA-based testing. By shedding light on STR functions, our work should generate valuable insights into the molecular etiology of relevant disorders, ultimately leading to improved therapies and diagnostic tools. Similarly, discovery of highly abundant STR-containing RNA in our studies may help forensic scientists increase the sensitivity and accuracy of molecular identity tests. To maximize the impact of our work in these fields, we will initiate relevant collaborations with clinical and forensic scientists at KCL and elsewhere in the UK and will publish our results as open-access papers in journals with the widest possible readership. The PI and the postdoctoral researchers involved in the program will also present their work at international meetings attended by both scientists and clinicians. An important deliverable of our program will be the development and maintenance of a publicly available online database of experimentally validated STR-lncRNAs. We are convinced that, along with our papers and meeting presentations, this resource will facilitate future medical and forensic innovations.

Publications

Description We carried out extensive molecular characterization of the repeat-containing long noncoding RNA (lncRNA) PNCTR identified in our earlier studies. We showed that this RNA is expressed at relatively high levels in many malignant tumors and that it is required for cancer cell viability. The pro-survival function of PNCTR depends, at least in part, on its ability to sequester the RNA-binding protein PTBP1 in a nuclear body called the perinucleolar compartment (PNC). This in turn controls the splicing regulation activity of PTBP1, leading to several changes in cellular gene expression. Interestingly, one important target of PTBP1 controlled through PNCTR/PNC is a pre-mRNA encoding the checkpoint kinase CHEK2 involved in regulation of cell survival. A paper describing this part of our work has been published in Molecular Cell (https://www.ncbi.nlm.nih.gov/pubmed/30318443) and highlighted in a preview article in the same issue (https://www.ncbi.nlm.nih.gov/pubmed/30388407). Our ongoing studies are focused on understanding the molecular mechanisms leading to over-expression of PNCTR in cancer cells and on characterizing the biological functions of other repeat-containing lncRNAs.

Our interest in repeat-containing transcripts also helped us characterize a new mechanism sustaining molecular identity of pluripotent stem cells. We showed that these cells express large amounts of the RNA-associated factor SRRT, which promotes expression of hundreds of genes by antagonizing premature termination of transcription at cryptic cleavage/polyadenylation sites in first introns. Notably, many such sites appeared in evolution through recurrent integration of retrotransposable repeats into genomic DNA. We published our initial findings in Nature Communications in January 2020 (https://www.nature.com/articles/s41467-019-14204-z) and are currently pursuing further experiments in this direction.

This grant additionally led to development of a robust protocol for differentiation of embryonic stem cells into neurons in vitro. This resource has been instrumental for our collaborative project on understanding nuclear mechanisms regulating expression of distinct isoforms of an important neurotrophic factor (Elife 2021; https://elifesciences.org/articles/65161). Finally, our research interest in noncoding RNAs and RNA-containing compartments facilitated development of Hybridization-Proximity (HyPro) labeling technology described in our 2022 Mol.Cell and STAR Protocols papers (https://doi.org/10.1016/j.molcel.2021.10.009; https://doi.org/10.1016/j.xpro.2022.101139).
Exploitation Route This line of research should provide new insights into pre-mRNA splicing, non-coding RNAs, sub-cellular compartmentalization, and evolution. In the long run, it may improve diagnostics and treatment of cancer and lead to development of novel stem cell-based approaches for research and therapeutic applications.
Sectors Education,Healthcare,Pharmaceuticals and Medical Biotechnology

URL https://devneuro.org/cdn/news-detail.php?NewsID=298&type=93
 
Description Merck contacted us with a request to license commercial use of our newly designed HyPro labeling enzyme. The Merck and King's teams are currently negotiating the conditions.
First Year Of Impact 2022
Sector Education
Impact Types Societal,Economic

 
Description COVID 19 Grant Extension Allocation Kings College London
Amount £3,184,274 (GBP)
Funding ID EP/V520482/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 06/2020 
End 09/2021
 
Title A new computational tool to annotate custom transcriptomes 
Description Together with our collaborators from King's and the University of Tartu in Estonia, we developed an R/Bioconductor package, factR, for functional annotation of custom transcriptomes. 
Type Of Material Physiological assessment or outcome measure 
Year Produced 2022 
Provided To Others? Yes  
Impact This tool is helping our group to perform BBSRC-funded studies. It is also used by other researchers interested in genomics, alternative splicing, and nonsense-mediated decay. 
URL https://bioconductor.org/packages/release/bioc/html/factR.html
 
Title Hybridization-proximity labeling
Description We developed a new technology termed Hybridization-Proximity (HyPro) labeling that allows discovery of protein and RNA neighbors of a transcript of interest in genetically unperturbed cells. The method is described in our recently published papers in Mol Cell (https://doi.org/10.1016/j.molcel.2021.10.009) and STAR Protocols (https://doi.org/10.1016/j.xpro.2022.101139). 
Type Of Material Technology assay or reagent 
Year Produced 2022 
Provided To Others? Yes  
Impact We have received several requests from academia and industry to adapt HyPro labeling to a wide range of biomedical projects. 
URL https://doi.org/10.1016/j.molcel.2021.10.009
 
Title Inducible embryonic stem cells 
Description We developed an embryonic stem cell line that can be induced to differentiate into glutamatergic neurons by simple doxycycline treatment. 
Type Of Material Cell line 
Year Produced 2021 
Provided To Others? Yes  
Impact This cell line will be useful for many academic and industrial labs interested in brain development and function. 
URL https://doi.org/10.7554/elife.65161
 
Title HyPro-MS analysis of proteins proximal to nuclear noncoding RNAs in HeLa cells 
Description HyPro-MS dataset 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact This dataset provides a valuable resource for researchers investigating noncoding RNAs, RNA-containing nuclear compartments and RNA-protein interactions. 
URL http://proteomecentral.proteomexchange.org/cgi/GetDataset?ID=PXD025264
 
Title HyPro-seq analysis of HeLa and ARPE-19 cells 
Description HyPro-seq dataset 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact This dataset provides a valuable resource for researchers investigating noncoding RNAs, RNA-containing nuclear compartments and RNA-protein interactions. 
URL https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-10365
 
Title RNA-seq analysis of HeLa cells treated with GapmeR antisense oligonucleotides 
Description RNA-seq dataset 
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? Yes  
Impact This dataset provides a resource for researchers interested in biological functions of nuclear RNA-containing compartments. 
URL https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6529
 
Description Bioinformatics pipeline to analyze gene evolution 
Organisation Francis Crick Institute
Country United Kingdom 
Sector Academic/University 
PI Contribution We developed a bioinformatics pipeline to analyze gene evolution.
Collaborator Contribution Our collaborators carried out experiments to dissect the rules for structural adaptation of membrane-associated proteins to evolutionary changes in membrane lipidome.
Impact A paper has been published in Current Biology. The collaboration is multidisciplinary since it combines cell biology, evolutionary biology, lipidomics, and bioinformatics.
Start Year 2019
 
Description Developing an AS-NMD analysis pipeline 
Organisation University of Tartu
Country Estonia 
Sector Academic/University 
PI Contribution Participated in development of an R/BioConductor package for discovery of new genes regulated by alternative splicing coupled with nonsense-mediated decay (AS-NMD)
Collaborator Contribution Participated in development of an R/BioConductor package for discovery of new genes regulated by alternative splicing coupled with nonsense-mediated decay (AS-NMD)
Impact A prototype package has been developed. This multi-disciplinary activity combines our expertise in molecular biology with the extensive bioinformatics expertise of the BIIT group, Tartu University (https://biit.cs.ut.ee/).
Start Year 2018
 
Description Gene regulation in neuronal progenitor cells 
Organisation King's College London
Country United Kingdom 
Sector Academic/University 
PI Contribution We helped our colleague at the Centre for Developmental Neurobiology to analyze gene expression in neuronal progenitor cells
Collaborator Contribution Experimental work on neuronal progenitor cells
Impact A paper published in Developmental Cell in 2020
Start Year 2018
 
Description In vitro neuronal differentiation protocol 
Organisation Tallinn University of Technology
Country Estonia 
Sector Academic/University 
PI Contribution We developed a robust protocol for differentiation of embryonic stem cells into neurons in vitro.
Collaborator Contribution Our collaborators carried out a series of experiments to understand nuclear mechanisms regulating expression of distinct isoforms of an important neurotrophic factor, BDNF
Impact A paper published in Elife in 2021
Start Year 2018
 
Description Role of mutant CSRP3 protein in cardiomyopathy 
Organisation University of Oxford
Country United Kingdom 
Sector Academic/University 
PI Contribution We used bioinformatics pipelines developed as a part of our BBSRC-supported research to understand molecular mechanisms linking mutations in the cysteine and glycine rich protein 3 (CSRP3) with hypertrophic cardiomyopathy.
Collaborator Contribution Our collaborators developed a mouse model for hypertrophic cardiomyopathy (HCM) and carried out its extensive phenotypic characterization at the Division of Cardiovascular Medicine, Radcliffe Department of Medicine, Oxford University. Data obtained as a result of these efforts suggest that reduced levels of functional CSRP3 may be a common mechanism underlying HCM.
Impact Results of this collaboration have been published in the following open-access paper: J Mol Cell Cardiol. 2018 Aug;121:287-296. doi: 10.1016/j.yjmcc.2018.07.248. Epub 2018 Jul 23. (https://www.jmmc-online.com/article/S0022-2828(18)30692-8/fulltext)
Start Year 2018
 
Description Bioinformatics training for international students 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact We provided training in bioinformatics to 6 post-graduate students from Estonia and Germany who visited us for 1-2 months each as part of the SZ-TEST exchange program. Although the actual visits were funded by another grant (Marie Sklodowska-Curie Actions Research and Innovation Staff Exchange), we used bioinformatics tools developed as a part of our BBSRC-supported projects for training purposes.
Year(s) Of Engagement Activity 2018,2019,2020
URL https://sztest.eu/
 
Description CDN press release 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact We participated in drafting a press release on our recently published paper in Molecular Cell. The press release was published on the CDN website: https://devneuro.org/cdn/news-detail.php?NewsID=448&type=93
Year(s) Of Engagement Activity 2021
URL https://devneuro.org/cdn/news-detail.php?NewsID=448&type=93
 
Description CDN press release 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact We participated in drafting a press release highlighting the significance of our recently published paper in Nature Communications. The press release was published on the Centre for Developmental Neurobiology website: https://devneuro.org/cdn/news-detail.php?NewsID=352&type=93
Year(s) Of Engagement Activity 2020
URL https://devneuro.org/cdn/news-detail.php?NewsID=352&type=93
 
Description Hosting a Wellcome Trust-supported undergraduate project 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Undergraduate students
Results and Impact We hosted a 2-month research project for an undergraduate student from King's College London supported by a Wellcome Trust Biomedical Vacation Scholarship. The activity provided the student with an opportunity to apply for research funding and acquire a set of practical skills in molecular and cellular biology. We are convinced that this experience promoted the student's interest in science and expanded his career possibilities in this field.
Year(s) Of Engagement Activity 2018
 
Description Organization of symposium on gene expression in health and disease 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact We co-organized a one-day symposium on "Gene expression in health and disease", an event designed to allow mainly students and postdocs to present their research data to an international audience from Estonia, Finland and the UK. Three postdoctoral fellows involved in our BBSRC-supported research were selected to give 15-min presentations. This is an excellent framework for sharing scientific knowledge, fostering future collaborations and improving the presentation skills of young scientists.
Year(s) Of Engagement Activity 2018
 
Description Organizing RNA UK 2020 meeting 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact We co-organized the RNA UK 2020 meeting attended by researchers, post-graduate students, and business/media sponsors. Approximately 160 people from the UK, the Netherlands, Australia, Denmark and Italy attended this 3-day event to discuss recent progress in the RNA field. The event was well received by the RNA community and plans were made to organize the next meeting in 2022.
Year(s) Of Engagement Activity 2020
URL https://www.rnasociety.org/Conferences/rna-uk-2020/
 
Description Participation in "Glow in the dark science" public outreach program in London primary schools 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact This project is a part of the British Science Week event (https://www.britishscienceweek.org/). It introduces primary school children to science in a playful manner. The project is based on activity stations where the pupils learn the basics of how fluorescence works and observe fluorescently labelled fish and fruit flies, among others. Most pupils attending this event clearly enjoyed the program and asked relevant questions. We believe this project should have helped the pupils develop an interest in life sciences and biomedical research.
Year(s) Of Engagement Activity 2016,2017,2018
URL https://devneuro.org/cdn/news-detail.php?NewsID=210&type=91
 
Description Participation in DevNeuro Academy 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact Members of the Makeyev lab participated in the DevNeuro Academy project consisting of a regular program of activities designed to improve the progression and success of school students currently under-represented at our university and other institutes of higher education. The project combines a series of four in-school interactive 'Discovery workshops' with a two-week laboratory summer research work experience at the Centre for Developmental Neurobiology, KCL.
Year(s) Of Engagement Activity 2018,2022
URL https://devneuro.org/cdn/public-engagement-dna.php
 
Description Press release article by Molecular Cell highlighting our publication 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact A highlight article showcasing our publication in the same issue of Molecular Cell
Year(s) Of Engagement Activity 2018
URL https://www.ncbi.nlm.nih.gov/pubmed/30388407
 
Description Press release on a KCL website 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact We participated in drafting a press release highlighting the significance of our work. The press release was published on the Centre for Developmental Neurobiology website:
https://devneuro.org/cdn/news-detail.php?NewsID=298&type=93
Year(s) Of Engagement Activity 2018
URL https://devneuro.org/cdn/news-detail.php?NewsID=298&type=93
 
Description School student research internship 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Schools
Results and Impact We organized and hosted a one-week summer research attachment for a student from the Bullers Wood School (Chislehurst BR7 5LJ). The student learned new scientific concepts and experimental techniques. Based on our discussions with the student and the parents, this experience appears to have sparked the student's interest in biomedical research.
Year(s) Of Engagement Activity 2019,2022