De novo Long Noncoding RNAs and the Tumour Microenvironment in High Grade Serous Ovarian Carcinoma

Lead Research Organisation: Queen Mary University of London
Department Name: Barts Cancer Institute

Abstract

Ovarian cancer (OC) is the 8th most common cause of cancer death worldwide, with the highest mortality rate and worst prognosis of the gynaecological cancers [1]. High Grade Serous Ovarian Carcinoma (HGSOC) accounts for over 70% of OC deaths. Its 5-year overall survival is less than 30%, and ~80% of responders relapse following primary treatment [2]. The tumour microenvironment (TME) is vital in extracellular matrix remodelling, which plays a central role in cancer initiation, progression and drug resistance [3,4]. Long non-protein-coding RNAs (lncRNAs) are of increasing importance in tumourigenesis and prognosis, and their role in HGSOC and its TME is being studied [5]. However, the functional role of lncRNAs is still poorly understood, and there remains an unmet need for novel biomarkers and therapeutic targets in HGSOC.
The project aims were to:
1. Establish a robust bioinformatic pipeline to identify and annotate known and novel lncRNAs.
2. Obtain and annotate the top cell-compartment-specific (tumour vs stroma), disease-associated lncRNAs from an HGSOC discovery dataset and from TCGA and ICGC validation datasets.
3. Perform in vitro validation of candidate lncRNAs.
Methods
RNA-seq data were obtained from metastasised HGSOC omental biopsies (n=35). We performed read alignment and de novo transcript assembly using the human reference genome (Gencode v29). We then selected candidate lncRNAs by filtering out:
mono-exonic genes,
genes < 200 nt in length,
Ensembl coding genes,
Ensembl pseudogenes,
de novo transcripts of unknown coding potential, and
Ensembl genes not annotated with a "lincRNA", "non_coding", "antisense", "3_prime_overlapping_ncrna", "processed_transcript", "miRNA", "misc_RNA", "polymorphic_pseudogene", "processed_pseudogene" or "pseudogene" biotype.
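The selection criteria above can be sketched as a simple filter. This is only an illustration, not the project's pipeline: the `Gene` record, its field names and the `is_candidate_lncrna` function are hypothetical. The abstract does not say how the pseudogene exclusion interacts with the pseudogene biotypes in the allowed list, so the sketch leaves that to an `excluded_biotypes` parameter. It also assumes that de novo genes are kept only when predicted to be noncoding, which is stricter than the stated criterion.

```python
from dataclasses import dataclass
from typing import Optional

# Ensembl biotypes retained, as listed in the Methods text.
ALLOWED_ENSEMBL_BIOTYPES = frozenset({
    "lincRNA", "non_coding", "antisense", "3_prime_overlapping_ncrna",
    "processed_transcript", "miRNA", "misc_RNA", "polymorphic_pseudogene",
    "processed_pseudogene", "pseudogene",
})


@dataclass
class Gene:
    """Hypothetical record for one assembled gene."""
    gene_id: str
    n_exons: int
    length_nt: int
    source: str                             # "ensembl" or "de_novo"
    biotype: Optional[str] = None           # Ensembl biotype, if annotated
    coding_potential: Optional[str] = None  # "noncoding", "coding" or None (unknown)


def is_candidate_lncrna(gene, excluded_biotypes=frozenset({"protein_coding"})):
    """Return True if the gene passes every filter in this sketch."""
    if gene.n_exons < 2:        # drop mono-exonic genes
        return False
    if gene.length_nt < 200:    # drop genes shorter than 200 nt
        return False
    if gene.source == "de_novo":
        # Assumption: keep de novo genes only when predicted noncoding.
        return gene.coding_potential == "noncoding"
    if gene.biotype in excluded_biotypes:   # coding (and, optionally, pseudogene) biotypes
        return False
    return gene.biotype in ALLOWED_ENSEMBL_BIOTYPES


genes = [
    Gene("g1", 3, 1500, "ensembl", biotype="lincRNA"),
    Gene("g2", 1, 1500, "ensembl", biotype="lincRNA"),         # mono-exonic
    Gene("g3", 2, 150, "ensembl", biotype="antisense"),        # too short
    Gene("g4", 4, 2000, "ensembl", biotype="protein_coding"),  # coding
    Gene("g5", 2, 800, "de_novo", coding_potential="noncoding"),
    Gene("g6", 2, 800, "de_novo"),                             # unknown coding potential
]
kept = [g.gene_id for g in genes if is_candidate_lncrna(g)]
print(kept)
```

Passing a larger `excluded_biotypes` set (for example one that includes specific pseudogene biotypes) would tighten the filter without changing the rest of the logic.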
We obtained the normalised count data and identified disease-associated lncRNAs as those that significantly correlated with the HGSOC disease score [6], using a Partial Least Squares (PLS) regression model. We identified cell-compartment-specific lncRNAs using the ratio of gene expression between tumour and stromal laser-capture-microdissected samples (n=4). Using the TCGA dataset (n=374), we validated our disease-associated and cell-compartment-specific lncRNAs. We also identified lncRNAs whose expression significantly correlated with patient survival. We further validated our disease-associated, cell-compartment-specific lncRNAs using immune and stromal gene signatures [11].
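The stroma-to-tumour expression ratio might be computed along the following lines. This is a minimal sketch under stated assumptions: expression is averaged across the samples of each compartment, a pseudocount of 1 avoids division by zero, and a gene is called compartment-specific when the absolute log2 ratio is at least 1. The function names, the pseudocount and the direction of the cut-off are all assumptions for illustration. The abstract gives only the ratio itself and a threshold of 1.

```python
import math


def log2_stroma_tumour(stroma, tumour, pseudocount=1.0):
    """log2 of mean stromal over mean tumour expression (pseudocount is an assumption)."""
    mean_stroma = sum(stroma) / len(stroma)
    mean_tumour = sum(tumour) / len(tumour)
    return math.log2((mean_stroma + pseudocount) / (mean_tumour + pseudocount))


def compartment(log_ratio, threshold=1.0):
    """Label a gene by compartment; the cut-off direction is an assumption."""
    if log_ratio >= threshold:
        return "stroma"
    if log_ratio <= -threshold:
        return "tumour"
    return "unspecific"


# Toy normalised counts for one gene in each compartment (invented values).
ratio = log2_stroma_tumour(stroma=[7.0, 7.0], tumour=[1.0, 1.0])
print(ratio, compartment(ratio))
```

A positive ratio indicates higher stromal expression and a negative ratio higher tumour expression, so a symmetric threshold treats both compartments alike.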

Results
A total of 76,322 genes, including 18,252 de novo genes, aligned to the human reference genome. We identified 1,523 genes as lncRNAs, comprising 1,130 annotated and 393 de novo lncRNAs. We visualised various classes of lncRNAs, including intergenic, antisense and putative enhancer de novo lncRNAs, in the Integrative Genomics Viewer (IGV). Of the identified lncRNAs, 92 annotated and 27 novel lncRNAs significantly correlated (PLS regression cut-off) with the HGSOC disease score [6]. Of these, 79 lncRNAs showed specificity for either the tumour or the stromal compartment (Log2(stroma/tumour) < |1|). Using the TCGA dataset (n=374), we validated 61 disease-associated lncRNAs. Of these, 25 lncRNAs showed significant differences in overall survival (p<0.05; Kaplan-Meier). Examples of these lncRNAs include DANCR, HOXB-AS3 and l_LNC141112 (a de novo lncRNA). Using ESTIMATE's immune and stromal gene signatures [11], we further validated 6 significant cell-compartment-specific lncRNAs (p<0.05, Student's t-test between high vs low expression of ESTIMATE's gene signatures [11]). Examples of these lncRNAs include LINC00665, l_LNC14112 and MIR22HG.
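For readers unfamiliar with the Kaplan-Meier method used in the survival comparison, a minimal product-limit estimator is sketched below. It is an illustration only: the function name and the toy follow-up data are invented, and the sketch estimates a single survival curve without the significance test that would compare high- and low-expression groups.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with an observed death.

    times  -- follow-up durations, one per patient
    events -- 1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)          # everyone leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]         # remove deaths and censored patients
    return curve


# Toy data (invented): follow-up in months, with two censored patients.
curve = kaplan_meier([5, 8, 8, 12, 15, 20], [1, 1, 0, 1, 0, 1])
for t, s in curve:
    print(t, round(s, 3))
```

In practice the curve would be estimated separately for patients with high and low expression of a given lncRNA, and the two curves compared with a test such as the log-rank test.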

Publications

 
Description Athena Swan Assessment Team for Barts and The London School of Medicine and Dentistry, QMUL (2021 Submission)
Geographic Reach Local/Municipal/Regional 
Policy Influence Type Membership of a guideline committee
 
Description Workforce Diversity: Let's talk about race (by Medical Research Council)
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
URL https://elifesciences.org/inside-elife/313036d9/workforce-diversity-let-s-talk-about-race
 
Description Instagram Takeover for Black History Month 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact Instagram takeover of the BCI Instagram account. I spent a day posting stories about myself and my research, together with a series of posts on the importance of Black representation in science, followed by a Q&A with BCI's Instagram followers. Engagement was strong: at the time, it drew the most questions received for any BCI Instagram takeover. 17 questions were asked and we answered 12. Most of the roughly 30 story posts had a reach of 150-200 (reach is the total number of people who see the content). The takeover received very positive feedback from the Director's office.
Year(s) Of Engagement Activity 2021
URL https://www.instagram.com/qmbci/?hl=en
 
Description Women's Day at Barts Cancer Institute 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact Wrote a Q&A article about myself, my research, my roles and my female inspirations for International Women's Day, which was published on the Institute's website.
Year(s) Of Engagement Activity 2020
URL https://www.bartscancer.london/general-news/2020/03/international-womens-day-2020/