LongTREC: The Long-Reads Transcriptomics European Consortium. The next generation transcriptome biology revealed by single molecule seq. technologies
Lead Research Organisation:
Earlham Institute
Department Name: Research Faculty
Abstract
Third-generation, single-molecule long-read sequencing technologies such as Nanopore and PacBio are revolutionizing transcriptome research and are expected to outcompete short-read sequencing for gene expression studies. However, technical and bioinformatics challenges stand in the way of realizing this full potential. In this project, we will train a team of excellent Early Stage Researchers (ESRs) to become the European and world leaders in the field of long-read transcriptome sequencing (lrRNA-seq). The team will address three fundamental barriers to the progress of these technologies: a) obtaining high-quality sequencing and preprocessed data from improved library preparation protocols and bioinformatics preprocessing methods, b) establishing benchmarked analysis pipelines for the application of these approaches to genome annotation and isoform usage analysis in both bulk and single-cell samples, and c) extending long reads to multi-omics applications for the study of regulatory biology. Additionally, ESRs will follow an intensive and comprehensive training program composed of scientific and soft skills training, including activities related to pressing societal challenges such as climate change, diversity and inclusion, and the ethics of science. Students will complement their education through secondments in top European laboratories and companies as well as through exposure to a network of worldwide experts in the field of RNA research.
Publications
Etherington G
(2023)
Schizosaccharomyces versatilis represents a distinct evolutionary lineage
Mincarelli L
(2023)
Single-cell gene and isoform expression analysis reveals signatures of ageing in haematopoietic stem and progenitor cells
in Communications Biology
Shen A
(2023)
U6 snRNA m6A modification is required for accurate and efficient cis- and trans-splicing of C. elegans mRNAs.
in bioRxiv : the preprint server for biology
| Description | The work under this award has led to the creation of two computational pipelines for users of long-read single-cell RNA-Seq data. 1. A computational pipeline for long-read single-cell quality assessment. The pipeline, developed in collaboration with researchers at the University of Valencia, enables users to assess single-cell long-read transcriptomic data by reporting metrics and graphical visualisations of the data. It is being incorporated as a computational module into the widely used software SQANTI. 2. A computational pipeline for the processing and analysis of PacBio long-read single-cell RNA-Seq data, enabling data filtering based on cell barcodes and unique molecular identifiers (UMIs), de novo transcriptome annotation based on the sequencing data, transcriptome filtering, and quantification, including generation of input files for Seurat. |
| Exploitation Route | The computational developments under this award will have significant implications for users of single-cell long-read data, as the pipelines are being developed to enable assessment of the data, characterisation of the transcripts arising from the system investigated, and quantification of expression. The pipelines have been built to be technology and system agnostic to facilitate their adoption. |
| Sectors | Pharmaceuticals and Medical Biotechnology |
| Description | Nearly every human gene undergoes alternative splicing, the process by which different transcripts arise from a single gene. Until recently, short-read technologies have been the main approach to quantifying gene expression and reconstructing transcripts. Transcript reconstruction has relied on probabilistic models, leading to a significant proportion of false positives. Long-read sequencing now makes it possible to capture full-length transcripts, enabling greater accuracy in both their annotation and their quantification. The work under this award has enabled engagement with industrial partners (pharma) to identify and characterise transcripts expressed in a tissue-specific manner, facilitating therapeutic target definition, and to assess the functional consequences of small molecules targeting exon inclusion. |
| First Year Of Impact | 2024 |
| Sector | Pharmaceuticals and Medical Biotechnology |
| Impact Types | Economic |
| Description | AstraZeneca |
| Organisation | AstraZeneca |
| Department | Research and Development AstraZeneca |
| Country | United Kingdom |
| Sector | Private |
| PI Contribution | Analysis of long-read bulk RNA-Seq data for the detection of splicing variation in human cell lines under different conditions |
| Collaborator Contribution | Experimental data (cells exposed to different compounds), sequencing data |
| Impact | Multidisciplinary collaboration combining molecular biologists and computational biologists |
| Start Year | 2023 |
| Description | Collaboration - JS UEA |
| Organisation | University of East Anglia |
| Country | United Kingdom |
| Sector | Academic/University |
| PI Contribution | Collaboration investigating the role of small RNAs during cardiomyocyte differentiation and the implications of their disruption. We are analysing long-read data from iPSCs and from early and late differentiation stages. |
| Collaborator Contribution | The collaborators provided cells for preparation of sequencing libraries and proteomic data |
| Impact | The collaboration is multidisciplinary, bringing together molecular biologists, cellular biologists, and bioinformaticians |
| Start Year | 2024 |
| Description | Collaboration King's College London |
| Organisation | King's College London |
| Department | Maurice Wohl Clinical Neuroscience Institute |
| Country | United Kingdom |
| Sector | Academic/University |
| PI Contribution | Computational analyses of long-read bulk and single-cell data from human iPSCs with engineered mutations for amyotrophic lateral sclerosis and from derived neurons, and of bulk long-read RNA-Seq data from mouse brains carrying human mutations for amyotrophic lateral sclerosis |
| Collaborator Contribution | Generation of iPSC lines with causative mutations, generation of the mouse lines with human mutations |
| Impact | The work is highly collaborative, including clinicians, clinical geneticists, molecular biologists, cellular biologists, and computational biologists |
| Start Year | 2023 |
| Description | Collaboration Nucleic Acid Therapy Accelerator |
| Organisation | Medical Research Council (MRC) |
| Department | Nucleic Acids Therapeutics Accelerator (NATA) |
| Country | United Kingdom |
| Sector | Public |
| PI Contribution | NATA will raise oligonucleotides against isoforms of potential therapeutic value to enable follow-up validation experiments |
| Collaborator Contribution | NATA will provide the means to knock out and knock down isoforms of interest; we will provide a list of isoforms of potential therapeutic value |
| Impact | The work is highly multidisciplinary combining molecular biology, cell biology, and bioinformatics |
| Start Year | 2024 |
| Description | Collaboration UEA Medical School |
| Organisation | University of East Anglia |
| Department | School of Medicine UEA |
| Country | United Kingdom |
| Sector | Academic/University |
| PI Contribution | Computational analyses of long-read data to assess polyadenylation site usage during cardiomyocyte maturation, both in a wild-type background and in cell lines with mutations for cardiac disorders |
| Collaborator Contribution | iPSC generation, cardiomyocyte maturation |
| Impact | Computational analyses of differential polyadenylation site usage during cardiomyocyte maturation |
| Start Year | 2024 |
| Description | Wellcome Trust Sanger Institute |
| Organisation | The Wellcome Trust Sanger Institute |
| Country | United Kingdom |
| Sector | Charity/Non Profit |
| PI Contribution | Computational analyses of single-cell long-read data (PacBio, ONT) for human iPSCs and differentiated neuronal lineages (astrocytes, motor neurons) |
| Collaborator Contribution | Cells (iPSCs, derived neuronal lineages), sequencing data |
| Impact | The work is multidisciplinary bringing together cellular biologists, molecular biologists, and computational biologists |
| Start Year | 2023 |
| Title | scSQANTI |
| Description | https://github.com/ConesaLab/scSQANTI_devel |
| Type Of Technology | Webtool/Application |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | These developments enable users to assess the quality of single-cell long-read sequencing data |
| URL | https://github.com/ConesaLab/scSQANTI_devel |
| Description | Invited seminar - Institut Pasteur - Characterisation of gene and transcript regulation in the brain across scales |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Invited speaker at the annual single cell symposium at the Institut Pasteur |
| Year(s) Of Engagement Activity | 2024 |
| Description | Invited seminar - Stockholm - What You Are Missing Matters - Characterisation of novel splicing events across brain tissues and during development |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Invited talk at an Oxford Nanopore Technologies event |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://nanoporetech.com/about/events/nanopore-days/nanopore-day-stockholm-2024#event-overview |
| Description | Invited seminar at AstraZeneca Gothenburg - Characterisation of gene and transcript regulation in the brain across scales |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Industry/Business |
| Results and Impact | Invited seminar at AstraZeneca (Gothenburg, Sweden) to present and discuss the power of long-read RNA-Seq to enable the identification of therapeutic targets |
| Year(s) Of Engagement Activity | 2024 |
| Description | PacBio webinar - Characterisation of transcript regulation in iPSCs and derived neuronal lineages |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | Regional |
| Primary Audience | Professional Practitioners |
| Results and Impact | Presentation at a webinar organised by PacBio on the use of novel library preparations |
| Year(s) Of Engagement Activity | 2024 |
| Description | Poster presentation - LRUA - Quantification of splicing variation at the single cell level during neuronal differentiation |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Postgraduate students |
| Results and Impact | Poster presentation at the Long-Read Sequencing Uppsala (LRUA) conference showcasing the advances of the project. Poster led to discussions with other attendees and contact information was shared. |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://www.lrua2024.se |
| Description | Presentation - Invited Seminar - University College Dublin - Characterization of novel splicing events across tissues and during cellular differentiation |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Invited seminar at the School of Biology and Environmental Science, University College Dublin |
| Year(s) Of Engagement Activity | 2023 |
| Description | Presentation - Invited seminar - Norwich Cancer Research Network |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | Local |
| Primary Audience | Professional Practitioners |
| Results and Impact | Presentation of the CELLGEN ISP and Single cell genomics platform to the NCRN |
| Year(s) Of Engagement Activity | 2024 |
| Description | Presentation - Invited talk - BioIndustry Association - Genomics Advisory Committee - Characterisation of novel potential targets through the characterisation of novel splicing events |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Industry/Business |
| Results and Impact | Presentation as part of the Q1 Genomics Advisory Committee meeting (BioIndustry Association) |
| Year(s) Of Engagement Activity | 2024 |
| Description | Presentation - LongTREC meeting - Single-cell sequencing with PacBio |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Postgraduate students |
| Results and Impact | Presentation of library preparation and data analysis for single-cell long-read sequencing to graduate students as part of the Marie Curie LongTREC network |
| Year(s) Of Engagement Activity | 2023 |
| Description | Presentation - Nucleic Acids Therapy Accelerator |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Industry/Business |
| Results and Impact | Presentation to the research team of the Nucleic Acids Therapy Accelerator, following a meeting with its Director at a BIA meeting. The discussion covered interactions between NATA and EI, with NATA showing interest in EI's platforms and services, as well as potential collaborations around splicing regulation and the setting up of pilot projects. |
| Year(s) Of Engagement Activity | 2024 |
| Description | Presentation - Oral Presentation - Characterization of novel splicing events across tissues and during cellular differentiation - WCBR |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Presentation at the Winter Conference on Brain Research as part of the session "Fostering Successful Partnerships Between Academia and Industry" |
| Year(s) Of Engagement Activity | 2024 |
| Description | Presentation - Quantitative gene profiling of long noncoding RNAs with targeted RNA sequencing |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Postgraduate students |
| Results and Impact | Presentation to the LongTREC Consortium |
| Year(s) Of Engagement Activity | 2023 |
| Description | Presentation to the kick-off meeting of LongTREC - Single-cell sequencing on PacBio |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Postgraduate students |
| Results and Impact | Attendance and presentation at the LongTREC kick-off meeting in Valencia |
| Year(s) Of Engagement Activity | 2023 |
| Description | Workshop - LRUA - Long-read transcriptomics: workflow and applications |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Postgraduate students |
| Results and Impact | Workshop organised by LongTREC at the Long-Read Sequencing Uppsala (LRUA) conference. Presentations led to a Q&A session with further discussion among attendees about long-read transcriptomics. |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://www.lrua2024.se/program/ |
