Direct sequencing of modified nucleotides in viral, host, and therapeutic RNAs

Lead Research Organisation: University of Nottingham
Department Name: School of Veterinary Medicine and Sci

Abstract

Understanding and exploiting messenger RNA requires techniques that assay features beyond the raw sequence, such as the modifications made to mRNA and the stages of its life cycle. A step change towards "big data" is needed to enable mapping of multiple modifications to individual transcripts and isoforms at the whole-transcriptome level.

The realisation that mRNA nucleotide modifications are dynamically written, erased, and recognised by specific proteins has driven a paradigm shift in RNA biology. Over 160 modifications have so far been identified on RNA [1], with fundamental roles in all stages of the RNA life cycle, gene expression, and disease [2,3]. The emerging field of epitranscriptomics seeks to characterise these enigmatic modifications. Many viruses encode their own modification writers, or capture those of the host, to evade or exploit host mechanisms, whilst therapeutic RNA must do the same in order to fine-tune the activities of the molecule in the host. Greater understanding of modification deposition and its biological impact expands the toolkit with which we can combat viral infections or fine-tune mRNA therapeutics.

Milestone 1: Student develops expertise in nanopore sequencing, oligonucleotide synthesis, and established assays. Established methods use radio-labelling or antibodies against specific modified nucleotides to characterise the epitranscriptome. Nanopore direct RNA sequencing can reveal the complexities of mRNA processing in full-length, single-molecule reads. Nanopore basecalling algorithms interpret the signals produced by multi-nucleotide "kmers" occupying the pore, inferring sequence as strands progress through in a "one-out-one-in" fashion [4]. Consequently, whilst RNA can be read "directly", fidelity is lower than that of Illumina sequencing [5], and most RNA-sequencing strategies therefore rely on conversion to cDNA, a bias-laden process [6] that eliminates modification information.
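The "one-out-one-in" progression described above can be pictured as a window sliding along the strand one nucleotide at a time. The following is a minimal illustrative sketch only: the helper name `sliding_kmers` and the window size of five are assumptions made for the example, since the number of nucleotides sensed by a given pore is not stated here.

```python
def sliding_kmers(sequence, k=5):
    """List the successive kmers occupying the pore as a strand advances.

    k=5 is an assumed window size for illustration; the real value
    depends on the pore and is not specified in this abstract.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Each step drops one nucleotide from the front and admits one at the back.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Hypothetical nine-nucleotide RNA fragment.
kmers = sliding_kmers("AUGGCCAUU", k=5)
for kmer in kmers:
    print(kmer)
```

Consecutive kmers share all but one nucleotide, which is why each signal level reflects several bases at once and sequence must be inferred rather than read off one base at a time.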

Milestone 2: Student develops double-stranded RNA-seq (dsRNA-seq), whereby target RNA is copied by a viral RNA-dependent RNA polymerase and the complementary RNA strand is joined to the template at one end. Each assayed nucleotide is thereby paired to an unmodified complement. The long read lengths enabled by nanopore sequencing, and achieved in the Loose lab [7], can then provide two reads for each RNA molecule assayed. The copied RNA will confirm the unmodified identity of the sense strand. In the context of modifications, any deviation from the expected nanopore "squiggle", as acquired from the complementary RNA (cRNA) strand, will signify a potential modification site. Novel training sets will also be generated. Using the synthesis techniques developed in the Hayes lab [8], an oligonucleotide kmer population can be generated in a single reaction, with a modified nucleotide flanked by four non-modified nucleotides on each side in all 65,536 (4^8) sequence combinations. Concatenating these before subjecting them to dsRNA-seq will give both the novel kmer "squiggle" and confirmation of the kmer in the complementary read.
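The figure of 65,536 is consistent with eight variable positions, each taking one of four bases (4^8), i.e. four unmodified nucleotides on each side of the modified one. That reading is an interpretation rather than something stated explicitly, and the sketch below simply checks the arithmetic under it. The names `count_contexts` and `iter_contexts`, and the placeholder symbol "X" for the modified nucleotide, are hypothetical.

```python
from itertools import product

BASES = "ACGU"  # the four unmodified RNA nucleotides
FLANK = 4       # assumption: four unmodified positions on EACH side


def count_contexts(flank=FLANK):
    """Number of distinct flanking contexts around one modified nucleotide."""
    return len(BASES) ** (2 * flank)


def iter_contexts(modified="X", flank=FLANK):
    """Yield every kmer with the modified nucleotide in the centre.

    "X" is a placeholder symbol for the modified nucleotide.
    """
    for combo in product(BASES, repeat=2 * flank):
        yield "".join(combo[:flank]) + modified + "".join(combo[flank:])


total = count_contexts()
first = next(iter_contexts())
print(total, first)
```

Under this assumption each member of the population is nine nucleotides long, with the modified position at the centre.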

Milestone 3: Student utilises established methods and dsRNA-seq to detect modified nucleotides in mRNA of host and viral origin. This will provide new fundamental insights into correlations between the presence of modifications and distal splice-site or processing variants.

Milestone 4 and placement: Student applies the expertise and methods developed in milestones 1-3 in collaboration with AstraZeneca. This will complement ongoing work at AstraZeneca on developing treatments based on proprietary RNA technologies, which requires the generation of mRNAs with multiple types of modification throughout their length to fine-tune host metabolism of the molecule.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
BB/T008369/1                                   01/10/2020  30/09/2028
2593525            Studentship   BB/T008369/1  01/10/2021  30/09/2025