Real Time Nanopore Sequencing at Scale - Read Until and Run Until on PromethION scale devices.

Lead Research Organisation: University of Nottingham
Department Name: School of Life Sciences

Abstract

Rapid sequencing of nucleic acids has transformed our understanding of the biological world. New technologies and methods bring new challenges in developing optimal approaches to address biological questions. Only Oxford Nanopore Technologies (ONT) instruments enable real-time sequencing, in which sequence data can be analysed while it is being generated. Nanopore sequencing works by measuring current flow as DNA passes through a nanopore. The current depends on the DNA sequence within the pore, and so generates a current trace (the "squiggle") that can be decoded to reveal the DNA sequence.

We have exploited this technology in a number of ways, notably establishing "read until". Here the start of a molecule is analysed whilst it is being sequenced. If the molecule matches a region of interest, sequencing continues. If not, the voltage across that specific channel can be reversed and the read rejected. In principle, this enables selective sequencing of specific molecules from a library. We were the first to demonstrate that this method works, using dynamic time warping to map the nanopore squiggle to a reference. Subsequently, working through an iCASE studentship between ONT and Nottingham, we have jointly developed an approach to basecall sequence data directly from the squiggle trace whilst sequencing. This method removes the limitations of searching within signals and can be readily used by others with access to the same hardware. At present, the method is limited to specific ONT sequencers (the GridION and MinION). These platforms have the capacity to generate 10-20 Gb of data per flowcell. The PromethION sequencer can generate in excess of 100 Gb of sequence data per flowcell, from up to 48 positions simultaneously. This enables sequencing of a human genome with long reads within 60 hours on a single flowcell. In principle, selective sequencing at this scale could enable many useful applications, such as low-coverage whole-genome sequencing coupled with higher coverage of specific regions, for example known cancer-causing SNPs or structural variants of interest.
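The per-read decision described above can be illustrated with a minimal Python sketch. It is not the method developed in this project and does not use any ONT software interface: the function names (toy_map, decide), the exact-match "mapper", the minimum prefix length and the three decision labels are all assumptions made for illustration. A real implementation would basecall the signal and use a proper aligner.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a contig


def toy_map(prefix: str, reference: Dict[str, str]) -> Optional[Tuple[str, int]]:
    """Hypothetical stand-in for an aligner: exact substring search."""
    for contig, seq in reference.items():
        pos = seq.find(prefix)
        if pos != -1:
            return contig, pos
    return None


def decide(
    prefix: str,
    reference: Dict[str, str],
    targets: Dict[str, List[Interval]],
    min_len: int = 8,
) -> str:
    """Decide what to do with a read from its already-basecalled prefix.

    Returns "continue" (keep sequencing), "unblock" (reverse the voltage
    and reject the read) or "wait" (not enough evidence yet).
    """
    if len(prefix) < min_len:
        return "wait"  # too short to place reliably
    hit = toy_map(prefix, reference)
    if hit is None:
        return "wait"  # unmapped so far; look again with more data
    contig, pos = hit
    for start, end in targets.get(contig, []):
        if start <= pos < end:
            return "continue"  # read starts inside a region of interest
    return "unblock"  # mapped, but off target


# Illustrative data only (made-up sequences and target regions).
reference = {
    "chrA": "AAAACCCCGGGGTTTTACGTACGT",
    "chrB": "TTGACCAGGTTACAGGCATTAGCC",
}
targets = {"chrB": [(0, 12)]}

for prefix in ("TTGACCAGGT", "CCCCGGGGTT", "ACG"):
    print(prefix, decide(prefix, reference, targets))
```

In this sketch an unmapped prefix is left to sequence further rather than rejected; whether to reject or wait in that case is an experimental design choice.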

However, "read until" has a negative impact on flowcell performance (particularly on the PromethION), and so experimental design requires optimisation depending on the precise experimental goal. We are working with ONT to determine the underlying reasons for this and to develop methods to implement "read until" on higher throughput platforms. In this studentship we will develop methods to implement two real-time approaches around nanopore sequencing: (1) "run until", whereby sequencing is automatically halted once a specific experimental goal has been achieved; and (2) coupling this iterative approach with "read until", such that once a specific goal has been achieved, reads that are no longer required can be rejected. Methods will include genome assembly, targeted coverage of regions of whole genomes, and filtering of metagenomes. We will also look to establish methods that can exploit one or more sequencing positions, simultaneously or sequentially. Crucially, these methods must be dynamic and offer flexibility over and above custom library preparation that might enrich for specific targets.
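As an illustration of how a "run until" stopping rule might be coupled with "read until", the following Python sketch tracks coverage per target and reports which targets still need reads. It is a sketch under stated assumptions, not the project's implementation: the CoverageTracker class, the run_until function, the mean-coverage goal and the batch format of (target name, aligned bases) pairs are all hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple


class CoverageTracker:
    """Hypothetical tracker of mean coverage per target region."""

    def __init__(self, target_lengths: Dict[str, int], goal: float) -> None:
        self.lengths = target_lengths
        self.goal = goal
        self.bases = {name: 0 for name in target_lengths}

    def add(self, target: str, n_bases: int) -> None:
        # Reads assigned to anything other than a tracked target are ignored.
        if target in self.bases:
            self.bases[target] += n_bases

    def needed(self) -> Set[str]:
        """Targets still below the goal; reads for others could be rejected."""
        return {
            t for t in self.bases
            if self.bases[t] / self.lengths[t] < self.goal
        }

    def done(self) -> bool:
        return not self.needed()


def run_until(
    batches: Iterable[List[Tuple[str, int]]],
    tracker: CoverageTracker,
) -> Optional[int]:
    """Return the index of the batch after which every target met the goal.

    Returns None if the data ran out before the goal was reached.
    """
    for i, batch in enumerate(batches):
        for target, n_bases in batch:
            tracker.add(target, n_bases)
        if tracker.done():
            return i  # the point at which sequencing could be halted
    return None


# Illustrative data only: two targets and a 2x mean coverage goal.
tracker = CoverageTracker({"geneA": 100, "geneB": 200}, goal=2.0)
batches = [
    [("geneA", 100), ("geneB", 100)],
    [("geneA", 100), ("geneB", 100), ("other", 500)],
    [("geneB", 200)],
]
stop_index = run_until(batches, tracker)
print("stop after batch", stop_index)
```

The set returned by needed() is what a "read until" decision step would consult: once a target leaves that set, further reads mapping to it are no longer required.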

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
BB/T008369/1                                   01/10/2020  30/09/2028
2432086            Studentship   BB/T008369/1  01/10/2020  30/09/2024