Adaptive sampling ('Read Until') methods in optimised nanopore sequencing technologies
Lead Research Organisation:
University of Nottingham
Department Name: School of Life Sciences
Abstract
Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.
Technical Summary
We propose to develop algorithms that enable adaptive sampling of DNA in real time by exploiting two unique properties of nanopore sequencers: data are streamed from each nanopore as a molecule is read, and the Oxford Nanopore Technologies MinION device allows a specific molecule to be ejected from a nanopore at any time, regardless of how completely it has been read. For this, two linked but distinct problems must be solved: the DNA molecule (represented by changes in current flow) must be mapped rapidly to a reference, and an accept/reject decision must be made based on accumulated previous mapping events. We will address both of these problems using five model cases of direct relevance to BBSRC science:
1. Rapid even coverage in bacterial genome sequencing (e.g. pathogen identification in food-borne disease)
2. Even coverage in diploid genome resequencing (e.g. marker and variant discovery in livestock welfare and breeding)
3. Sequencing of genomic regions of interest that are recalcitrant to conventional sequencing (e.g. in crop plant genomics)
4. Maximising discovery and quantification of low-abundance transcripts (e.g. in fish pathogen response transcriptomics)
5. Coordination of multi-sample sequencing in complex mixtures (e.g. in comparative metagenomics studies)
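The accept/reject half of the problem, as it applies to the even-coverage cases (1 and 2), can be pictured with a minimal Python sketch. It assumes each read has already been mapped to a contig and start position, and it keeps a running count of accepted reads per fixed-size bin. The class name `CoverageTracker`, the bin size and the per-bin target are hypothetical illustrations, not part of the proposal or of any released tool.

```python
from collections import defaultdict


class CoverageTracker:
    """Hypothetical helper: counts accepted reads per genomic bin."""

    def __init__(self, target_reads_per_bin, bin_size=10_000):
        self.target = target_reads_per_bin
        self.bin_size = bin_size
        # (contig, bin index) -> number of reads accepted so far
        self.counts = defaultdict(int)

    def decide(self, contig, position):
        """Return "accept" or "reject" for a read mapped at contig:position."""
        key = (contig, position // self.bin_size)
        if self.counts[key] >= self.target:
            # This region already has enough reads: eject the molecule.
            return "reject"
        # Region is under-covered: keep sequencing and remember the read.
        self.counts[key] += 1
        return "accept"


tracker = CoverageTracker(target_reads_per_bin=2, bin_size=1_000)
decisions = [
    tracker.decide("chr1", 10),
    tracker.decide("chr1", 500),
    tracker.decide("chr1", 999),    # third read in the same bin
    tracker.decide("chr1", 1_000),  # first read in the next bin
]
print(decisions)
```

Because the decision depends only on previously accepted reads, the same structure could in principle be shared across samples or contigs; how unmapped reads are handled is left open here.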
To achieve rapid matching of early read data to reference sequence we will explore several indexing/pre-computing strategies, including Fast Fourier Transform of streamed data; wavelet transform of the stream followed by indexing; and discretisation of the signal followed by suffix tree or FM-index processing. This tool would run on the laptop local to the sequencer. In contrast, the logical process for accepting or rejecting specific reads will be managed by an external server system running appropriate pipelines on the minoTour MinION analysis platform. Templates will be generated for minoTour, allowing experienced users to generate pipelines for further specific use cases.
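One of the strategies listed above, discretisation of the signal followed by FM-index processing, can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's implementation: the names `discretise` and `FMIndex` are hypothetical, the signal is binned into equal-width levels over a fixed range, and the reference is assumed to be available as an expected current trace on the same scale as the read (in practice that conversion would need a pore model). The index is built naively by sorting suffixes, which is only suitable for small examples.

```python
def discretise(signal, lo, hi, levels=4):
    """Map each current sample to a letter ('a', 'b', ...) by equal-width binning."""
    width = (hi - lo) / levels
    symbols = []
    for x in signal:
        level = int((x - lo) / width)
        level = max(0, min(levels - 1, level))  # clip out-of-range samples
        symbols.append(chr(ord("a") + level))
    return "".join(symbols)


class FMIndex:
    """Minimal FM-index with exact-match backward search (naive construction)."""

    def __init__(self, text):
        text += "$"  # sentinel; sorts before every lowercase letter
        self.sa = sorted(range(len(text)), key=lambda i: text[i:])
        bwt = [text[i - 1] for i in self.sa]  # Burrows-Wheeler transform
        alphabet = sorted(set(text))
        # C[c]: number of characters in the text strictly smaller than c
        self.C = {}
        total = 0
        for c in alphabet:
            self.C[c] = total
            total += text.count(c)
        # occ[c][i]: number of occurrences of c in bwt[:i]
        self.occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
        for i, ch in enumerate(bwt):
            for c in alphabet:
                self.occ[c][i + 1] = self.occ[c][i] + (ch == c)

    def find(self, pattern):
        """Return sorted start positions of exact matches of pattern."""
        lo, hi = 0, len(self.sa)
        for c in reversed(pattern):
            if c not in self.C:
                return []
            lo = self.C[c] + self.occ[c][lo]
            hi = self.C[c] + self.occ[c][hi]
            if lo >= hi:
                return []
        return sorted(self.sa[lo:hi])


# Illustrative values only: a made-up reference trace and a short read prefix.
reference_symbols = discretise([10, 30, 60, 90, 30, 60, 10], lo=0, hi=100)
index = FMIndex(reference_symbols)
read_symbols = discretise([35, 55], lo=0, hi=100)
print(index.find(read_symbols))
```

Exact matching on coarse symbols tolerates small amounts of signal noise but not insertions or skipped events, so a real system would likely need seeding and extension on top of a lookup like this.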
Planned Impact
The application of sequencing technologies underpins much of biological research today. Our approach, adaptive sampling in nanopore-based sequencing, serves to eliminate coverage bias and focus resolving power and thus has numerous beneficiaries. Within the broad UK and global academic and applied science communities these methods will benefit both those already using, and those yet to use, sequencing methods.
The direct impacts of our work will be delivered as an enabling software technology that allows broad use of adaptive sampling. During the project we will specifically demonstrate the technology in five areas of biological research and application, each of which represents a challenge area for current sequencing approaches. These are the rapid sequencing of bacterial pathogens for identification, typing and resistance profiling purposes (rapid even coverage in bacterial genome sequencing), marker and variant discovery in livestock resequencing (even coverage in diploid genome sequencing), access to regions that are difficult to sequence in higher plants, particularly crop species (targeted genomic region sequencing), pathogen response transcriptome characterisation and profiling in farmed fish species (low-abundance transcript sequencing) and comparative metagenomics (coverage/focus control in multi-sample sequencing). We expect direct impact on groups of researchers who use sequencing approaches in these areas, including, but not limited to, those who have expressed support for the project (see letters of support).
Through the capacity to eliminate coverage bias, sequencing costs will be reduced, making sequencing available to areas of research and application for which cost remains prohibitive (such as deep population biology of crops, the discovery of low frequency variant alleles for livestock breeding programmes and the profiling of expression in non-model species). Through the ability to focus on defined regions, adaptive sampling will bring powerful methods to areas such as ecology and biodiversity (barcoding, whole-ecosystem analysis, occurrences and abundance), environmental sensing (water safety, environmental health, sentinel markers for pollution and climate change), food chain control (food species/breed/line validation, forensic tracking), border and trade control (invasive species, illegal trade in controlled species), bioenergy (investigation of new species, yield improvement), public health (environmental and zoonotic pathogen sinks, epidemiology of anti-microbial drug resistance) and animal health (surveillance, outbreak detection, transmission control).
The UK has long been established at the forefront of sequencing technology and the application of adaptive sampling methods to nanopore technologies will serve to continue this trend.
People |
ORCID iD |
Matthew Loose (Principal Investigator) |
Publications

Jain M (2018) Nanopore sequencing and assembly of a human genome with ultra-long reads. in Nature biotechnology
Jain M (2017) MinION Analysis and Reference Consortium: Phase 2 data release and analysis of R9.0 chemistry. in F1000Research
Koren S (2019) Reply to 'Errors in long-read assemblies can critically affect protein prediction'. in Nature biotechnology
Loose M (2018) Finding the Needle: Targeted Nanopore Sequencing and CRISPR-Cas9. in The CRISPR journal
Loose MW (2017) The potential impact of nanopore sequencing on human genetics. in Human molecular genetics
Munro R (2022) minoTour, real-time monitoring and analysis for nanopore sequencers. in Bioinformatics (Oxford, England)
Title | LED display for sequencing. |
Description | Nanopore sequencing is often visualised as an array of channels, each shown in a different colour. In this display we develop an interface to show the dynamics of sequencing within an LED matrix. |
Type Of Art | Artefact (including digital) |
Year Produced | 2016 |
Impact | This was developed as an interactive illustration to demonstrate sequencing to undergraduate and school students. |
URL | https://github.com/mattloose/512array_Nanolights |
Description | In the project, we have made a number of advances. Nanopore platforms are developing quickly, with longer reads and more rapid sequencing; we remain responsive to these advances and can use them to our advantage. In particular, we predict that the benefits of "Read Until" adaptive sampling approaches will be greater than we originally expected. Specific work has included a complete rebuild of our minoTour software application, the control system for Nanopore machines, which is now in the late phases of testing. The new system will enable rapid tracking of both real-time and base-called data from Oxford Nanopore Technologies' (ONT) MinION and GridION instruments and, for now, monitoring of data from PromethION instruments in close to real time for a single flow cell. We are adapting our software to the new Application Programming Interface recently released by ONT and will soon test it with single-chromosome selection on human material, for which the cells providing input DNA are currently growing. We have built and are about to release a "bulk file viewer", which enables visual inspection of raw signal data for an entire channel, in order to see the effects of Read Until on specific channels and to check whether reads have been successfully rejected from a single pore. Finally, we have detected that reads in ONT's MinKNOW instrument control software are often split incorrectly, falsely subdividing a single DNA molecule into more than one read; our bulk file viewer will allow users to detect and repair these errors. |
Exploitation Route | We expect impacts of value to the UK and international bioscience community, through the delivery of software components that enable and empower those using nanopore sequencing. To date, we have built a number of software components, such as the re-written minoTour and the rebuilt file viewer, that will soon reach the public domain. These advances, along with performance improvements in the platform itself, will allow us to advance our impacts to specific communities through our five challenge "exemplars" addressing specific applications and communities, the first of which is currently being initiated. Through our communications with the user community, we have also identified new areas of challenge, such as field sequencing of viral samples for rapid identification in compute-limited contexts. |
Sectors | Aerospace, Defence and Marine; Agriculture, Food and Drink; Education; Healthcare; Manufacturing, including Industrial Biotechnology |
URL | https://github.com/looselab/readfish |
Description | The software that we developed has been integrated into Oxford Nanopore Technologies platforms, including releases and improvements to APIs, methods and processes. In addition there are now numerous papers, grants and new applications being delivered around our existing tool chain. |
First Year Of Impact | 2020 |
Sector | Healthcare; Manufacturing, including Industrial Biotechnology |
Impact Types | Economic |
Description | A New Durable Read EXtension Method for Very, Very Long Reads |
Amount | £798,242 (GBP) |
Funding ID | 212965/Z/18/Z |
Organisation | Wellcome Trust |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 01/2019 |
End | 01/2022 |
Description | BBSRC iCASE |
Amount | £94,431 (GBP) |
Organisation | Biotechnology and Biological Sciences Research Council (BBSRC) |
Sector | Public |
Country | United Kingdom |
Start | 09/2017 |
End | 09/2020 |
Description | From Comparative Genomics to Comparative Genetics - What is Required for Life Without DNA Replication Origins? |
Amount | £495,280 (GBP) |
Funding ID | BB/R007543/1 |
Organisation | Biotechnology and Biological Sciences Research Council (BBSRC) |
Sector | Public |
Country | United Kingdom |
Start | 07/2018 |
End | 07/2022 |
Description | Tool to identify pathogens in metagenomic long-read sequence data in real time |
Amount | £47,290 (GBP) |
Organisation | Defence Science & Technology Laboratory (DSTL) |
Sector | Public |
Country | United Kingdom |
Start | 03/2018 |
End | 03/2019 |
Description | Wellcome Prime Scholarship |
Amount | £45,000 (GBP) |
Organisation | Wellcome Trust |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 08/2017 |
End | 08/2019 |
Title | Adaptive Sampling Integration into MinKNOW |
Description | Our method for applying Adaptive Sampling was co-developed by a PhD student working on our adaptive sampling grant and written into Oxford Nanopore's own implementation of Adaptive Sampling, which is now shipping in MinKNOW. In essence, this allows a limited subset of functionality from our ReadFish research tools to be used relatively simply by anyone in MinKNOW, Oxford Nanopore's own GUI for controlling nanopore sequencing. |
Type Of Material | Technology assay or reagent |
Year Produced | 2020 |
Provided To Others? | Yes |
Impact | These tools have been used in a number of papers of note to date and have enabled broad uptake of a new sequencing method in the community. |
URL | https://github.com/nanoporetech/read_until_api/releases |
Title | BulkVIS |
Description | BulkVIS is a tool for detailed analysis of raw signal data during Nanopore sequencing. This tool enables identification of longer reads than have previously been reported and more detailed understanding of how nanopore sequencing occurs. |
Type Of Material | Technology assay or reagent |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | The identification of the longest molecule sequenced to date. https://www.bbc.co.uk/news/science-environment-46046024 |
URL | https://github.com/LooseLab/bulkvis |
Title | DSTL Screening |
Description | We have been invited to implement a standalone version of the minoTour tool for use by specific individuals in the real-time identification of pathogens. |
Type Of Material | Technology assay or reagent |
Year Produced | 2018 |
Provided To Others? | No |
Impact | This is an ongoing project with expected completion in 2019. |
Title | MinoTour version 1 |
Description | MinoTour is a complete laboratory information management system for Nanopore sequencing. It also includes customisable real time analysis. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | This is a revision of a previously available tool and feeds into several of our other projects. |
URL | https://github.com/looselab/minotourapp |
Title | Minotour Client |
Description | This is a python tool to upload data into our minoTour application. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | This feeds into many of our existing projects. |
URL | https://github.com/LooseLab/minotourcli |
Title | Read Until API updates |
Description | We have overhauled the Oxford Nanopore Read Until API. |
Type Of Material | Technology assay or reagent |
Year Produced | 2020 |
Provided To Others? | Yes |
Impact | This tool will be partially integrated into Oxford Nanopore Technologies' own tools. |
URL | https://www.github.com/looselab/read_until_api_v2 |
Title | Read Until Scripts |
Description | This tool implements various methods for adaptive sequencing using a mix of our own tools and those provided by Oxford Nanopore. |
Type Of Material | Technology assay or reagent |
Year Produced | 2020 |
Provided To Others? | Yes |
Impact | These tools will be partially integrated into Oxford Nanopore's own toolchain. |
URL | https://www.github.com/looselab/ru |
Title | SwordFish Adaptive Sampling |
Description | This tool enables adaptive sampling driven from our nanopore monitoring tool, MinoTour, in a range of contexts, including adaptive sampling for SARS-CoV-2 and human genomes. |
Type Of Material | Technology assay or reagent |
Year Produced | 2021 |
Provided To Others? | Yes |
Impact | This tool is new, but is likely to have significant impact on the wider application of adaptive sampling. |
URL | https://github.com/LooseLab/swordfish/ |
Description | Read Until EBI |
Organisation | EMBL European Bioinformatics Institute (EMBL - EBI) |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | We have developed a website and interface for the analysis of MinION data (minoTour). We have also developed the first implementation of Read Until, that is, selective sequencing on the MinION sequencer. |
Collaborator Contribution | The EBI are world leaders in algorithm and storage development. |
Impact | Grant Submission to the BBSRC |
Start Year | 2016 |
Description | The Telomere-to-Telomere (T2T) consortium is an open, community-based effort to generate the first complete assembly of a human genome. |
Organisation | National Institutes of Health (NIH) |
Department | National Human Genome Research Institute (NHGRI) |
Country | United States |
Sector | Public |
PI Contribution | I have been contributing expertise, time and sequencing data to the activities of the Telomere-to-Telomere consortium. The goal of this consortium is to sequence the first human genome from telomere to telomere. Our expertise through the Long Read Club has been exploited to enable this goal. |
Collaborator Contribution | Other partners have generated sequencing data, analysed and assembled reads and presented this work. |
Impact | No outputs to date. |
Start Year | 2019 |
Description | The Telomere-to-Telomere (T2T) consortium is an open, community-based effort to generate the first complete assembly of a human genome. |
Organisation | University of California, Santa Cruz |
Country | United States |
Sector | Academic/University |
PI Contribution | I have been contributing expertise, time and sequencing data to the activities of the Telomere-to-Telomere consortium. The goal of this consortium is to sequence the first human genome from telomere to telomere. Our expertise through the Long Read Club has been exploited to enable this goal. |
Collaborator Contribution | Other partners have generated sequencing data, analysed and assembled reads and presented this work. |
Impact | No outputs to date. |
Start Year | 2019 |
Title | ReadFish Adaptive Sampling Toolkit |
Description | This suite of tools interacts with Nanopore sequencers to enable adaptive sampling of molecules in real time via direct base calling. |
Type Of Technology | Software |
Year Produced | 2020 |
Open Source License? | Yes |
Impact | These tools have been used by many groups to investigate rare disease, amongst other applications. |
URL | https://github.com/nanoporetech/read_until_api/releases |
Title | minotour v 1 |
Description | Minotour is a set of tools for real-time analysis of nanopore data. |
Type Of Technology | Software |
Year Produced | 2019 |
Open Source License? | Yes |
Impact | This is being used across a number of our projects. |
URL | http://minotour.nottingham.ac.uk |
Description | Grand Challenges in Genomics - Invited Panel Speaker - Joint meeting of the NHGRI/Wellcome Trust, London, Feb 2019 |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Policymakers/politicians |
Results and Impact | Grand Challenges in Genomics was a meeting to discuss the next ten years of Genomics and the ways in which both NHGRI and the Wellcome Trust should target investment and funding in the future. |
Year(s) Of Engagement Activity | 2019 |
Description | Long Read Club |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Long Read Club is an informal grouping of users interested in exploring long read sequencing technologies in all their guises. We are raising awareness of methods, best practice and experience through a website, Twitter account and YouTube channel. Over 900 people have signed up to the email list, nearly 700 follow the Twitter account and over 130 have subscribed to the YouTube channel. |
Year(s) Of Engagement Activity | 2019 |
URL | http://youtube.com/c/longreadclub |
Description | Oxford Nanopore - Basecalling Consensus Hackathon - Invited Contributor - July (2018) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | An invitation only hackathon to investigate questions around base calling and sequence consensus. |
Year(s) Of Engagement Activity | 2018 |
Description | PoreCamp |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | PoreCamp is a training initiative established to teach the basics of Nanopore Sequencing to both academic and industrial users of sequencing. It is held approximately every six months and to date has run in Birmingham, Exeter and Australia. Future pore camps are planned in Texas, USA and the East Midlands, UK. |
Year(s) Of Engagement Activity | 2016,2017 |
URL | https://porecamp.github.io |
Description | PoreCamp Birmingham 2017 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Porecamp is a world recognised Nanopore Training Camp. This week long activity provides comprehensive training and instruction in all aspects of Nanopore sequencing - from library preparation through to sequencing and analysis. I am a founder and lead instructor on this course. In Birmingham we produced a public information film describing our activities and interests in this area. |
Year(s) Of Engagement Activity | 2017 |
Description | PoreCamp Texas |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Porecamp is a world recognised Nanopore Training Camp. This week long activity provides comprehensive training and instruction in all aspects of Nanopore sequencing - from library preparation through to sequencing and analysis. I am a founder and lead instructor on this course. |
Year(s) Of Engagement Activity | 2017 |
Description | Singapore Genome Centre - Porecamp Singapore Training Course - Lead Instructor and Keynote - Sept (2018) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Porecamp is an instructional course for using nanopore sequencing in the lab and the field. It is open to all and serves to increase the uptake of nanopore sequencing globally. |
Year(s) Of Engagement Activity | 2018 |
Description | University of British Columbia - Porecamp Training Course - Lead Instructor and Keynote - May (2018) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Porecamp is a training course to encourage uptake of Nanopore sequencing in the field and laboratory. |
Year(s) Of Engagement Activity | 2018 |