poRe: visualization and analysis of nanopore sequencing data for rapid and mobile pathogen detection

Lead Research Organisation: University of Edinburgh
Department Name: The Roslin Institute

Abstract

The context of the research: Nanopore sequencing represents a paradigm shift in DNA sequencing, and today it is the only sequencing technology that measures an actual single molecule of DNA, rather than incorporation events into a template strand. In nanopore sequencing, a protein nanopore is attached to a membrane and changes in electronic signal are measured as single molecules of DNA pass through the pore. Oxford Nanopore Technologies (ONT) are a British company who are the first to bring to market a commercial nanopore sequencer. The MinION is the world's first mobile DNA sequencing device, measuring 4" in length and powered by the USB port of a laptop. The MinION is a revolutionary device, and requires a new suite of bioinformatics tools to help work with the data.

Aims and objectives: we are one of the first groups in the world to develop and publish a software tool for MinION data. poRe is a package for R that enables users to organise, visualise and analyse MinION nanopore sequencing data. We aim to develop the software further to provide real-time analysis, parallelisation of the code, development of data standards, implementation of a graphical user interface and integration of alignment tools. The latter will be used specifically in the context of veterinary pathogen detection from clinical samples.

Potential applications and benefits: we have included letters of support from our users in both UK and international institutes. We believe the future of the MinION is mobile DNA sequencing and thus we have developed poRe to work on the Windows laptop attached to the device, and be easy to install and use. Thus poRe will help enable true mobile sequencing, allowing users to carry out real time sequencing in the field or by the bedside.

Technical Summary

Nanopore sequencing represents a paradigm shift in DNA sequencing, and today it is the only sequencing technology that measures an actual single molecule of DNA, rather than incorporation events into a template strand. In nanopore sequencing, a protein nanopore is attached to a membrane and changes in electronic signal are measured as single molecules of DNA pass through the pore. Oxford Nanopore Technologies (ONT) are a British company who are the first to bring to market a commercial nanopore sequencer. The MinION is the world's first mobile DNA sequencing device, measuring 4" in length and powered by the USB port of a laptop. The MinION is a revolutionary device, and requires a new suite of bioinformatics tools to help work with the data.

The MinION presents several technical challenges. Firstly, all sequence files are written to a single directory, regardless of the run or date. All metadata about the run is embedded within the read files themselves, which are in HDF5 format, a hierarchical binary format. The MinION has 512 channels, each designed to hold a single nanopore. Channels write data asynchronously, which means that, unlike other technologies, data can be analysed as soon as it is produced. We have written poRe, one of the first tools that allows users to organise, analyse, extract and visualise MinION data.
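
As a minimal illustration of why asynchronous channel output suits real-time analysis, the Python sketch below keeps running per-run totals that are updated each time a read arrives. It is not poRe code (poRe is an R package): the names RunSummary and summarise, the (run_id, channel, length) record layout and the 1 to 512 channel numbering are all assumptions made for this example. In practice the run identifier would have to be read from the metadata embedded in each HDF5 read file, which is not shown here.

```python
from collections import defaultdict
from dataclasses import dataclass, field

N_CHANNELS = 512  # channel count stated in the text; 1-based numbering is assumed


@dataclass
class RunSummary:
    """Running totals for one sequencing run, updated one read at a time."""

    reads_per_channel: dict = field(default_factory=lambda: defaultdict(int))
    total_reads: int = 0
    total_bases: int = 0

    def add_read(self, channel: int, length: int) -> None:
        """Fold a single read into the totals as soon as it is available."""
        if not 1 <= channel <= N_CHANNELS:
            raise ValueError(f"channel {channel} outside 1..{N_CHANNELS}")
        self.reads_per_channel[channel] += 1
        self.total_reads += 1
        self.total_bases += length

    @property
    def active_channels(self) -> int:
        """Number of channels that have produced at least one read."""
        return len(self.reads_per_channel)


def summarise(reads):
    """Group (run_id, channel, length) records by run, in arrival order.

    The record layout is hypothetical; it stands in for values that would
    be extracted from each read file's embedded metadata.
    """
    runs = defaultdict(RunSummary)
    for run_id, channel, length in reads:
        runs[run_id].add_read(channel, length)
    return dict(runs)


if __name__ == "__main__":
    # Reads from two runs, interleaved as they might be in a shared directory.
    example = [
        ("runA", 1, 1000),
        ("runB", 7, 500),
        ("runA", 1, 2500),
        ("runA", 512, 300),
    ]
    for run_id, summary in summarise(example).items():
        print(run_id, summary.total_reads, summary.total_bases,
              summary.active_channels)
```

Because each read is folded into the totals independently, the summary is valid at any point during a run, and reads from different runs that share a directory are separated by run identifier rather than by file location.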

In this application we intend to develop poRe further, adding real-time analysis, parallelisation of the code, data standards, a graphical user interface and integration of alignment tools. The latter will be used specifically in the context of veterinary pathogen detection from clinical samples.

Planned Impact

This project will develop unique new features for our software poRe, which is in use in labs throughout the world. Our software allows researchers, both biologists and bioinformaticians, to organise, visualise and analyse MinION nanopore sequencing data. The MinION is a revolutionary device and represents the world's first mobile DNA sequencer. As with any new sequencing technology, the MinION requires new software tools that allow users to interact with the data. poRe is one such software tool.

As our letters of support suggest, developing poRe will benefit a huge range of researchers, from graduate students to established professors. A key feature of the software is that it is easy to install and easy to use; it therefore appeals to biologists who would not normally be able to work with MinION data because of the complex formats produced.

The future of the MinION is mobile DNA sequencing, and we anticipate it will be used in the field and by the bedside (i.e. outside of a laboratory) for real-time sequencing. Therefore we have developed poRe to work on a Windows laptop attached to the device, without the need to access remote servers. Obvious applications include vets carrying out real-time sequencing in barns and fields to detect pathogen DNA, and nurses and doctors doing the same by the bedside.

Having said that, poRe is also useful in the traditional sense, and is being used by many labs. The beneficiaries of this work will include academic scientists, biotechnology companies, health companies and the health services, vet schools and veterinary practices, the pharmaceutical industry, national governments and the general public.
 
Description poRe is one of the first tools to enable researchers to use MinION nanopore sequencing data. It is an R package that enables users to QC sequencing runs and extract information from them. The package is used across industry and academia. The package was published in 2015 (10.1093/bioinformatics/btu590) and we were funded by BBSRC to develop it further (BB/M020037/1), resulting in continued support and the development of graphical user interfaces (10.1101/094979).

poRe has had 23 releases to date and boasts over 6800 downloads. Google Scholar lists 53 citations for the original software paper.

B. fragilis: we were one of the first groups to demonstrate a hybrid assembly of Oxford Nanopore and Illumina sequencing data and produce a single contig representing the bacterial chromosome (10.1186/s13742-015-0101-6). Bacteroides fragilis is a Gram-negative, obligate anaerobic bacterium that is commensal in the human colon; however, it is also an opportunistic pathogen and is a major cause of soft tissue infections. The software also established Mick Watson as a leader in the field, leading to opportunities for presentation at international conferences (such as London Calling, PAG), engagement in media activities, and training (both in the UK and internationally).
Exploitation Route The software is used in academia and industry throughout the world and has potential uses in medicine, biotechnology, biological research and environmental monitoring. We have evidence that poRe is used in the pharmaceutical industry, in biotech and in public bodies such as Public Health England (PHE).
Sectors Agriculture, Food and Drink,Education,Environment,Healthcare

URL https://sourceforge.net/projects/rpore/
 
Description The software has been used extensively in training post-docs and other researchers in how to use the MinION:
1) PoreCamp: training of 30 individuals in both lab work and data analysis relating to the MinION (14th-18th December 2015)
2) MinION workshop: 2 x 1-day intensive workshops in MinION bioinformatics (2nd-3rd Feb 2016)
3) PoreCamp USA in 2017, training 30 researchers from across the US in use of the MinION
4) Further Edinburgh Genomics training activities, including two BBSRC-funded STARS winter schools
Further PoreCamps and MinION workshops are planned. We have evidence that the poRe software is enabling research in pharmaceutical companies, in biotechnology companies, and in public bodies such as Public Health England (PHE).
First Year Of Impact 2016
Sector Agriculture, Food and Drink,Digital/Communication/Information Technologies (including Software),Environment,Healthcare,Pharmaceuticals and Medical Biotechnology
Impact Types Economic

 
Description PoreCamp Texas
Geographic Reach North America 
Policy Influence Type Influenced training of practitioners or researchers
URL http://porecamp.github.io/texas/
 
Description Training in genomics
Geographic Reach Multiple continents/international 
Policy Influence Type Influenced training of practitioners or researchers
Impact Our genomics training activity is open to all, including academia, industry, professionals and the public. We train people in software development and data analysis as pertains to the analysis of genomic data. Our training programme has so far consisted of 37 workshops in the last 2 years (each consisting of between 1 and 5 days hands-on training), and has provided advanced training in bioinformatics to 483 unique individuals, 47% of whom come from outside of the University of Edinburgh, spanning 15 countries including The Netherlands, India, Denmark, Belgium, Germany, Ireland, Luxembourg, Finland, Northern Ireland, Spain, Sweden, France, South Africa and Norway. Whilst the majority of our trainees are academics, we have also trained individuals from private companies and the NHS. At present we do not collect detailed statistics about training level, though we are certain that the vast majority of our trainees are post-docs and PhD students in need of vital expertise to complete their training/projects. The data underlying this report are available upon request, but should not be shared widely as they contain sensitive information.
URL http://genomics.ed.ac.uk/services/training
 
Description Comprehensive training in computational biology techniques for analyzing second and third generation sequencing data
Amount £28,000 (GBP)
Funding ID BB/N019636/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 05/2015 
End 05/2018
 
Description PoreCamp 
Organisation University of Birmingham
Country United Kingdom 
Sector Academic/University 
PI Contribution PoreCamp is a training programme in nanopore sequencing run as a collaboration between the Universities of Edinburgh, Birmingham and Nottingham.
Collaborator Contribution The first PoreCamp was a training course run for users of the Oxford Nanopore MinION, taking place over 5 days in December 2015. Users were trained in library preparation, sequencing and bioinformatics. A second PoreCamp occurred in Exeter in 2016. A third PoreCamp occurred in Texas in 2017.
Impact Well over 200 students trained to date
Start Year 2015
 
Description PoreCamp 
Organisation University of Nottingham
Country United Kingdom 
Sector Academic/University 
PI Contribution PoreCamp is a training programme in nanopore sequencing run as a collaboration between the Universities of Edinburgh, Birmingham and Nottingham.
Collaborator Contribution The first PoreCamp was a training course run for users of the Oxford Nanopore MinION, taking place over 5 days in December 2015. Users were trained in library preparation, sequencing and bioinformatics. A second PoreCamp occurred in Exeter in 2016. A third PoreCamp occurred in Texas in 2017.
Impact Well over 200 students trained to date
Start Year 2015
 
Title poRe: an R package for the visualization and analysis of nanopore sequencing data 
Description Motivation: The Oxford Nanopore MinION device represents a unique sequencing technology. As a mobile sequencing device powered by the USB port of a laptop, the MinION has huge potential applications. To enable these applications, the bioinformatics community will need to design and build a suite of tools specifically for MinION data. Results: Here we present poRe, a package for R that enables users to manipulate, organize, summarize and visualize MinION nanopore sequencing data. As a package for R, poRe has been tested on Windows, Linux and MacOSX. Crucially, the Windows version allows users to analyse MinION data on the Windows laptop attached to the device. Availability and implementation: poRe is released as a package for R at http://sourceforge.net/projects/rpore/. A tutorial and further information are available at https://sourceforge.net/p/rpore/wiki/Home/ 
Type Of Technology Software 
Year Produced 2014 
Open Source License? Yes  
Impact NA 
URL http://sourceforge.net/projects/rpore/
 
Description Aviagen / CP workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact I presented our work on functional microbiome analysis during a one-day workshop which I set up and organised at The Roslin Institute. In attendance were employees of CP (a large Asian conglomerate) and Aviagen (one of the world's largest chicken breeding companies). The focus of the workshop was animal genetics and microbiome.
Year(s) Of Engagement Activity 2010,2017
 
Description Evonik research day 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact I presented my work on precision analysis of microbiomes to Evonik, an international company with over 13,000 employees and with interests in chemical and food production. This was part of a one-day workshop with Evonik, hosted by Roslin and focused on microbiomes.
Year(s) Of Engagement Activity 2018
 
Description Festival of Genomics 2016 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Mick Watson presented at a workshop, hosted by Edinburgh Genomics, on software for the handling and analysis of MinION sequencing data
Year(s) Of Engagement Activity 2016
 
Description ISAG 2017 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Mick Watson presented work on cutting-edge techniques that can be used to analyse metagenomic sequence data. ISAG is the International Society for Animal Genetics and this was its very first microbiome session. In attendance were industry practitioners and academics.
Year(s) Of Engagement Activity 2017
URL http://www.isag.us/2017/
 
Description Mick Watson comments on Genia 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Mick Watson published a blog post on the patent dispute between Genia and the University of California, which was subsequently picked up by GenomeWeb, a website and news outlet followed by genomics investors, scientists and industry professionals.
Year(s) Of Engagement Activity 2016
URL https://www.genomeweb.com/sequencing/university-california-files-suit-against-genia-cofounder
 
Description Mick Watson comments on MinION 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Mick Watson was interviewed by the local press in Texas and this was then covered by multiple international news outlets. The focus was the MinION, a handheld sequencer capable of producing large amounts of DNA sequence data quickly and cheaply, and on which Mick Watson is considered an expert.
Year(s) Of Engagement Activity 2017
URL https://www.research.ed.ac.uk/portal/en/clippings/mick-watson-helps-to-develop-minion-a-handheld-dev...
 
Description Mick Watson comments on MinION sequencing of human genome 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Mick Watson provided quotes and comments to a STAT news article on the recent MinION sequencing of a whole human genome
Year(s) Of Engagement Activity 2018
URL https://www.statnews.com/2018/01/29/hand-held-dna-sequencer-minion/
 
Description Mick Watson comments on NGS market 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Mick Watson comments on the NGS market and the future of technologies such as the PacBio Sequel, Illumina and the Oxford Nanopore MinION
Year(s) Of Engagement Activity 2015
URL https://www.genomeweb.com/sequencing-technology/technology-improvements-long-reads-lead-ngs-market-a...
 
Description PoreCamp 2015 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact PoreCamp was a training course run for users of the Oxford Nanopore MinION, taking place over 5 days in December 2015. Users were trained in library preparation, sequencing and bioinformatics. A second PoreCamp occurred in Exeter in 2016
Year(s) Of Engagement Activity 2015
URL https://porecamp.github.io/