Dynamic, adaptive sampling during nanopore sequencing using Bayesian experimental design
Nature Biotechnology 2 January 2023
A new method developed by EMBL-EBI researchers helps to streamline nanopore sequencing in real time
Long-read nanopore sequencing has revolutionised the way scientists obtain genomic data. But as with any new technology, there is always room for improvement. BOSS-RUNS – which stands for Benefit-Optimising Short-term Strategy for Read Until Nanopore Sequencing – is an open-source method developed by researchers at EMBL’s European Bioinformatics Institute (EMBL-EBI) and the University of Nottingham that can help scientists to dynamically adapt their nanopore sequencing runs to make the process faster and more efficient.
Nanopore sequencing allows researchers to carry out real-time sequencing of long DNA or RNA fragments. It works by monitoring changes in an electrical current as a strand of nucleic acid (DNA or RNA) passes through a protein nanopore. The resulting signal is computationally decoded to give the specific DNA or RNA sequence.
A unique feature of nanopore sequencing is the ability to reject DNA fragments passing through the pore by reversing the voltage to drive DNA back out. This allows scientists to select specific DNA fragments to sequence from within a mixed sample, a feature known as adaptive sampling or ‘Read Until’. In a study published in the journal Nature Biotechnology, scientists describe a new method – BOSS-RUNS – that streamlines adaptive sampling by helping the user make dynamic, real-time decisions on what they want to sequence.
“The ability to select individual molecules to sequence in real time has always been incredibly exciting,” said Matt Loose, Professor of Developmental and Computational Biology at the University of Nottingham. “Here with EMBL-EBI we have taken a step forward by enabling dynamic selection of molecules in response to what has already been sequenced. This is, to our knowledge, the first example of dynamic adaptive sampling.”
It’s not always necessary to sequence everything in a given sample. If a researcher is only interested in a specific site or region within a genome, they could limit their sequencing to that specific site. This is faster and enables researchers to prevent wasteful data acquisition and storage.
Nanopore sequencing’s adaptive sampling feature makes it possible to select in advance the molecules you wish to sequence in a complex sample, for example specific chromosomes or DNA from a specific species. BOSS-RUNS takes advantage of this feature to help the user make these decisions based on the results obtained so far during the run. This allows for more dynamic sequencing efforts and better coverage of specific areas of the genome.
“One of the great things about nanopore sequencing is that the software is open source,” said Nick Goldman, Group Leader at EMBL-EBI. “This means that researchers can adapt their sequencing protocols, optimising them to best fit their needs. The way that sequencing devices traditionally work is quite wasteful in that they randomly sample the DNA being sequenced. This generates excessive amounts of data that isn’t always needed. Adapting the sequencing software itself can help to minimise this and save researchers time, money, and data storage.”
BOSS-RUNS allows the user to delve deeper into areas of a genome based on what they see in real time. For example, if BOSS-RUNS detects locations in a genomic sample that don’t entirely match a reference genome, it can adjust the sequencing experiment to obtain more data from the region in question and so confirm the genomic variation.
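To make the idea of focusing on uncertain regions concrete, here is a minimal Python sketch. It is not the BOSS-RUNS scoring function: it assumes a hypothetical pileup of per-position base counts (A, C, G, T) and uses the Shannon entropy of smoothed base frequencies as a stand-in for "how uncertain is this position". The names position_entropy and should_keep_read, the pseudocount and the threshold are all illustrative assumptions.

```python
import math


def position_entropy(counts, pseudocount=1.0):
    """Shannon entropy (in bits) of smoothed base frequencies at one position.

    `counts` holds the observed counts for A, C, G and T. The pseudocount is an
    illustrative smoothing choice, not a value taken from BOSS-RUNS.
    """
    total = sum(counts) + pseudocount * len(counts)
    entropy = 0.0
    for count in counts:
        p = (count + pseudocount) / total
        entropy -= p * math.log2(p)
    return entropy


def should_keep_read(pileup, start, end, threshold):
    """Keep a read if the region it would cover is still uncertain on average.

    `pileup` is a list of per-position base counts; `start` and `end` delimit
    the region the read is expected to cover (hypothetical inputs).
    """
    region = pileup[start:end]
    if not region:
        return False
    mean_entropy = sum(position_entropy(c) for c in region) / len(region)
    return mean_entropy >= threshold


# A region with no data is maximally uncertain, so a read covering it is kept;
# a region with 100 concordant observations per position is not.
unseen = [[0, 0, 0, 0]] * 3
well_covered = [[100, 0, 0, 0]] * 3
print(should_keep_read(unseen, 0, 3, threshold=1.0))
print(should_keep_read(well_covered, 0, 3, threshold=1.0))
```

In this toy version a position where reads disagree, or where little data exists, keeps a high entropy and therefore keeps attracting reads, which mirrors the behaviour described above without reproducing the published method.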
Similarly, BOSS-RUNS is well suited to analysing multiple genomes in the same sample, for example in a microbiome. If these genomes come from different species present at different abundances, the method helps researchers collect sufficient information on all the species present. BOSS-RUNS does this by tracking in real time which species have already been sequenced and using this information to reject redundant DNA moving through the pore.
“We used BOSS-RUNS to analyse the species present in a mixed microbial community to show that the method can help researchers gain higher coverage depth of low-abundance species,” said Lukas Weilguny, Predoctoral Fellow at EMBL-EBI. “In a sample of mixed microbial species you may find that 90% of that sample is all the same species but you still want information on the species that make up perhaps only 1% or less. BOSS-RUNS figures out which species the sequencing reads come from in real-time and refocuses the experiment on species that haven’t been covered in as much depth.”
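The refocusing described here can be illustrated with a deliberately simplified Python sketch. It is not how BOSS-RUNS is implemented: it assumes each read has already been assigned to a species, tracks the approximate depth per species, and rejects reads from species that have reached a target depth. The class name CoverageBalancer, its methods and the target depth are hypothetical.

```python
from collections import defaultdict


class CoverageBalancer:
    """Toy accept/reject rule that balances depth across species.

    Assumes every read maps to a species listed in `genome_lengths`.
    """

    def __init__(self, genome_lengths, target_depth):
        self.genome_lengths = genome_lengths    # species -> genome size in bases
        self.target_depth = target_depth        # desired mean depth per species
        self.bases_sequenced = defaultdict(int)  # species -> bases accepted so far

    def depth(self, species):
        """Approximate mean depth: accepted bases divided by genome size."""
        return self.bases_sequenced[species] / self.genome_lengths[species]

    def decide(self, species, read_length):
        """Return 'reject' once a species has reached the target depth."""
        if self.depth(species) >= self.target_depth:
            return "reject"
        self.bases_sequenced[species] += read_length
        return "accept"


# Hypothetical sample: one abundant and one rare species of equal genome size.
balancer = CoverageBalancer({"abundant": 1000, "rare": 1000}, target_depth=2.0)
for species in ["abundant", "abundant", "abundant", "rare"]:
    print(species, balancer.decide(species, read_length=1000))
```

Once the abundant species reaches the target, its reads are ejected and pore time is freed for the rare species. The published method makes this decision per genomic position using a Bayesian benefit score rather than a fixed per-species threshold.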
Try out BOSS-RUNS for yourself: the method is implemented in Python and available on GitHub.
This research was funded by EMBL core funding and the BBSRC.