VERY excited to share SeqTagger with the world! https://github.com/novoalab/SeqTagger/

If you want to learn more about our super-fast and accurate #demultiplexing tool for #direct #RNA #sequencing (DRS), check out the thread below. #nanopore
https://www.biorxiv.org/content/10.1101/2024.10.29.620808v1

We built SeqTagger around a very straightforward logic: let’s demultiplex by basecalling the #DNA #barcode sequence (on an RNA pore) and aligning it to a set of reference barcode sequences.
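
To make the idea concrete, here is a minimal sketch of the "align the basecalled barcode to a reference set" step. It is not SeqTagger's actual code: plain edit distance stands in for the alignment, and the barcode sequences, the `assign_barcode` helper and the `max_distance` cutoff are all made up for illustration.

```python
from typing import Dict, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match / substitution
        prev = curr
    return prev[-1]


def assign_barcode(called: str,
                   references: Dict[str, str],
                   max_distance: int = 3) -> Optional[str]:
    """Return the name of the closest reference barcode, or None if unclassified.

    A read stays unclassified when the best hit is too distant or when
    two references tie for the best score (hypothetical rules).
    """
    scored = sorted((edit_distance(called, seq), name)
                    for name, seq in references.items())
    best_dist, best_name = scored[0]
    if best_dist > max_distance:
        return None
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None
    return best_name


# Hypothetical reference barcodes, for illustration only.
REFERENCES = {
    "bc1": "ACGTACGTAC",
    "bc2": "TTGGCCAATT",
    "bc3": "GATTACAGAT",
}

print(assign_barcode("ACGTACGAAC", REFERENCES))  # one mismatch to bc1
print(assign_barcode("CCCCCCCCCC", REFERENCES))  # far from every reference
```

Leaving distant or ambiguous reads unclassified is the usual trade-off in demultiplexing: a stricter cutoff costs some recall but protects precision.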

We benchmarked our tool on an independent test dataset and observed a precision of 99% with a recall of 95%.
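
For readers less used to these metrics, here is a small sketch of how precision and recall can be computed for a demultiplexer that may leave reads unclassified. It assumes one common convention (precision = correct / assigned reads, recall = correct / all reads), which may differ from the exact definitions used in the preprint. The labels and the `precision_recall` helper are hypothetical.

```python
from typing import List, Optional, Tuple


def precision_recall(truth: List[str],
                     predicted: List[Optional[str]]) -> Tuple[float, float]:
    """Precision and recall for barcode calls; None means 'unclassified'."""
    # Only reads that received a barcode count towards precision.
    assigned = [(t, p) for t, p in zip(truth, predicted) if p is not None]
    correct = sum(1 for t, p in assigned if t == p)
    precision = correct / len(assigned) if assigned else 0.0
    # Unclassified reads lower recall because they are never recovered.
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Toy example: 5 reads, 1 left unclassified, 1 assigned to the wrong barcode.
truth = ["bc1", "bc1", "bc2", "bc2", "bc3"]
predicted = ["bc1", None, "bc2", "bc3", "bc3"]
precision, recall = precision_recall(truth, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```
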
Next, we examined whether SeqTagger scales well with an increased number of barcodes, so we trained a model on 96 barcodes and found it to be highly accurate (98.8%)!
This makes SeqTagger well suited to larger amounts of data, such as those seen with RNA004. But does it work on the new chemistry? Yes! It reaches precisions of ≥ 99% with 97% recall.
However, not all DRS libraries are polyA-tailed, and some have only very short tails. To this end, we also trained a model specific to Nano-tRNAseq libraries (https://www.nature.com/articles/s41587-023-01743-6). It lets users investigate tRNA abundance together with modification status in a multiplexed manner, with robust demultiplexing and only a minor loss in precision.

Taken together, SeqTagger is a highly accurate and fast demultiplexing algorithm that works across sequencing devices (MinION + PromethION), chemistries (RNA002, RNA004) and RNA biotypes (mRNA, tRNA), enabling sequencing of up to 96 samples in a single flowcell.

This was an incredibly fun project, and we hope it will serve the community well.

Big thanks to all co-authors and lab members, especially Eva Maria Novoa and Gregor Diensthuber!