Looking at some Plasmidsaurus-generated pod5 files - bro, a 15 GB archive decompresses into a 15 GB pod5 file. Why are they just adding on processing time for the user? 🤔
@naturepoker it may be a requirement of a database. I recently uploaded pod5s for a publication to ENA and they required a tar archive.
@mzdravkov I think so too. Though with this being a commercial service people pay for (they charge extra for pod5 download), I'd appreciate some extra compression!
@naturepoker @mzdravkov pod5 files are very efficiently compressed internally, so adding anything on top would gain very little
https://github.com/nanoporetech/vbz_compression
Note, you can mount tar archives
https://github.com/mxmlnkn/ratarmount
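If mounting is not an option, a plain tar can also be read in place. A minimal sketch using Python's standard-library `tarfile` (not ratarmount, which mounts the archive as a filesystem instead): it lists the members and reads the first bytes of one without extracting anything to disk. The archive, the member name `reads.pod5`, and the helper names are made up for illustration.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def list_members(tar_path):
    """Return (name, size) for every regular file in an uncompressed tar."""
    with tarfile.open(tar_path, "r:") as tar:
        return [(m.name, m.size) for m in tar.getmembers() if m.isfile()]


def read_head(tar_path, member, n):
    """Read the first n bytes of one member without extracting it to disk."""
    with tarfile.open(tar_path, "r:") as tar:
        handle = tar.extractfile(member)
        if handle is None:
            raise ValueError(f"{member} is not a regular file")
        return handle.read(n)


# Build a tiny demo archive; the payload is a placeholder, NOT real pod5 data.
with tempfile.TemporaryDirectory() as tmp:
    tar_path = Path(tmp) / "demo.tar"
    payload = b"placeholder bytes, not a real pod5 file"
    info = tarfile.TarInfo(name="reads.pod5")  # hypothetical member name
    info.size = len(payload)
    with tarfile.open(tar_path, "w") as tar:
        tar.addfile(info, io.BytesIO(payload))

    members = list_members(tar_path)
    head = read_head(tar_path, "reads.pod5", 11)

print(members)
print(head)
```

This only streams bytes out of the archive; tools that expect a real file path would still need a mount or an extraction.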
@lpryszcz @naturepoker yeah, as far as I know, if you need extra compression you probably have to go to lossy compression. You may find this paper interesting, if you haven't seen it: https://genome.cshlp.org/content/35/7/1574.short
A new compression strategy to reduce the size of nanopore sequencing data
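To illustrate the point made in the thread that a second lossless layer adds very little on top of already-compressed data, here is a rough sketch. It uses zlib and gzip from the Python standard library as stand-ins (VBZ itself is a different codec), and the input is synthetic low-entropy bytes, not real nanopore signal. The helper name `recompression_ratio` is made up for this example.

```python
import gzip
import random
import zlib


def recompression_ratio(raw: bytes) -> tuple[float, float]:
    """Return (first_pass, second_pass) size ratios, compressed / input."""
    once = zlib.compress(raw, 9)                   # first lossless layer
    twice = gzip.compress(once, compresslevel=9)   # second layer on top
    return len(once) / len(raw), len(twice) / len(once)


# Synthetic stand-in data: random bytes drawn from a 16-symbol alphabet,
# so roughly 4 bits of information per byte. NOT real pod5 content.
rng = random.Random(0)
raw = bytes(rng.choices(range(16), k=200_000))

first, second = recompression_ratio(raw)
print(f"first pass ratio:  {first:.3f}")
print(f"second pass ratio: {second:.3f}")
```

The first pass shrinks the data noticeably, while the second pass leaves the size essentially unchanged, which is the same effect as wrapping an internally compressed pod5 in a compressed archive.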
@mzdravkov @naturepoker yes, I've seen it. Impressive work by @hasindu2008 et al :)
@lpryszcz @mzdravkov @hasindu2008 this is all very good to know, thank you for the heads up!