The #Library #Innovation Lab #Harvard has just launched #WARCbench, which they describe as a "Swiss Army Knife for #WARC Processing". That sets a high bar, because I frickin' love Swiss Army knives! 😀 But, on a serious note, WARCbench looks seriously useful. As someone who has recently been working with colleagues in this space, I have been shocked at the lack of decent tooling. WARCbench is therefore a brilliant #opensource contribution for us all!

https://lil.law.harvard.edu/blog/2026/06/09/warcbench-a-swiss-army-knife-for-warc-processing/ #archiving

@g3om4c thanks for sharing this. It is interesting that, despite being written in Python, it implements its own WARC parser rather than using warcio. I think there is some value in doing that here. It also reminds me a bit of the original aims of the Archives Unleashed Toolkit, which grew more expansive and was (ultimately) servicified, taking it away from the command line: https://aut.docs.archivesunleashed.org/docs/home