Our zlib-rs project implements a memory-safe and performant drop-in replacement for zlib, a widely-used data compression library.

@folkertdev shares the current state of zlib-rs, including the good news that performance at the highest compression level is on par with the zlib-ng fork of zlib.

Read the blog for all the details:

https://tweedegolf.nl/en/blog/134/current-zlib-rs-performance

@trifectatech

#rustlang #datacompression #opensource

@tweedegolf compiling with "target-cpu=native" is not an option for Linux Distributions (builder architecture != target architecture). would it be possible to make it use runtime CPU feature detection instead? or do you only apply this setting to affect optimizations done by LLVM, but no actual CPU-specific intrinsics are used in the code?

@decathorpe @tweedegolf

We already use runtime CPU feature detection, e.g. here:

https://github.com/memorysafety/zlib-rs/blob/85bc778044f173bfdc934f2ab731eb3f94cdf70f/zlib-rs/src/adler32.rs#L9-L21

The advantage of `target-cpu=native` is that those branches are compiled away, because it is statically known what features are available and hence which path will be taken.

Runtime CPU feature detection has a performance cost, and we're still looking for the most performant way to do it, but it totally works.
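To make the trade-off concrete, here is a minimal sketch of the dispatch pattern. It is not the actual zlib-rs code: `has_avx2` and `adler32_avx2` are hypothetical names, and the "AVX2" kernel just calls the scalar routine so the example stays short. The assumption illustrated is that `cfg!(target_feature = "avx2")` is a compile-time constant, so when the feature is enabled at build time (for example via `target-cpu=native` on an AVX2 machine) the check folds to `true` and the optimizer can drop the runtime probe.

```rust
/// Portable scalar Adler-32, used as the fallback on every target.
fn adler32_scalar(start: u32, data: &[u8]) -> u32 {
    const MOD: u32 = 65_521;
    let mut a = start & 0xffff;
    let mut b = (start >> 16) & 0xffff;
    for &byte in data {
        a = (a + u32::from(byte)) % MOD;
        b = (b + a) % MOD;
    }
    (b << 16) | a
}

/// Hypothetical stand-in for a SIMD kernel. A real one would use AVX2
/// intrinsics; this one reuses the scalar code to keep the sketch short.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn adler32_avx2(start: u32, data: &[u8]) -> u32 {
    adler32_scalar(start, data)
}

/// Hypothetical helper: compile-time check first, runtime probe second.
#[cfg(target_arch = "x86_64")]
fn has_avx2() -> bool {
    // `cfg!` expands to a literal `true` or `false` at compile time.
    // When it is `true`, the `||` short-circuits and the runtime
    // detection on the right-hand side is never evaluated.
    cfg!(target_feature = "avx2") || std::is_x86_feature_detected!("avx2")
}

/// Picks an implementation for the current CPU.
pub fn adler32(start: u32, data: &[u8]) -> u32 {
    #[cfg(target_arch = "x86_64")]
    {
        if has_avx2() {
            // SAFETY: AVX2 support was verified just above.
            return unsafe { adler32_avx2(start, data) };
        }
    }
    adler32_scalar(start, data)
}
```

Calling `adler32(1, b"Wikipedia")` yields `0x11E60398` on either path. A distribution build without `target-cpu=native` keeps the runtime branch, which is where the detection cost mentioned above comes from.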

@folkertdev @decathorpe
Why not use something like https://github.com/ronnychevalier/cargo-multivers ? It does a single check at startup, then decompresses the correct version for the local CPU and applies some binary patches.

@flamion @folkertdev that sounds like a nightmare (albeit an interesting idea). but no thank you, not something we can do in a distribution context either ;)

@decathorpe @folkertdev I mean, in a distribution context you could just build the binary using cargo multivers though, as it does all of that for you, then produces a single output binary, right? At least for Rust programs.

@flamion I don't think integrating another tool here would be worth it. there's already standardized support for loading libraries optimized for different microarchitecture levels built into the ELF loader (glibc HWCAPS), so if it's really essential for good performance, that's what we would use