📦 Efficient file syncing for massive datasets? Forget classic rsync!

#BLAKE3 hashes instead of timestamps – hash 70GB in just ~20 seconds thanks to multithreading across all CPU cores

#DevOps #Linux #SysAdmin #OpenSource #FileSync

📊 Algorithm comparison for 70GB:
- BLAKE3: ~20 seconds
- MD5: ~1.5 minutes
- SHA256: ~4.5 minutes

🔍 The trick: create a hash baseline, compare it against a fresh snapshot with `comm -13`, then use `rsync --files-from` to transfer only the files that actually changed
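A minimal sketch of that baseline, compare, transfer flow, under a few assumptions: the function names `snapshot` and `changed_files` are made up for illustration, and the fallback to `sha256sum` is only there so the sketch still runs on a machine without `b3sum`. It is not the original poster's script.

```shell
# Prefer b3sum; fall back to sha256sum (an assumption, only so the sketch runs anywhere).
if command -v b3sum >/dev/null 2>&1; then HASHER=b3sum; else HASHER=sha256sum; fi

# snapshot DIR: print one "<hash>  <path>" line per file under DIR, sorted.
# Paths are relative to DIR, which is the form rsync --files-from expects.
snapshot() {
  ( cd "$1" && find . -type f -exec "$HASHER" {} + | LC_ALL=C sort )
}

# changed_files BASELINE CURRENT: print paths whose "<hash>  <path>" line
# appears only in CURRENT, i.e. files that are new or whose content changed.
# comm needs both inputs sorted with the same collation, hence LC_ALL=C.
changed_files() {
  LC_ALL=C comm -13 "$1" "$2" | sed 's/^[^ ]*  //'
}
```

Typical use would be `snapshot /data > /tmp/baseline.txt` once, later `snapshot /data > /tmp/current.txt`, then `changed_files /tmp/baseline.txt /tmp/current.txt > /tmp/changed.txt` and `rsync -a --files-from=/tmp/changed.txt /data/ user@host:/backup/` (paths and host are placeholders). Keep the snapshot files outside the directory being hashed. Note that `comm -13` only surfaces new and modified files; deletions show up in `comm -23` instead.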

🧵 👇

🦀 Rust weekly log

This week:

📡 RustPulse
First OpenTelemetry + Jaeger integration in place (phase 1/3).
https://github.com/VinEckSie/rustpulse

🔐 Sealed in Rust
New chapter on cryptographic hash functions: SHA-2, BLAKE3, and beyond.
https://vinecksie.github.io/sealed-in-rust/02-core-primitives/02-02-crypto-hashes.html

#Rust #SystemsProgramming #Observability #OpenTelemetry #Jaeger #Cryptography #BLAKE3

GitHub - VinEckSie/rustpulse: ⚡ Secure, offline-first telemetry engine in Rust — real-time metrics for edge devices via Axum, gRPC, and PostgreSQL.

Research topic of the day: Choosing a #hashfunction for 2030 and beyond - https://kerkour.com/fast-secure-hash-function-sha256-sha512-sha3-blake3

TL;DR: #blake3 is cool and let's hope it picks up steam; #sha512 is the winner for integrating into current projects.

#research

If you're heading to #RustConf25 and want to chat about #iroh, #p2p, #blake3 or #QUIC, say hi to https://bsky.app/profile/b5.bsky.social. He'll be there!

@glepage

...what I really cannot comprehend is why there is no #BEP to migrate #BitTorrent to #CDC and #BLAKE3, and I guess I can't (won't) ask because last time I checked signing up to their mailing list required a #Google account

Today I got to know that #nix 2.27 added #blake3 support and proper submodules and LFS support to #flakes! 🎉

Other than supporting legacy systems or implementing password hashing functions that *should* be slow, why aren't your teams using GNU or BSD tar with #zstd compression and #xxHash or #Blake3 checksums yet? If time is money, please stop wasting either on legacy algorithms when there are faster, more secure, and more trusted options currently available.

For #BLAKE3

$ brew install b3sum

then

$ find . -type f ! -name checksums.txt -exec b3sum {} + > checksums.txt
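To check a tree against such a manifest later, something like the following could work. It is a sketch: `verify_tree` is a made-up helper name, the `sha256sum` fallback exists only so it runs without `b3sum` installed, and it assumes the manifest does not list itself and was produced by the same tool that checks it.

```shell
# Prefer b3sum; fall back to sha256sum (an assumption, only so the sketch runs anywhere).
if command -v b3sum >/dev/null 2>&1; then HASHER=b3sum; else HASHER=sha256sum; fi

# verify_tree DIR MANIFEST: re-hash every file listed in MANIFEST and compare.
# MANIFEST should be an absolute path, because the function changes into DIR
# first so that the relative paths recorded in the manifest resolve correctly.
# The exit status is non-zero if any file is missing or its hash differs.
verify_tree() {
  ( cd "$1" && "$HASHER" --check "$2" )
}
```

For example, `verify_tree /data /tmp/checksums.txt` prints one status line per file and fails if anything was modified since the manifest was written.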

The syncing means that it uses (I think) a different distribution protocol than classic IPFS, but it's still content-addressable. (For instance, it uses #blake3 hashing, so you can incrementally check the integrity of big files.) https://github.com/n0-computer/iroh

Content addressable storage is simple and elegant right up to the point when you need to support multiple hash functions and/or digest sizes. After that, it becomes a much more difficult beast to tame.

But tame that fucking beast I will, goddamnit!

I'm really trying to learn from Git's difficulty in migrating to something other than sha1:
https://git-scm.com/docs/hash-function-transition/

One thing I'm doing differently is supporting multiple hash functions and digest sizes from the Git-go (hehe).
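One common way to leave room for several hash functions is to make every key name its own algorithm. The sketch below is purely illustrative and not the project described above: `cas_put`, `cas_get`, the `<algo>:<digest>` key format, and the directory layout are all assumptions, and it relies on the GNU coreutils tool names `sha256sum` and `sha512sum` (plus `b3sum` if installed).

```shell
# cas_put STORE ALGO FILE: copy FILE into STORE and print its "<algo>:<digest>" key.
# Objects live under STORE/<algo>/<digest>, so digests of different sizes and
# from different hash functions never collide with each other.
cas_put() {
  cas_store=$1; cas_algo=$2; cas_file=$3
  case "$cas_algo" in
    blake3) cas_tool=b3sum ;;
    sha256) cas_tool=sha256sum ;;
    sha512) cas_tool=sha512sum ;;
    *) echo "unsupported hash function: $cas_algo" >&2; return 1 ;;
  esac
  # The tools print "<hex digest>  <path>"; keep only the digest.
  cas_digest=$("$cas_tool" "$cas_file" | cut -d' ' -f1)
  [ -n "$cas_digest" ] || return 1
  mkdir -p "$cas_store/$cas_algo" &&
    cp "$cas_file" "$cas_store/$cas_algo/$cas_digest" &&
    printf '%s:%s\n' "$cas_algo" "$cas_digest"
}

# cas_get STORE KEY: print the object stored under an "<algo>:<digest>" key.
cas_get() {
  cat "$1/${2%%:*}/${2#*:}"
}
```

Because the algorithm travels with the key, adding a new hash function later means adding one `case` branch rather than rewriting existing keys.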

#Git #Rust #BLAKE3
