I have a large local #musiclibrary—mostly #flac with some #m4a. I sync it between my #nas and four computers using #rsync.
Unfortunately, I recently discovered that some files appear to be corrupted, even though I've always checked the rsync logs. 🫢

I'll need to scan the entire library now to see how widespread the damage is. I first noticed something was wrong when a few tracks started skipping mid-song while playing via #mpd / #mpc.
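
A full scan like this could be scripted. Here is a minimal sketch, assuming the `flac` command-line tool is installed and on the PATH; its `-t` option test-decodes a file without writing output. The function names are made up for illustration, and the #m4a files would need a different checker.

```python
import subprocess
from pathlib import Path
from typing import Callable, List


def flac_decodes_cleanly(path: Path) -> bool:
    """Return True if `flac -t` test-decodes the file without errors.

    Assumption: the `flac` CLI is installed; -t = test, -s = silent.
    """
    result = subprocess.run(
        ["flac", "-t", "-s", str(path)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def scan_library(
    root: Path, check: Callable[[Path], bool] = flac_decodes_cleanly
) -> List[Path]:
    """Return every .flac file under root that fails the check, sorted by path."""
    return sorted(
        p
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".flac" and not check(p)
    )
```

Calling `scan_library(Path("/path/to/music"))` (a placeholder path) returns the list of files that failed to decode. The `check` parameter is only there so a different test can be swapped in.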

I guess nothing lasts forever—not even music files. 😞

@amadeus I'm always nervous about losing files. But it has always been that way. Cassette tapes got tangled. Vinyl scratched. CDs stopped playing. Files get corrupted. 🫤

@amadeus Good luck! I hope the damage is minimal.

Unfortunately, both HDDs and SSDs are prone to spontaneous small failures, sometimes known as Unrecoverable Read Errors. Tape is famously reliable - and unaffordably expensive.

The only prevention I know of is a good RAID system on your NAS, which is... well, it's less expensive than tape, anyway. If you're up for the challenge (and have the budget) I recommend RAIDZ.

@KatS Thanks! 🤞🙏 My NAS is actually an SHR RAID. I'm not sure if that protects files from becoming corrupted. I'm also unsure if files that became corrupted on the computers also became corrupted on the NAS via rsync.

@amadeus Interesting. I hadn't heard of SHR before, but it looks like a proprietary adaptive system, where the degree of redundancy depends on how many disks are in the array. Also looks like it'll go up to RAID6 or some equivalent approach, if you give it enough disks.

Unfortunately, if the copy on the NAS is intact, and you then rsync a corrupted copy over it from your workstation, now you have two copies of the corrupted version.
The only reliable defenses I know of against that are:

  • generational/versioned backups (something more sophisticated than rsync)*
  • a filesystem that uses checksums for integrity-checking, and redundant copies for recovery.
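
Without a checksumming filesystem, the integrity-checking half of that idea can be approximated by hand. A rough sketch, assuming a manifest of SHA-256 hashes is built while the library is known to be good and checked again before each sync (the function names are illustrative, not from any tool). It only detects changes; recovery still needs a second copy.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large audio files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hash."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return manifest entries that are now missing or whose hash differs."""
    bad = []
    for rel, expected in sorted(manifest.items()):
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad
```

A changed hash is not proof of corruption, since retagging a file changes it too, but a file that changed without anyone editing it is worth checking before it gets synced anywhere.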

RAID6 is good for resilience - it's basically RAID5 with an extra parity disk. ZFS has that plus the checksums. Of course, I also have a 6-disk array, because everything comes at a cost :)

*This isn't a swipe at you using rsync. I use it myself, and really need to switch to something better.
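
For the generational/versioned idea, here is a toy sketch of what a snapshot scheme does, assuming each run writes a new snapshot directory and hard-links files that look unchanged (same size and modification time) to the previous snapshot. It is similar in spirit to rsync's `--link-dest` option. `make_snapshot` is a hypothetical helper, not a real tool, and symlinks are not handled.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def looks_unchanged(a: Path, b: Path) -> bool:
    """Cheap comparison: same size and same modification time (whole seconds)."""
    sa, sb = a.stat(), b.stat()
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def make_snapshot(source: Path, snapshot: Path, previous: Optional[Path] = None) -> None:
    """Copy source into a new snapshot directory.

    Files that look unchanged since `previous` are hard-linked instead of
    copied, so every snapshot is a full tree but only changed files use space.
    """
    snapshot.mkdir(parents=True, exist_ok=False)  # never reuse an old snapshot
    for src in sorted(source.rglob("*")):
        rel = src.relative_to(source)
        dst = snapshot / rel
        if src.is_dir():
            dst.mkdir(parents=True, exist_ok=True)
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        old = previous / rel if previous is not None else None
        if old is not None and old.is_file() and looks_unchanged(src, old):
            os.link(old, dst)  # share the data with the previous snapshot
        else:
            shutil.copy2(src, dst)  # copy contents and timestamps
```

If a corrupted file gets synced in, the older snapshots still hold the earlier version. It does not protect against the backup disk itself going bad, because hard-linked snapshots share the same data on disk.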

@KatS Thank you very much for the valuable tips! 🙌
I'm afraid that's exactly what happened—intact files were replaced by corrupt ones. Well, we never stop learning, right? 🫣😅 And there's probably never a situation where there isn't something that could be optimized.