"Due to potential legal incompatibilities between the CDDL and GPL, despite both being OSI-approved free software licenses which comply with DFSG, ZFS development is not supported by the Linux kernel"
@mcc been that way for decades now
@whitequark I have a new hard drive I intend to use primarily for backup and I am currently considering BTRFS or ZFS for the Linux part instead of ext4 (because I hear they can do something like storing extra error-checking data to protect against physical disk corruption). In your view, if I intend to use mainline Debian indefinitely, will BTRFS, ZFS, both, or neither give me the least pain getting things working?

@mcc @whitequark

AIUI, ZFS really requires multiple drives to be effective.

You might gain a little value from extra checksums on file system blocks on a single drive, but if those checksums ever start failing on a hard drive there is a high likelihood that most of the drive is about to fail completely.

I had researched ZFS a fair bit as I planned to build my own FreeBSD NAS around 3-4 drives in ZFS, but eventually decided to buy an off-the-shelf ZFS NAS from the TrueNAS people.

@CliftonR ok. is it accurate that zfs can be snapshotted and restored more efficiently (in terms of on-disk cost) than ext4? Somebody also said something about btrfs allowing zstd compression (for some of the disk? for all of the disk?)

@mcc

I think those are both true in general though I don't know ext4 well enough to compare in depth.

1) One of the fundamental ideas of ZFS is copy-on-write. This makes it function similarly to a VCS, in that snapshots are nearly free: taking one sets a checkpoint, and from then until you release the snapshot, the filesystem only stores extra copies of the blocks that change afterward.

2) ZFS supports several compression algorithms, all of which (including the default) work very well.
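For concreteness, both of those are just one-liners with the `zfs` tool (the dataset name `tank/home` below is a made-up placeholder):

```shell
# Copy-on-write snapshots: near-instant, and initially cost no space;
# only blocks changed after the snapshot consume extra disk.
zfs snapshot tank/home@2024-01-01
zfs list -t snapshot tank/home        # USED column = space held by each snapshot
zfs rollback tank/home@2024-01-01     # restore the dataset to that point

# Compression is a per-dataset property; lz4 is typically the default,
# and zstd is also supported on current OpenZFS.
zfs set compression=zstd tank/home
zfs get compression,compressratio tank/home
```

Note that changing the compression property only affects newly written blocks; existing data stays as it was written.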

@mcc

3) ZFS also has built-in "ZFS send" and "ZFS receive" functions for copying an entire ZFS filesystem to new media of similar or different drive layout, on the same system or over a network.

I've got limited experience with those, but it seems to me like they work well.
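The basic shape (dataset, pool, and host names here are placeholders) is a snapshot piped from `zfs send` into `zfs receive`:

```shell
# Replicate a dataset to another pool, here over ssh to a backup host.
zfs snapshot tank/home@backup1
zfs send tank/home@backup1 | ssh backuphost zfs receive backuppool/home

# Later, send only the changes between two snapshots (incremental send):
zfs snapshot tank/home@backup2
zfs send -i tank/home@backup1 tank/home@backup2 | ssh backuphost zfs receive backuppool/home
```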

@mcc

Oh, forgot to say about the compression:

2.A.) I always think of compressing and decompressing as slowing things down. The reverse seems to be true - ZFS benchmarks I've looked at say that having strong compression integrated in the FS actually *speeds up* the file system, because it saves more than enough disk writes/reads to make up for the CPU overhead.

It can also do automatic deduplication if you like - more useful fallout of the COW mechanism - but that's a bit too freaky for me.
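Like compression, dedup is just a per-dataset property (off by default; `tank/vmimages` is a placeholder name):

```shell
# Enable block-level deduplication on one dataset only.
zfs set dedup=on tank/vmimages

# The dedup table (DDT) is consulted on every write and effectively
# needs to live in RAM, which is the usual source of trouble.
zpool status -D tank      # shows DDT statistics for the pool
```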

@CliftonR @mcc For ZFS deduplication, $WORK is in the frankly uncommon territory where dedupe actually makes business sense for us, despite my repeated past attempts to figure out how to ditch it. My non-expert advice from this experience is "don't".

No data loss from it, I'm not worried about that, but you periodically hit edge cases (deadlocks, etc) that disappear entirely when dedup is not used.

The ZFS team fixes them as they see them, but it's time-consuming, and in most use cases dedup isn't worth it.

@fwaggle @mcc

Very good to know, TY!

I guess my gut feelings about tech choices continue to be well-trained, for the most part.

@CliftonR @mcc Yep! In my very limited experience, dedupe and L2ARC are ZFS' siren songs. They lure you in because on the surface they seem like they'll solve a bunch of problems for you, but they'll actually cause more problems than they solve for *most* people.

@fwaggle @mcc

I did set up a small SSD for L2ARC which doesn’t *seem* to have caused me any problems - that I can tell - but thanks for the advice. It’s a pretty lightly loaded system with relatively little of the main pool used so far, and that may be a factor.
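For reference, a cache vdev is low-commitment anyway - it can be added and removed from a live pool (pool name and device path below are placeholders):

```shell
# Add an SSD as an L2ARC (read cache) device to the pool.
zpool add tank cache /dev/disk/by-id/example-ssd

# Watch per-device activity; remove the cache device at any time.
zpool iostat -v tank
zpool remove tank /dev/disk/by-id/example-ssd
```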

I’ll keep that in mind, especially if I see any issues in future. Thanks again.