@mcc @nogweii @whitequark btrfs isn't a journalling FS -- it's copy-on-write, which is subtly different.
The problem with unexpected power-off is when the hardware lies. btrfs requires that when the disk says that data's hit permanent storage, it really has. In some cases of buggy firmware, disks can pass a write barrier while the data's still only in cache. With a power-fail, that can lead to metadata corruption, because the FS has updated the superblock, pointing to an incomplete transaction.
@darkling @nogweii @whitequark I see. But it seems like that would be no greater a problem for BTRFS than ext4.
"it's copy-on-write, which is subtly different"
Does it have different performance characteristics? Intuitively it seems like it must, but I can't really justify the idea that it does more so than modern journaling/autodefrag.
@mcc @nogweii @whitequark I don't know about performance. I can describe the algorithm.
Due to the way it works (the copy-on-write part), a lost write is going to effectively drop an entire page of metadata, rather than simply not updating an existing page. It *never* writes updated data in place, except for the superblocks, which have fixed locations. So the damage in the missed-write case is rather larger than with non-CoW FSes.
@mcc @whitequark I've worn out* I think three sets of HGST disks on btrfs over the years and have had few problems.
* from old age in SMART data
@mcc @whitequark I think this is a scenario where external influences are critical. I would use ext4. Mostly because its quirks are well known and if I had to recover it, there’s tons of resources all the way down to physical recovery companies.
BTRFS I think is missing all that infrastructure.
@mcc @petrillic @whitequark btrfs will allow you multiple copies of your data and metadata on a single disk.
It might protect against some disk issues, but probably not that many. SSDs will just stop working altogether on controller or dram failure and lose all of the disk at once.
I hope you are aware that SSDs are not recommended to be kept unpowered - the 10 year data retention relies on scrubbing that happens only when the power is on.
There are a couple things you can do here... One is BTRFS has checksums so it will *detect* when the data has rotted in the drawer, whereas ext4 doesn't.
Also, with BTRFS you can set the data storage mode to DUP and you'll get TWO copies of every data block (at the expense of being able to store about half the stuff); BTRFS can then do a scrub, detect corrupted blocks, and fix them from the good copy.
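A minimal sketch of that, assuming an existing single-disk filesystem mounted at a hypothetical /mnt/backup:

```shell
# Convert existing data chunks to DUP (metadata already defaults to
# DUP on a single spinning disk):
btrfs balance start -dconvert=dup /mnt/backup

# Scrub: re-read every block, verify checksums, and repair any bad
# block from its good copy:
btrfs scrub start /mnt/backup
btrfs scrub status /mnt/backup
```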
Finally, you can do compression, snapshots, and sends
snapshots are good for keeping history of things, and send is good for offsite backup.
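A rough sketch of the snapshot-plus-send workflow, with made-up paths (/mnt/data is the live filesystem, /mnt/external the backup drive):

```shell
# Read-only snapshot (send requires read-only):
btrfs subvolume snapshot -r /mnt/data /mnt/data/.snap-jan

# Full copy to another btrfs filesystem, e.g. the offsite drive:
btrfs send /mnt/data/.snap-jan | btrfs receive /mnt/external

# Next month, send only the delta relative to the previous snapshot:
btrfs subvolume snapshot -r /mnt/data /mnt/data/.snap-feb
btrfs send -p /mnt/data/.snap-jan /mnt/data/.snap-feb | btrfs receive /mnt/external
```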
Oh, and you can do deduplication, which might let you store more stuff?
I have NEVER lost a btrfs drive to anything but hardware failure, I've been using it since about 2012 or something.
I think this is a fair high-level view. Another thing about zfs is that the license and such make integrating it into a "normal" desktop system or whatever a pain in the ass. For example you can't just add a package in Debian.
I 100% suggest you format your single backup drive as btrfs, set DUP for data if you have a big enough drive, and mount it with compress=zstd unless you're storing highly compressed data already.
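For a fresh drive, that suggestion boils down to something like this (sdX is a placeholder device name, and mkfs wipes the drive):

```shell
# TWO copies of data and metadata, zstd compression on every write:
mkfs.btrfs -L backup -d dup -m dup /dev/sdX
mount -o compress=zstd /dev/sdX /mnt/backup
```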
oh it looks like it does now... DKMS (the dynamic kernel module support system) or something similar has been around for a long time, but zfs support is I think relatively new (say, the last 5 years?)
if you want to use DKMS stuff make sure you install the linux-headers for your linux-kernel package !
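On Debian/Ubuntu that looks roughly like this (package names as shipped there; the headers must match the running kernel or DKMS has nothing to build against):

```shell
apt install linux-headers-"$(uname -r)" zfs-dkms zfsutils-linux

# Confirm the module actually built and loads:
modprobe zfs && zfs version
```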
@petrillic @mcc @whitequark I don’t personally use Btrfs right now, so can’t comment on it directly. I know it’s similar to ZFS in this regard, but I don’t know the details of how it differs.
General performance of ZFS is really impressive, even on single drives, mostly due to the concept of async writes. It buffers a bunch of async writes in RAM as a “transaction group”, then flushes them all in a mostly-sequential write. The state of the filesystem is always consistent, though applications may lose a few seconds of data if the system is rebooted before a transaction group flushes.
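On OpenZFS for Linux the transaction-group flush interval is visible as a module parameter (default 5 seconds), and you can force a flush by hand; a sketch:

```shell
# Seconds between transaction-group flushes; data buffered in the
# current txg is what could be lost on sudden power-off:
cat /sys/module/zfs/parameters/zfs_txg_timeout

# Push everything buffered out to disk right now:
zpool sync
```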
@petrillic @mcc I am only n=1 but will mention I use ZFS on all my single-drive systems (mostly laptops) and have zero complaints with performance. I appreciate being able to back up entire filesystems to my NAS (also ZFS) with checksums, snapshots, encryption etc. intact.
My biggest frustration is the lack of rebalancing support, specifically on pools big enough I can't copy everything off. Having to install separate kernel modules is only a mild irritation for me though, YMMV
@mcc btrfs
my headmate, who is obsessive over data integrity, runs btrfs on her NAS with zero issues. it has nice things like snapshotting and such. the reputation btrfs has dates back to many years ago and i don't think the issues people distrust it for have mattered for quite a while
@glyph @whitequark @mcc Sort of...
Synology carry a *lot* of out-of-tree patches to btrfs. I believe they did *something* to integrate MD-RAID with btrfs, and Synology's "btrfs" isn't entirely compatible with mainline any more.
AIUI, ZFS really requires multiple drives to be effective.
You might gain a little value from extra checksums on file system blocks on a single drive, but if those checksums ever start failing on a hard drive there is a high likelihood that most of the drive is about to fail completely.
I had researched ZFS a fair bit as I planned to build my own FreeBSD NAS around 3-4 drives in ZFS, but eventually decided to buy an off-the-shelf ZFS NAS from the TrueNAS people.
@mcc in btrfs - every file can have different compression, if you're crazy enough. What I do is set compression on the root folder of a new fs, and let that be inherited everywhere.
btrfs property set . compression zstd:8 ; chattr +c .
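For the per-file case, the same property can be set on an individual file, overriding the inherited default (hypothetical filename):

```shell
btrfs property set ./archive.tar compression lzo
btrfs property get ./archive.tar compression
```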
I think those are both true in general though I don't know ext4 well enough to compare in depth.
1) One of the fundamental ideas of ZFS is Copy-On-Write. This makes it function similarly to a VCS, in that this makes snapshots nearly free. It sets a checkpoint where from now until you release the snapshot, your new present state of the file system stores only the changed blocks.
2) ZFS supports several compression algorithms all of which (including the default) work very well.
3) ZFS also has built-in "ZFS send" and "ZFS receive" functions for copying an entire ZFS filesystem to new media of similar or different drive layout, on the same system or over a network.
I've got limited experience with those, but it seems to me like they work well.
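Points 1 and 3 look roughly like this in practice ("tank/home", "backuphost", and "pool2/home" are placeholder names):

```shell
# 1) Snapshots: instant, and store only blocks that change afterwards.
zfs snapshot tank/home@before-upgrade
zfs rollback tank/home@before-upgrade    # discard everything since the snapshot

# 3) Replication: a full stream first, then incrementals.
zfs send tank/home@before-upgrade | ssh backuphost zfs receive -u pool2/home
zfs snapshot tank/home@weekly
zfs send -i @before-upgrade tank/home@weekly | ssh backuphost zfs receive -u pool2/home
```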
Oh, forgot to say about the compression:
2.A.) I always think of compressing and decompressing as slowing things down. The reverse seems to be true - ZFS benchmarks I've looked at say that having strong compression integrated in the FS actually *speeds up* the file system, because it saves more than enough disk writes/reads to make up for the CPU overhead.
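Turning it on, and checking what it's actually buying you, is a one-liner each (dataset name is a placeholder):

```shell
zfs set compression=zstd tank/data
# After writing some data, see how much was actually saved:
zfs get compression,compressratio tank/data
```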
It also can do auto deduplication if you like - more useful fall-out of the COW mechanism - but that's a bit too freaky for me.
The other thing about ZFS that's a bit hard to explain, and frankly I don't know well enough to know if I'm explaining it right, is that it seems to integrate much more detailed knowledge of physical drives than most file systems.
It talks to SCSI or SATA at a very low level, uses SMART data from HDDs, does slow background "scrubbing" of the drives over time, to force the drive to see & reallocate sectors starting to fail, etc.
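The scrubbing part at least is easy to show ("tank" is a placeholder pool name):

```shell
# Kick off a scrub; it re-reads and verifies every block in the
# background at low priority:
zpool scrub tank

# Progress, plus a report of anything it repaired or couldn't read:
zpool status -v tank
```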
I don't know all the details, but it seems like good stuff.
@CliftonR "It talks to SCSI or SATA at a very low level"
Imagine I plugged a SATA drive into a USB3 enclosure. Should I assume this will not happen the way ZFS hopes?
Ya, I was wondering that myself as I wrote it. It's another damn good question.
The answer is I really don't know how much it may affect that, or to what extent it can see "through" the USB3/SATA converter. If Google still worked properly it would be easier to find out.
I earlier mentioned Michael W. Lucas @mwl as a ZFS expert (which he is) and he seems like a nice guy, and you know, the good kind of tech weirdo.
So I am, with minor hesitation, tagging him in now to correct any misinformation I may be spreading about ZFS.
He might also find your base question interesting, what kind of file system is best to put on a single standalone drive being used as a system or data backup.
I've never seen that discussed much, though it's a great question to ask.
Single disk system? Set copies=2 for error correction.
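That's a per-dataset property; a sketch ("backup" / "sdX" are placeholders, and it only affects blocks written after it's set):

```shell
# On an existing dataset:
zfs set copies=2 backup

# Or at pool creation, so everything is doubled from the start:
zpool create -O copies=2 backup /dev/sdX
```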
ZFS snapshots are the most efficient of any filesystem thanks to copy-on-write.
ZFS is fine for backing up, but error correction applies to the data it gets. Send garbage, you'll have high-integrity garbage.
Compression trades CPU cycles for disk I/O. Most hosts today have more CPU than IOPS, so it's a fair trade.