which is "better", XFS or BTRFS? i only care about file system resilience on shitty SSDs, so like, which one is going to be slower at letting my photos be randomly destroyed by bit flips? #linux
i like that they're both b-tree file systems, it's very silly that linux has two of these
i do not find it encouraging that people can't seem to agree on what the basic features of the different standard linux file systems are and are not ;_;
ok i think the main things i've learned tonight are: filesystem checksums only *detect* corruption, and when something is corrupted the filesystem can either restore it from a duplicate automatically or notify you so you can restore it from a backup. also, SSDs are cursed hell storage whose wear leveling (depending on the vendor) may undermine the file system's duplication efforts, so depending on the hardware you have, the choice of filesystem might literally not matter at all in this respect?
@aeva yeah, it turns out that it's still pretty hard to beat spinning rust, tape... possibly optical, but we don't really have good data on that (plus not as common as it once was)

there's a reason I run a big spinning rust ZFS array : (
@aeva ZFS comes with its own pain in the ass stuff since it will never be upstreamed, but it's also a very good file system for running RAID like configs with redundancy. On my own single disk systems I just run btrfs for convenience (subvolumes, yay) and basically back up all my stuff to the ZFS array... which I wanna add tape backups to, but I haven't gotten around to it yet. Especially for my photos...
@aud why will zfs never be upstreamed
@aud that seems like a red flag
@aeva tl;dr: license incompatibility and disagreement over whether it contains code lifted from... I think Solaris

which,
not allowing a driver into the kernel because "they can't guarantee where the code came from"...
...
...
@aeva oh, maybe I overcomplicated it: https://en.wikipedia.org/wiki/OpenZFS#Linux

just an incompatible license

@aud @aeva it's more than just the license, it will never be merged due to fundamental design incompatibilities with Linux (as per Linus).
instead of rewriting ZFS to integrate with Linux subsystems (block/page cache, volume management/device mapping, encryption, etc.) it reimplements the Solaris APIs with a compatibility layer. as you can probably imagine, this impedance mismatch leads to numerous quirks and bugs, and makes upgrading the kernel a hassle, especially if there's an ABI change
@aeva @aud because Oracle deliberately licensed it in a way to be incompatible with GPLv2
@aeva on the plus side, knowing that you have corrupted data means that you can do something about it. Like, not propagate it into your backup cycle. And start eyeing "what the fuck is fucking my fucking data here?"
@rotopenguin that is a good point
@aeva @rotopenguin I'm not sure how zfs handles snapshots (assuming it does), but with btrfs it's pretty easy to combine snapshots with backups (btrfs send and btrfs receive). so if there is undetected corruption at some point and you're using snapshots/diffs on the backups, you should be able to bisect (or examine diffs) to find where the corrupt backup was made and then recover from it. I'm not sure if there are any tools out there that help with that, but at least it's possible and easy enough to figure out if you do find yourself in that situation.
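the bisect idea above can be sketched in a few lines. `is_corrupt` here is a hypothetical predicate (not a real btrfs tool) that you'd implement yourself, e.g. by checking a known-good hash of the file inside each snapshot; it works because once corruption enters the backup chain, every later snapshot also has it:

```python
# Toy sketch: snapshots ordered oldest -> newest, find the first one
# where the data is already corrupt. Binary search is valid because
# the corruption is monotonic: clean, clean, ..., corrupt, corrupt.

def first_bad_snapshot(snapshots, is_corrupt):
    """Return the index of the first corrupt snapshot, or None if all are clean."""
    lo, hi = 0, len(snapshots)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_corrupt(snapshots[mid]):
            hi = mid          # corruption already present here; look earlier
        else:
            lo = mid + 1      # still clean here; corruption entered later
    return lo if lo < len(snapshots) else None
```

with five snapshots where the last two are corrupt, `first_bad_snapshot` lands on index 3, i.e. the snapshot to restore from is index 2, the last clean one.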
@aeva they had two jobs:
1. files
2. _be systematic_ (underlined in red, "see me after class" margin note)
@aeva oh well. I guess we'll always have files tho
@aeva we tried c-trees but they segfaulted
@arichtman @aeva on a c-tree filesystem, YOU are responsible for not reading beyond the end of a file.
@aeva it has four if you count the external ones!

@aeva It's silly that ext4 is still the dominant Debian file system. XFS is demonstrably better at handling large disks (the journaling takes up less CPU, for one), and the admin utilities are just a lot nicer to work with.

But I gave in and just run ext4 on my personal systems because it's the default, and at my scale the performance differences aren't significant.

@aeva it won't make any difference, neither has any special resilience mechanisms (especially not for data at rest).
@dotstdy well butts
@aeva @dotstdy not true for btrfs? btrfs can keep duplicate copies either across devices in a raid configuration, or on a single disk with the DUP profile if you want, although obviously that won't help you if the whole device goes bad
@glyph @aeva I mean you can set up lots of wild configurations yea. I just mean without having some form of parity or raid they're not going to be any different in themselves (and you can do raid with either one). And you can potentially just store parity data with the files you want to archive if you just want a single disk option.
@dotstdy @aeva is there a tool for maintaining and scrubbing parity data? if I wanted this I would definitely have a scheduled "btrfs scrub" on a DUP=2 volume as just … the most ergonomic way to get parity data that I didn't have to manually babysit
@glyph @aeva well it depends on the use case, if it's data at rest you can potentially manually generate parity data (ye old par archives anyone?) without any FS level features at all. But if you want to have an automatic process for protecting stuff then it might be more involved yeah.

@dotstdy @aeva I've never heard of a 'par archive' so no I don't know :).

I mean I could easily write code to generate parity data, and even to apply it; but btrfs will automatically generate it, verify it, apply parity back to the original data and move it to a different physical sector on the disk when doing that application, which I'm sure I could figure out in a months-long engineering project, but, why? :)

@dotstdy @glyph is there a way to make the parity data just on by default for several specific folders?
@aeva @dotstdy this gets a bit outside my experience with btrfs, but maaaybe you can do this with subvolumes?
@aeva @dotstdy okay no, nevermind, subvolumes are a red herring for this. looks like it's all or nothing at the filesystem level :(. and apparently SSDs can do some real Shenanigans in their controllers here https://btrfs.readthedocs.io/en/latest/mkfs.btrfs.html#man-mkfs-dup-profiles-on-a-single-device

@glyph @dotstdy fascinating!
@aeva @glyph yeah the fun thing with ssds is they're constantly doing error detection and recovery because of how close to the wire the actual storage is running. it's all failing all the time
@glyph @dotstdy having destroyed a good many SSDs in the last ten years, I've noticed that you tend to get random recoverable failures with gradually increasing frequency well before the drive gives up the ghost. I've also noticed that my png files on SSDs tend to slowly but gradually start to randomly develop bands of wrong, as bit flips and compression do funny things to images
@aeva @dotstdy IIRC ZFS is basically the only option if you want that kind of integrity built in
@aeva filesystems are kind of an emotional topic i feel like, temper your expectations of results accordingly.

my personal opinion: ext4 and xfs do not store checksums of file data, only metadata. btrfs and zfs store checksums of the file data itself too. so btrfs and zfs will detect bit flips in the file data, while ext4 and xfs will not. because of this i prefer btrfs and zfs
@artemis I'm hearing conflicting reports on the checksumming thing
@aeva @artemis official documentation of the feature for btrfs is here, btw: https://btrfs.readthedocs.io/en/latest/Auto-repair.html

@aeva I'm poking around the xfs code to see if i can find anything conclusive, but unfortunately filesystems are big so no guarantees

@aeva The define for the CRC feature bit says

#define XFS_FEAT_CRC (1ULL << 13) /* metadata CRCs */

I've found a lot of CRCing code and it does seem to be conditional on that feature, but none of it seems to directly contradict the comment saying it's specifically metadata. It has the thing I was afraid of, though: there's so much bookkeeping code that it's hard to find the actual path a file write takes, and what happens to the data of the write vs the metadata around it. Tapping out for now.

@aeva xfs doesn't address this problem. if you want it with xfs, you need dm-integrity which is a separate layer. although with btrfs you still need to opt in to DUP=2 and halve your capacity, if you want bit-flip protection on a single disk https://unix.stackexchange.com/questions/741149/for-btrfs-how-to-convert-to-dup-3-or-3-on-a-single-partition
@aeva in my long experience, btrfs is extremely prone to corrupting itself and shitting the bed in unstable conditions - this is emphatically not the case with XFS.
@aeva this is not how it should be, btw. On paper btrfs should absolutely be more resilient but it was designed to tick corporate checkboxes, and the implementation was congealed around that environment. The on-disk format is just a mess and did not have proper diligence put into it.
@MissAemilia @aeva my experience has been that both tend to fairly easily corrupt in unstable conditions. there's probably a better choice entirely than the two if reliability is the only desire.
@wizard @aeva
100%. The most resilient FS I've ever used, and I've *really* put it through its paces, was bcachefs. Which I'm sadly moving off of because of Kent's genuine devolution into LLM psychosis.
But its self-healing properties were like no other FS I've ever seen, and it's not even close. The on-disk resiliency is a fucking marvel. Recovery times aren't necessarily quick but it *does* recover.
Pity it got fucking ruined.
@aeva every time I've seen an SSD fail, it's been the controller burning/dying, not the NAND, which makes the entire drive inaccessible.
@aeva if you have an SSD that has random bit flips i'd be really interested to know more lol
@cancel I've chewed through quite a few at work, but I've also found that png files occasionally get corrupted on SSDs, and as an SSD ages it gradually starts producing recoverable file system errors until they become numerous enough that i have to take the disk offline. switching over to a different disk as the main one and keeping the old one for reference when it starts to get bad seems to be somewhat effective at slowing down the rate of failures on it
@aeva wow that’s interesting. is it always the same brand?
@cancel samsung
@cancel my laptop has a western digital "black" SSD and that has been fine so far
@cancel also it is worth noting that the majority of SSDs I've destroyed were samsung SSDs that were manufactured in 2019 or early 2020. seems like something happened around then that caused a dip in quality control
@aeva that’s weird. I have several from that time period and they’re all fine.
@cancel heat them up and see what happens :D
@aeva …I think 😬 I know some of them fail to flush to NAND if you kill the power too fast
@cancel I'm impressed they're still alive, but I really do recommend making backups of them if you haven't already
@aeva yeah they’re all backed up. Also I have done hashing to see if stuff has changed at rest and never seen anything wrong. My really old 840 (SATA) had catastrophic read errors after a few years due to a firmware bug that was eventually patched.
@aeva or was it 640? I can’t remember the generation
@aeva also I have one of those absurdly large PC cases with a lot of fans. CPU never goes above 55c. (It’s an OG Threadripper so it would shut down if it went above 65 lol)
@cancel what IT told me was we've had a lot of failures with the drives we bought around then, and it may be that their longevity sharply decreases with poor cooling, which would be consistent with the machine I've been murdering them with
@aeva how… how hot is the inside of your pc