Someone emailed me today asking for resources to learn how Linux filesystems (like ext4, ext3, etc.) work under the hood. Have any of you seen anything good out there?
@b0rk https://e2fsprogs.sourceforge.net/ext2intro.html Remy Card who worked on ext2 wrote this way back when

@laurentoget this looks great, thanks! earlier versions of things are often simpler to understand so I like the idea of starting with ext2
@b0rk in my experience there are really good guides to how each of them works but not a particularly great unified source for all of them.

@b0rk I've mostly cobbled together resources on IRC when asked, relating more to specific technical issues. Wikipedia, after some diving, does have the essentials.

Key point: ext and even FAT are "catalog-based filesystems" that use "superblock" tables or trees to track the actual location of data on disk. ext and friends have a long history: https://en.wikipedia.org/wiki/Unix_File_System
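
To make the "superblock" idea concrete, here is a minimal sketch that pulls a few counters out of an ext2-style superblock. It assumes the classic ext2 layout as I understand it: the superblock starts 1024 bytes into the volume, fields are little-endian, and the magic number 0xEF53 sits at byte 56. The function name and the returned keys are made up for illustration, and it only looks at a handful of fields.

```python
import struct

# Assumptions (classic ext2 on-disk layout): the primary superblock
# starts 1024 bytes into the volume and its fields are little-endian.
SUPERBLOCK_OFFSET = 1024
EXT2_MAGIC = 0xEF53


def parse_ext2_superblock(image: bytes) -> dict:
    """Return a few superblock counters from raw volume bytes (hypothetical helper)."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    if len(sb) < 58:
        raise ValueError("image too small to contain a superblock")

    # Bytes 0..19: inode count, block count, reserved blocks,
    # free blocks, free inodes (five 32-bit unsigned integers).
    inodes, blocks, _reserved, free_blocks, free_inodes = struct.unpack_from("<5I", sb, 0)
    # Bytes 24..27: block size stored as a shift, block_size = 1024 << value.
    (log_block_size,) = struct.unpack_from("<I", sb, 24)
    # Bytes 56..57: magic number identifying the ext family.
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2/3/4 superblock (bad magic)")

    return {
        "inodes_total": inodes,
        "inodes_free": free_inodes,
        "blocks_total": blocks,
        "blocks_free": free_blocks,
        "block_size": 1024 << log_block_size,
    }
```

Feeding it the first couple of kilobytes of a small filesystem image should be enough, since it never reads past the primary superblock.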

For a ton of more technical detail on FAT, which isn't TOO dissimilar: https://en.wikipedia.org/wiki/Design_of_the_FAT_file_system

@b0rk If you look at the specs closely, you can also see that the ext family took a similar route as Apple did with HFS, in implementing additional features using existing features as a base over time. The filesystem journal being a magical file in the root of the filesystem, for example.

On the HFS side of things: inotify-like change queue? SQLite database of changes. (+ Mach KQueue, but.) TimeMachine needs to be able to buffer the changes since last backup. See also: "hard" link support.

@b0rk And the single most curious filesystem problem I have ever run into: because ext filesystems generally allocate a fixed amount of space for the inode tables when the filesystem is created (much like the superblock and its backup copies), you can run out of disk space while having all the bytes free in the world.

# df -i

You can run out of inodes.
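
If you want the same numbers `df -i` shows from a script, a rough sketch using Python's `os.statvfs` looks like this. The helper names and dictionary keys are my own, and note that some filesystems report zero total inodes because they allocate them dynamically, so the check treats that case as "no fixed limit".

```python
import os


def inode_and_block_usage(path: str = "/") -> dict:
    """Report free bytes and inode counts for the filesystem holding `path`."""
    st = os.statvfs(path)
    return {
        # Bytes available to unprivileged users.
        "bytes_free": st.f_bavail * st.f_frsize,
        # Total and free inodes; 0 total usually means "allocated on demand".
        "inodes_total": st.f_files,
        "inodes_free": st.f_ffree,
    }


def has_free_inodes(usage: dict) -> bool:
    """False only when a fixed inode table exists and is completely used up."""
    if usage["inodes_total"] == 0:
        return True  # assumption: filesystem without a fixed inode table
    return usage["inodes_free"] > 0
```

A filesystem where `bytes_free` is large but `has_free_inodes` returns False is exactly the "disk full with plenty of space" situation described above.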

This is why all of my hard-linked backup drives since the '90s (faubackup, later rsync; Time Machine before Time Machine) used ReiserFS. No fixed tables.
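
For anyone who hasn't seen why hard-linked backups are cheap on data blocks: each snapshot's copy of an unchanged file is just another directory entry pointing at the same inode, so only the new name costs anything. A small sketch of that using `os.link` in a temporary directory (the file names are invented, and it assumes the temp directory lives on a filesystem that supports hard links):

```python
import os
import tempfile


def demo_hard_link():
    """Create a file plus a hard link to it; return (link count, same inode?)."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "snapshot-1.txt")
        linked = os.path.join(d, "snapshot-2.txt")

        with open(original, "w") as f:
            f.write("unchanged file contents\n")

        # Second name for the same inode; no file data is copied.
        os.link(original, linked)

        a, b = os.stat(original), os.stat(linked)
        return a.st_nlink, a.st_ino == b.st_ino
```

On a typical Linux filesystem I would expect this to return a link count of 2 and report that both names share one inode.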

@b0rk Not for the specifics of ext3 and ext4, but for the principles underlying file systems in general, definitely check Chapters 40 and thereabouts of "Operating Systems: Three Easy Pieces"[1].

[1]: https://pages.cs.wisc.edu/~remzi/OSTEP/

@raph @b0rk This is what I came to suggest also.
An introduction to Linux's EXT4 filesystem

Take a walk through EXT4's history, features, and optimal use, and learn how it differs from previous iterations of the EXT filesystem.

@b0rk It's been ages but I remember fondly reading through https://github.com/tpn/pdfs/blob/master/Practical%20File%20System%20Design%20-%20The%20Be%20Filesystem.pdf when I was younger. I think a lot of the specific tricks in BeFS ended up going out of fashion, but a lot of the content is still good.
@b0rk even though it’s not Linux, “The Design and Implementation of the 4.4BSD Operating System” has a good introduction to FS internals. Robert Love’s “Linux Kernel Development” also has a great intro to the VFS layer, block storage architecture, and much more

@b0rk This one's a bit of a deep dive, but it takes a look, across 5 parts, at XFS, from the superblock to the disk layout, etc.

https://righteousit.wordpress.com/2018/05/21/xfs-part-1-superblock/

SANS Digital Forensics and Incident Response Blog | Understanding EXT4 (Part 1): Extents | SANS Institute

@adam820 @b0rk There are actually six parts-- I just wrote Part 6 more than a year after the first five. 😉

https://righteousit.wordpress.com/tag/xfs/

@hal_pomeranz @b0rk Ha, I thought I remembered there being six parts, but I was on my phone at the mall today and only followed it up to five!

Amazing write-up!

@b0rk A bit further back, but it's a great paper: Design and Implementation of the Second Extended Filesystem (ext2) http://web.mit.edu/tytso/www/linux/ext2intro.html

@b0rk Can the Linux operating system be used to reduce racial discrimination on university campuses?
@b0rk the book "Practical File System Design" by Dominic Giampaolo documents BFS, the filesystem used in BeOS. The author later designed APFS for Apple. So, not exactly Linux, but the ideas are not very different

@b0rk "Practical File System Design with the Be File System" is the best book I've seen on filesystems

there's a pdf linked on wiki, https://web.archive.org/web/20170213221835/http://www.nobius.org/~dbg/practical-file-system-design.pdf

@b0rk Rémy Card wrote a book on ext2
@b0rk I think @mwl is about to release “OpenBSD Mastery: Filesystems”. If it’s like his other books, it’ll be a great option for digging into UFS. https://www.tiltedwindmillpress.com/product/omf-print-ebook-bundle/
“OpenBSD Mastery: Filesystems” print and ebook bundle – ENDS 1 DECEMBER 2022