1/6
I’ve been thinking through disaster recovery with #NixOS #flakes, impermanence, and #ZFS.
A flake defines a system declaratively, but assumes storage (e.g. ZFS pools) and persisted data (e.g. /persist) already exist. Flakes don’t create storage—they only configure it. Right?
2/6
So the real workflow is:
Bootstrap (ISO or nixos-infect)
Create storage (manually/script)
Apply flake
System activates and redirects state into /persist
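That workflow can be sketched as a script. A minimal sketch, assuming a single-disk ZFS layout (pool `rpool`, a `persist` dataset) and a made-up flake URL; the `run` helper only records each command into a plan file, so nothing here touches a disk:

```shell
#!/usr/bin/env sh
# Sketch of the bootstrap workflow above. The disk path, pool layout,
# and flake URL are placeholders, not taken from the thread.
PLAN="${TMPDIR:-/tmp}/bootstrap.plan"
: > "$PLAN"

# Dry-run helper: record each step instead of executing it.
run() { echo "+ $*" >> "$PLAN"; }

DISK=/dev/disk/by-id/CHANGE_ME

# 1. (Already booted from the ISO, or after nixos-infect.)
# 2. Create storage manually/by script.
run zpool create -O mountpoint=none rpool "$DISK"
run zfs create -o mountpoint=legacy rpool/root
run zfs create -o mountpoint=legacy rpool/persist
# 3. Apply the flake; activation then redirects state into /persist
#    (e.g. via the impermanence module).
run nixos-install --flake github:example/nixos-config#myhost

cat "$PLAN"
```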
3/6
Key tension:
NixOS generations = reproducible system state
/persist = survives outside that model
Flake = versioned (Git commit)
So a working system is actually a pair:
→ (flake commit, persist state)
4/6
Disaster recovery isn’t just “rebuild the flake”
It’s restoring a known-good pair
So we need to track:
→ flake revision ↔ persist state compatibility
5/6
Possible approach:
Version /persist (e.g. Git, selectively)
On each rebuild:
Record flake commit
Commit persist state
Maintain mapping: → flake commit ↔ persist commit
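One way to maintain that mapping is an append-only log written from a rebuild hook. A sketch, assuming both the flake repo and /persist are Git-versioned; the log path, three-column format, and example hashes are all made up:

```shell
#!/usr/bin/env sh
# Append-only log of (flake commit, persist commit) pairs.
# STATE_LOG location and the format are assumptions, not a NixOS mechanism.
STATE_LOG="${TMPDIR:-/tmp}/state-pairs.log"

record_pair() {
  # $1 = flake commit, $2 = persist commit
  printf '%s %s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$2" >> "$STATE_LOG"
}

# In a real rebuild hook the hashes would come from Git, e.g.:
#   record_pair "$(git -C /etc/nixos rev-parse HEAD)" \
#               "$(git -C /persist rev-parse HEAD)"
record_pair abc1234 def5678
```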
6/6
Recovery becomes:
Recreate storage
Restore /persist
Identify its version
Use matching flake commit
Rebuild
So the system = a coherent pairing of config + state
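The "identify its version → use matching flake commit" step then becomes a lookup against that log. A sketch with a fabricated log (hashes and dataset names are placeholders):

```shell
#!/usr/bin/env sh
# Look up the flake commit recorded for a given persist commit.
STATE_LOG="${TMPDIR:-/tmp}/state-pairs.log"
printf '%s\n' \
  '2024-01-01T00:00:00Z aaa1111 bbb2222' \
  '2024-02-01T00:00:00Z ccc3333 ddd4444' > "$STATE_LOG"

# Columns: timestamp, flake commit, persist commit.
lookup_flake_rev() {
  awk -v p="$1" '$3 == p { print $2 }' "$STATE_LOG" | tail -n 1
}

match=$(lookup_flake_rev ddd4444)
echo "rebuild with flake revision: $match"
# ...then something like:
#   nixos-rebuild switch --flake "github:example/nixos-config?rev=$match#myhost"
```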
Is there a standard way people handle this in production?
Or is this a job for ZFS snapshots?
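On the ZFS-snapshot angle, one lightweight option (an assumption, not established practice) is to encode the flake revision in the snapshot name, so each persist snapshot carries its compatible flake commit:

```shell
#!/usr/bin/env sh
# Tag a persist snapshot with the flake revision it was taken under.
# rpool/persist and the @flake-<rev> naming scheme are assumptions;
# the zfs command is only echoed, not executed.
flake_rev=abc1234   # in practice: $(git -C /etc/nixos rev-parse --short HEAD)
snap="rpool/persist@flake-$flake_rev"
echo "would run: zfs snapshot $snap"
```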

For each host, recovery must restore not just configuration, but a compatible pair:

(flake revision, persist snapshot)

The trusted recovery unit is therefore not “latest flake” alone, but a
known-compatible pairing of system definition and persisted state.

@adingbatponder you can use flakes to create storage with Disko: https://github.com/nix-community/disko

However, I've found it can make recovery harder: when the on-disk layout drifts from the declaration, Disko tends to push you toward wiping and repartitioning.
