Unpopular opinion: I have fought with ZFS under Talos for months, but in reality what I needed was Longhorn.

Yeah, yeah, I know, different things. But that's just to say that ZFS is not the silver bullet that some people try to convince you of.

#HomeLab #TalosLinux #ZFS #Longhorn @homelab

Yeah, I'm plenty aware that using networked volumes with Kubernetes is the better way to go, but I gotta hand it to Longhorn: the distributed replicas make it a breeze to move stuff around and do physical maintenance on the nodes. 👍

#HomeLab #TalosLinux #Longhorn @homelab

Let's build our own #cloud #onprem with #suse #harvester and #rancher and #longhorn ... Much to learn I still have ... This is just the beginning!
Updated my #homelab to #k3s 1.35, deployed #vaultwarden, #pihole with #unbound and #forgejo with #woodpecker. All the web GUIs have #letsencrypt certificates. Storage redundancy is achieved by #longhorn, which maintains redundant copies of PVCs across the nodes. The nodes are VMs controlled by #incus, so I can shut down the whole k8s thingy with a simple incus stop command when I'm done playing.
k3s is really nice, easy to deploy and well documented. Getting things to run in k8s, though, is mainly pasting YAML from the internet and hoping for the best.

So it appears that, for the last several months, my MTU configuration was REALLY wrong

The first hint was the immich Longhorn replica not rebuilding, but what I hadn't connected was the extreme slowness of any service using a db cluster whose master node wasn't in the same region

I had blamed the slowness on packets hopping a lot between regions, but it turns out it was just db requests exceeding the configured MTU and getting silently dropped

Now that BOTH the WireGuard and flannel MTU values are set properly, everything is so damn snappy

This feels like new skin

#homelab #selfhosted #selfhosting #wireguard #mtu #vpn #mesh #longhorn #immich #flannel #devops #linux #opensource #networking
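For reference, the MTU arithmetic at play looks roughly like this. A back-of-the-envelope sketch, assuming a 1500-byte Ethernet underlay, WireGuard's ~80-byte worst-case (IPv6) encapsulation overhead, and flannel's 50-byte VXLAN overhead; the posts don't give the exact numbers that were used:

```python
# MTU budget when tunneling flannel traffic over WireGuard (assumed values).
PHYSICAL_MTU = 1500          # typical Ethernet MTU on the underlay network
WIREGUARD_OVERHEAD = 80      # worst case: IPv6 underlay (60 bytes on IPv4)
VXLAN_OVERHEAD = 50          # flannel's default VXLAN encapsulation

# The WireGuard interface must fit inside the physical MTU...
wireguard_mtu = PHYSICAL_MTU - WIREGUARD_OVERHEAD   # 1420

# ...and flannel's interface must fit inside the WireGuard tunnel.
flannel_mtu = wireguard_mtu - VXLAN_OVERHEAD        # 1370

print(f"WireGuard MTU: {wireguard_mtu}, flannel MTU: {flannel_mtu}")
```

If the inner interfaces are left at 1500, any near-full-size packet (like a big db response) overflows the tunnel and can be dropped silently, which matches the symptoms described above.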

For the last ~6 months, my immich Longhorn PVC wouldn't rebuild replicas across regions; it would time out instead

Today, I figured out I had botched the MTU configuration for the WireGuard network under k3s...

So some packets were getting dropped silently...

Woops

#kubernetes #k3s #longhorn #network #networking #wireguard #wg #mesh #homelab #selfhosted #selfhosting #mtu

Using #django and #longhorn RWX volumes: when I run collectstatic with static files on an RWX volume, it tries to write some 900 files in short succession, and that kills the RWX volume.
The same thing happens when migrations create folders. I get I/O errors.
I solved it with a custom Django storage class that retries failed writes on the RWX volume. One retry is enough and it works.

So am I doing it wrong?

PS: you could argue that static files belong in an emptyDir. The underlying issue still gives me headaches, though

#kubernetes
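The retry approach described above could be sketched like this. A minimal, hypothetical example: `retry_io` and `RetryingStorage` are made-up names, not from the post, and the real storage class isn't shown:

```python
import time


def retry_io(func, attempts=2, delay=0.1):
    """Call func(), retrying on OSError (covers the I/O errors seen on the RWX volume)."""
    for attempt in range(attempts):
        try:
            return func()
        except OSError:
            # Re-raise on the final attempt; otherwise back off briefly and retry.
            if attempt == attempts - 1:
                raise
            time.sleep(delay)


# In a Django project, the helper would wrap the storage backend's write path,
# roughly (hypothetical sketch, not runnable without Django):
#
#   from django.core.files.storage import FileSystemStorage
#
#   class RetryingStorage(FileSystemStorage):
#       def _save(self, name, content):
#           return retry_io(lambda: FileSystemStorage._save(self, name, content))
```

With `attempts=2` this matches the observation that a single retry is enough to get past the transient I/O errors.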

Show scheduled on Sept. 11 has #SexPistols making explosive return to the #Longhorn Ballroom in #Dallas for first time since 1978, this time sans Johnny Rotten

https://www.dallasnews.com/arts-entertainment/music/2026/03/11/sex-pistols-return-to-longhorn-ballroom-dallas/

Sex Pistols to return to Dallas' Longhorn Ballroom in 2026

The refurbished Sex Pistols will make their long-awaited return to the Longhorn Ballroom in Dallas on Sept. 11.

The Dallas Morning News

I've been a little rough and irresponsible with my #baremetal #Kubernetes cluster, especially when it comes to randomly rebooting nodes. Today I fixed that.

I'm running a bunch of somewhat delicate workloads, including database clusters with CSIs like #Longhorn and #OpenEBS. Checking that everything is in working order has been a demanding task, and often something I've skipped before rebooting or upgrading nodes - occasionally with horrific results.

Last night I finally took the time and wrote a pretty thorough script that checks that everything is working and healthy, before politely cordoning off a node, draining it and applying upgrades.

I felt so confident today that I tested it by running this new safe upgrade script for all the nodes in the cluster - and it worked! All nodes are now fully upgraded and running kernel 6.12.73 on Debian 13.

This also fixes the outstanding issue caused by #Hetzner no longer supporting obtaining IP addresses through DHCP.

#Linux #MSTDNDK #K8s

I accidentally had a QLC drive (that's the really shit kind) in an NVMe RAID array hosting a Longhorn cluster. For longer than I'd care to admit, I could not understand why it would regularly shit itself. It didn't help that Longhorn 1.11.0 has a memory leak, and OOMs were triggering replica rebuilds on a weekly basis.

#selfhosted #k8s #longhorn