Your logging is probably down
Let’s tinker around and accidentally break something.
and debug it until you have to reinstall your entire stack from scarch
GET OUT OF MY HOUSE!
Are you implying it’s possible to debug without having to reinstall from scratch? Preposterous! 😂
Guess this is a good time to test my infrastructure automation.
Scarched arth

“Damn, I’ve got this Debian server shit down. I wonder how an openSUSE server would work out”

*installs Tumbleweed*

True story

My man person!
When’s the last time you checked if your backup solution works?
Yesterday! Switched my media server from FreeBSD to Alpine and got the arr stack all set up using the backup zip files.
Backup? Psh… That’s what the lab is for.
But if my backups actually work then I miss out on the joy of rebuilding everything from scratch and explaining to my wife why none of the lights in the house work anymore.
Carry around a candle in one of those old-timey holders like Scrooge McDuck
What’s a backup solution…? (I’m only being half sarcastic, I really need to set one up, but it’s not as “fun” as the rest of my homelab, open to suggestions)
No mercy for you, then. ;)
I at least have external backups for important family pics and docs! But yea the homelab itself is severely lacking. If it dies, I get to start from scratch. Been gambling for years that “I’ll get around to a backup solution before it dies”. I wouldn’t bet on me :|

logging is probably down

You do, of course, have a dedicated rsyslogd server? An isolated system to which logs are sent, so that if someone compromises another one of your systems, they can’t wipe traces of that compromise from those systems?

Oh. You don’t. Well, that’s okay. Not every lab can be complete. That Raspberry Pi over there in the corner isn’t actually doing anything, but it’s probably happy where it is. You know, being off, not doing anything.

Ah. The approach that [email protected] suggested. ;)

Thanks for the tutorial though.

Hmmm. My pi{VPN,hole,dhcp,HA} has a little bit of overhead left…

All of your systems are set up, but are they capable of being redeployed using a configuration management software package? Ansible or something like that?

Oh. They’re not. Well, that’s probably okay. I mean, you could probably go manually reproduce configurations, more or less.

You have an intrusion detection system set up, right? A server watching your network’s traffic, looking for signs that systems on your network have been compromised, and to warn you? Snort or something like that?

Oh. You don’t. Well, that’s probably okay. I mean, probably nothing on your network has been compromised. And probably nothing in the future will be.

Barring any hardware issues or external factors, will it run for 10000 years? Any logs not properly rotated? And other outputs accumulating and eventually filling up a filesystem?
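For the "eventually filling up a filesystem" part, a minimal sketch of a check you could run from cron, assuming Python is available on the box. The function names and the 90% threshold are made up for illustration; only `shutil.disk_usage` is standard library.

```python
import shutil


def usage_percent(path="/"):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_filling_up(path="/", threshold=90.0):
    """True when usage is at or above `threshold` percent (an arbitrary default)."""
    return usage_percent(path) >= threshold


if __name__ == "__main__":
    # Hypothetical usage: check the root filesystem and complain if it is nearly full.
    if is_filling_up("/"):
        print(f"WARNING: / is {usage_percent('/'):.1f}% full")
```

This only tells you that something is accumulating, not what. Unrotated logs are the usual suspect.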
Buy a UPS and set up a NUT server on the spare Raspberry Pi you have lying around.

All of those systems in your homelab…they aren’t all pulling down their updates multiple times over your network link, right? You’re making use of a network-wide cache? For Debian-family systems, something like Apt-Cacher NG?

Oh. You’re not. Well, that’s probably okay. I mean, not everyone can have their environment optimized to minimize network traffic.

I set this up years ago, but then decided it was better to just install different distros on each of my computers. Problem solved?
You can run Forgejo with a container index enabled, but I don’t know if there’s a way to use that as a proxy for downloading containers.

You have Squid or some other forward HTTP proxy set up to share a cache among all the devices on your network that access the Web, to minimize duplicate traffic?

And you have a shared caching DNS server set up locally, something like BIND?

Oh. You don’t. Well, that’s probably okay. I mean, it probably doesn’t matter that your devices are pulling duplicate copies of data down. Not everyone can have a network that minimizes latency and avoids inefficiency across devices.

That won’t work in most cases: HTTPS traffic isn’t cached unless you MITM it, which is a bad idea and not worth it.

Only cache updates. Those are worth it, and most update systems have a caching server option.

Then it turns out your monitoring system failed and FUCK IT’S BEEN A MONTH SINCE THE LAST PROPER BACKUP
Heartbeat notifications, man. “Yes I am online” email once a day or so. Yeah it’s more emails to delete but it can be a lifesaver.
but you probably won’t notice that some of the regular emails are not sent anymore
Couple it to your smart watch, backup every 10 seconds, and make it vibrate when successful
You are just making yourself learn to ignore that your smartwatch vibrates. It’s a bit like breathing and blinking: you are so used to it that you can completely forget it’s happening. If your smartwatch, or phone, or whatever, starts vibrating all the time, you will get used to it and not notice when it stops happening, and it will also hide any actually meaningful notification.

Oh, but I have them!
Every day an email is sent out with the backup status.
Every day I got my email in the morning with the back up logs.
For years.
I associated “email received” with “backup successful”, until a month or so ago, when my VPN broke and the emails were just “could not connect”. It took me a while to bother actually opening the message body, as it had always been the same for years.

So I’ll manage it differently, have the email subject be more explicit about a success or a failure amongst other things.
Always learning :^)
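A rough sketch of both lessons from this sub-thread: put the outcome in the subject line, and have something else notice when the heartbeat goes quiet. Everything here is hypothetical (the function names, the `[backup OK]` format, the 26-hour window); it isn’t anyone’s actual setup.

```python
from datetime import datetime, timedelta


def backup_subject(job, exit_code):
    """Build an email subject that states success or failure up front."""
    status = "OK" if exit_code == 0 else "FAILED"
    return f"[backup {status}] {job}"


def heartbeat_is_stale(last_seen, now, max_age=timedelta(hours=26)):
    """True when the last heartbeat is older than `max_age`.

    The 26-hour default is an assumption: a daily job plus some slack.
    """
    return now - last_seen > max_age


if __name__ == "__main__":
    print(backup_subject("nas-nightly", 1))
    last = datetime(2024, 1, 1, 6, 0)
    print(heartbeat_is_stale(last, datetime(2024, 1, 3, 6, 0)))
```

The staleness check needs to run somewhere other than the machine being watched, otherwise it dies along with the thing it monitors.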

Do your backups work?

Have you tested your backups recently? Having them complete is one thing, having the data you need for recovery is another. Have you backed up your vm configurations and build scripts?

Go test your latest backup!
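One way to make “test your backup” concrete, as a minimal sketch: restore into a scratch directory and compare it against the live data by SHA-256. The helper names are invented for illustration, and this assumes the source hasn’t changed since the backup ran (otherwise you’ll get false alarms).

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(source, restored):
    """Return relative paths that are missing from, or differ in, `restored`."""
    source, restored = Path(source), Path(restored)
    problems = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        dst = restored / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(rel.as_posix())
    return problems
```

An empty list means every file in the source made it through the round trip. It won’t catch files that exist only in the restore.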

Restore is future me’s problem. Fuck that guy :D
Ah, that frisson of excitement when you come to restore! Will it work? Does it contain that very important file? Is it up to date? How much will future you hate past you if it isn’t there?

You have remote power management set up for the systems in your homelab, right? A server set up that you can reach to power-cycle other servers, so that if they wedge in some unusable state and you can’t be physically there, you can still reboot them? A managed/smart PDU or something like that? Something like one of these guys?

Oh. You don’t. Well, that’s probably okay. I mean, nothing will probably go wrong and render a device in need of being forcibly rebooted when you’re physically away from home.

Does a $12 Shelly plug count?
If you can cycle your Home Assistant with the Shelly plug whilst your Home Assistant is down, yes. From experience, it’s really quite annoying to have a smart plug switch off HA…
HA is on the same proxmox host as the router. So yeah I can end up locked out. Hasn’t happened yet tho! The relay is on my test machine, it’s always nvidia that crashes there.

An 8 switch relay, old Pi, and 8 hardware store outlets can be had for not much more. I did that and let PiKVM control my outlets directly.

This is the back of my 10" rack before it was cleaned up. Lots of custom work on this that I’ll be posting a page on my site about when complete.

@[email protected] in case you are interested

The Shelly can be configured to automatically turn back on after a certain amount of time. It has local scripting capabilities.

If they did that… I don’t know.

If you do have the smart PDU and power management server, you probably also went down the rabbit hole of scripting the power cycling, right? Maybe hardened that server against power-loss disk corruption so it can be run until UPS battery exhaustion.

What if there is a power outage and NUT shuts everything down? Would be nice to have everything brought back up in an orderly way when power returns. Without manual intervention. But keeping you informed via logging and push notifications.

Oh. You don’t. Well, that’s probably okay. I mean, nothing will probably go wrong and render a device in need of being forcibly rebooted when you’re physically away from home.

*furiously adds a new item to the TODO list*

Tal just got the chaotic evil tag today.
I built an 8 outlet version of those with relays and wall outlets for… a lot less.
You should use Arch, then you can update every 15 minutes 🤭
I haven’t messed much with my servers in 2 years. I think that means I’ll hit my ROI in another 5 :)
Have you tried introducing unnecessary complexity?
If you know how your setup works, then that’s a great time for another project that breaks everything.

Saturday morning: “Incus and podman seem interesting. I bet I could swap everything over while the family is out this afternoon”

Sunday evening: “Dad, when will the lights work again?”

“Dad, when will the lights work again?”

As soon as SELinux decides I have permission.

The old lighting wasn’t that great anyway. If I were to just put lighting on a DMX512-controlled network, then all of it could be synchronized to whole-house audio…

Don’t forget to integrate it into Home Assistant so you can alert the ISS when the mail man is on the porch.
Infrastructure diagram? No! In this homelab we refer to the infrastructure hyperdodecahedron.

It seems like a good time to learn Graphviz’s dot format for network layout diagrams, with automated layout.

mamchenkov.net/…/graphviz-dot-erds-network-diagra…

Using Graphviz dot for ERDs, network diagrams and more - Leonid Mamchenkov
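For a taste of the dot format, here’s a minimal sketch that turns a list of links into an undirected graph. The hostnames and the `to_dot` helper are made up, and it assumes node names contain no double quotes.

```python
def to_dot(links, name="homelab"):
    """Render (node_a, node_b) pairs as an undirected Graphviz graph."""
    lines = [f"graph {name} {{"]
    for a, b in links:
        # Quote node names so hyphens and dots in hostnames are accepted.
        lines.append(f'    "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical topology, purely for illustration.
    print(to_dot([("router", "switch"), ("switch", "nas"), ("switch", "pi-hole")]))
```

Save the output as `network.dot` and run `dot -Tsvg network.dot -o network.svg`; Graphviz handles the layout.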
