When two Hetzner servers died at the same time

On May 12, 2026, two of my Arch Linux + LUKS servers at Hetzner became unreachable at the same moment. Both had been running for 4+ months without issue. Both had received the same pacman -Syyu the day before, but had stayed on the old kernel until the morning the websites stopped responding. I rebooted — SSH never came back. nmap -Pn -p 22 showed filtered from anywhere. No ping. No banner. The Hetzner Robot panel insisted the hardware was fine.

Several hours went into hypotheses that turned out to be wrong:

  • The encryptssh initcpio hook referencing a /usr/lib/initcpio/udev/11-dm-initramfs.rules file that no longer exists. Real bug, no boot impact — the initramfs rebuilds anyway.
  • PermitRootLogin no in sshd_config. Real misconfiguration, fixed it, didn’t help. A refusing sshd shows closed, not filtered.
  • Predictable interface-naming drift after the systemd 260 upgrade. Patched the .network config to match by MAC. Useful hardening; not the cause.
  • Stale GRUB stage1 + core.img in the MBR. Arch never re-runs grub-install after a grub package upgrade. Refreshed it. Still filtered.
  • Kernel 7.0.5 regression. Downgraded to 6.18.3, the kernel that had run for 4 months. Still filtered. So the kernel itself wasn’t it either.

The clue was in the persistent journal: a single recorded boot from December 31 to May 12 10:13 UTC, and absolutely nothing after. Every reboot since the upgrade was failing before systemd-journald could flush to disk — so the failure had to be in the initramfs, before the root filesystem was even mounted.
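The check itself is one command. A minimal sketch, assuming you are in the Hetzner rescue system with the LUKS volume already opened and the root filesystem mounted at /mnt (the mount point is my assumption, not something installimage fixes):

```shell
# Assumption: rescue system, encrypted root unlocked and mounted at /mnt.
dir=/mnt/var/log/journal
if [ -d "$dir" ]; then
    # Lists every boot journald ever flushed to disk. If the newest
    # entry predates your reboot attempts, journald never started:
    # the failure is earlier than root-mount, i.e. in the initramfs.
    journalctl --directory="$dir" --list-boots
fi
```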

What it almost certainly was

Hetzner Dedicated servers configure the initramfs network with ip=dhcp on the kernel command line. That depends on Hetzner’s DHCP server replying to whatever request format the current kernel sends. Somewhere between kernel 6.18 / iproute2 6.18 and kernel 7.0 / iproute2 7.0, the request format changed enough that Hetzner’s DHCP stopped responding. Effects:

  • The old kernel, already running, kept the interface it had configured, which explains the 32 hours of healthy operation after the package upgrade.
  • The new kernel cold-boots, sends its DHCP request, never gets a lease; dropbear cannot listen, so port 22 stays filtered.

Hetzner’s own documentation has been quietly moving away from ip=dhcp toward static IPv4 in the kernel command line. The fix is exactly that:

GRUB_CMDLINE_LINUX="cryptdevice=/dev/md1:cryptroot ip=A.B.C.D::GATEWAY:255.255.255.255:hostname:eth0:none"

One line in /etc/default/grub, grub-mkconfig, reboot. No more dependency on Hetzner’s DHCP responding to whatever your current kernel sends.
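For reference, the field order of the ip= parameter, as documented in the kernel's nfsroot documentation (the comments below are a reading aid, not commands to run):

```shell
# Field order of the kernel's ip= parameter
# (Documentation/admin-guide/nfs/nfsroot.rst):
#
#   ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>
#
# <server-ip> matters only for NFS root and stays empty here;
# <autoconf>=none turns runtime autoconfiguration (DHCP/RARP) off.
# After editing /etc/default/grub:
#   grub-mkconfig -o /boot/grub/grub.cfg && reboot
```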

Why it matters for anyone running this stack

If you run Arch on Hetzner Dedicated with full-disk encryption and remote unlock via dropbear, the ip=dhcp shipped by installimage is a latent bug: it can keep working for years and then break overnight, on every machine you have, after a routine pacman -Syyu. The static-IP variant is what Hetzner now recommends, and it removes the dependency entirely.

Tooling

While debugging, I turned the whole rescue / chroot / diagnose / fix workflow into a Python CLI (hal) — including hal fix static-ip, which derives the static cmdline directly from your existing systemd-networkd .network file:

github.com/kevinveenbirkenbach/hetzner-arch-luks

Single command, idempotent, reversible (the original /etc/default/grub is backed up to .hal-backup). If you’re on this stack, switch to static IP before the next kernel upgrade catches you.
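The derivation hal performs can be sketched in a few lines of shell. The .network file below is a stand-in with documentation addresses, not a real config, and the hostname/interface in the output are placeholders; hal itself does more validation:

```shell
# Stand-in systemd-networkd file (203.0.113.x are documentation
# addresses); the real file lives under /etc/systemd/network/.
cat > /tmp/demo.network <<'EOF'
[Match]
MACAddress=aa:bb:cc:dd:ee:ff

[Network]
Address=203.0.113.10/32
Gateway=203.0.113.1
EOF

# Pull out the address (stripping the prefix length) and the gateway,
# then assemble the static ip= argument in nfsroot field order.
addr=$(sed -n 's|^Address=\([0-9.]*\)/.*|\1|p' /tmp/demo.network | head -n1)
gw=$(sed -n 's|^Gateway=\(.*\)|\1|p' /tmp/demo.network | head -n1)
echo "ip=${addr}::${gw}:255.255.255.255:myhost:eth0:none"
# -> ip=203.0.113.10::203.0.113.1:255.255.255.255:myhost:eth0:none
```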

#ArchLinux #bootFailure #debugging #DevOps #DHCP #Dropbear #fullDiskEncryption #GRUB #Hetzner #initramfs #kernelUpgrade #Linux #LUKS #mkinitcpio #pacman #postmortem #PythonCLI #serverOutage #sysadmin #systemdNetworkd
Anthropic officially declared Claude down across its iOS app, desktop platform, and API today. Downdetector registered over 7,000 immediate complaints as engineers failed to patch a massive login system crash. #Claude #Anthropic #TechNews #AI #ServerOutage
https://blazetrends.com/anthropic-confirms-massive-claude-outage-why-millions-are-locked-out-of-the-ai-today/?fsp_sid=974
Rainbow Six Siege outage alert! Servers went down today for maintenance, leaving players worldwide unable to connect. Check live updates and fixes now
https://techyquantum.com/rainbow-six-siege-outage-feb-2026/
#RainbowSixSiege #ServerOutage #GamingNews
Edmontonian Social is back up 🎉 And I learned a lesson or two during this adventure 🤦‍♂️
#EdmontonianSocial #ServerOutage #ServerUpgrade #GoneWrong

Reddit faced a massive global outage, blocking browsing and login for thousands and sparking frustration as communication lagged. Ongoing stability issues raise big concerns for the platform’s future.

#RedditOutage #RedditDown #TechIssues #PlatformCrash #ServerOutage #SocialMediaNews #TECHi

Read Full Article Here :- https://www.techi.com/is-reddit-down-today-global-outage/

Today was "huge global outage on the internet!"

lol, this had zero effect on my day. I was like: "huh, Signal is out" lol. And continued my day. Maybe because my entertainment and work setups run locally 😁

Don't rely on the cloud alone. It's all vapor after all 😉

#GlobalOutages #serveroutage #cloudservices #piracy

#Linux #Update #KDENeon #Ubuntu #ServerOutage

Morning.
Last night was another "technology that inspires" moment... Not inspired. Ubuntu is having a server outage (whether it still persists this morning, I'll only know once I've tried again) on a server that part of the KDE Neon updates run through, since Neon is built on Ubuntu. I first had to read the bug reports to figure that out. Why can't they put out one central news post for such a serious problem? 😤🫤
On top of that, Telekom also had an internet problem here in my region yesterday.
I sat cursing in front of my laptop yesterday until I figured out that my devices weren't to blame for the chaos... Mmpf.

Addendum: Just tried again; the update is still not possible because the Ubuntu server is still unreachable. Great. Well, not today then. I'll check back in a few days. 🤷‍♀️

Due to a recent power fluctuation in my house, the server hosting Edmontonian Social didn't power off and turn back on gracefully. Therefore, Edmontonian Social was down for a few hours. I just recently realized that Edmontonian Social was down and I should investigate 😅

Now that we are done with this unscheduled social media detox, I am happy to say that all systems are back up. I am glad that nothing exciting happened and we didn't need to roll back to a recent backup.

A few technical details:
- I didn't rush to turn it back on myself, since the server can power itself back on when AC power recovers. I then realized it hadn't come back up on its own, probably because automatically powering on during a power fluctuation could damage the storage devices.
- Due to how the Fediverse works, there was a surge of inbox activity in the first few seconds after the server came back up. A load that heavy, for a time that short, is almost adorable.

#ServerOutage
Just a heads up #Joomla peeps. Many Joomla official sites are down with Rochen having the hiccups this morning. We hope to be back ASAP.
#ServerOutage #Sorry #Glitch
Does anyone know what happened to swisstoots.ch? They seem to be down, and have been for over a week, but I haven't been able to find any information for a person who's been asking about it: it's like the admin just walked away and let the server fail without saying anything...
#SwissToots #outage #ServerOutage #Mastodon #meta