When two Hetzner servers died at the same time

On May 12, 2026, two of my Arch Linux + LUKS servers at Hetzner became unreachable at the same moment. Both had been running for 4+ months without issue. Both had received the same pacman -Syyu the day before, but had stayed on the old kernel until the morning the websites stopped responding. I rebooted — SSH never came back. nmap -Pn -p 22 showed filtered from anywhere. No ping. No banner. The Hetzner Robot panel insisted the hardware was fine.

Several hours went into hypotheses that turned out to be wrong:

  • The encryptssh initcpio hook referencing a /usr/lib/initcpio/udev/11-dm-initramfs.rules file that no longer exists. Real bug, no boot impact — the initramfs rebuilds anyway.
  • PermitRootLogin no in sshd_config. Real misconfiguration, fixed it, didn’t help. A refusing sshd shows closed, not filtered.
  • Predictable interface-naming drift after the systemd 260 upgrade. Patched the .network config to match by MAC. Useful hardening; not the cause.
  • Stale GRUB stage1 + core.img in the MBR. Arch never re-runs grub-install after a grub package upgrade. Refreshed it. Still filtered.
  • Kernel 7.0.5 regression. Downgraded to 6.18.3, the kernel that had run for 4 months. Still filtered. So the kernel itself wasn’t it either.
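The closed-versus-filtered distinction from the sshd bullet can be checked without nmap. The following is a minimal sketch, not nmap's own logic: it assumes a plain TCP connect is enough, and the function name classify_port is made up for illustration. A refused connection means something on the host answered with a reset (closed); no answer at all within the timeout is what nmap reports as filtered.

```python
import socket


def classify_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP port as 'open', 'closed', or 'filtered' via a plain connect.

    Hypothetical helper for illustration; nmap uses more refined probes.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"          # handshake completed, something is listening
    except ConnectionRefusedError:
        return "closed"            # host is up and actively rejected the connection
    except OSError:
        # Timeouts, "no route to host", "network unreachable": no reply from
        # any service, which is the situation described as "filtered" above.
        return "filtered"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address; substitute your server's IP.
    print(classify_port("192.0.2.10", 22))
```

A machine that never got an IP address in the initramfs answers nothing, so this probe would time out and report filtered, regardless of how sshd or dropbear is configured.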

The clue was in the persistent journal: a single recorded boot from December 31 to May 12 10:13 UTC, and absolutely nothing after. Every reboot since the upgrade was failing before systemd-journald could flush to disk — so the failure had to be in the initramfs, before the root filesystem was even mounted.

What it almost certainly was

Hetzner Dedicated servers configure the initramfs network with ip=dhcp on the kernel command line. That depends on Hetzner’s DHCP server replying to whatever request format the current kernel sends. Somewhere between kernel 6.18 / iproute2 6.18 and kernel 7.0 / iproute2 7.0, the request format changed enough that Hetzner’s DHCP stopped responding. Effects:

  • Old kernel at runtime kept the interface already configured (Phase A — 32 hours of healthy operation after the package upgrade).
  • New kernel cold-boots, hits DHCP, never gets an IP, dropbear cannot listen, port 22 stays filtered.

Hetzner’s own documentation has been quietly moving away from ip=dhcp toward static IPv4 in the kernel command line. The fix is exactly that:

GRUB_CMDLINE_LINUX="cryptdevice=/dev/md1:cryptroot ip=A.B.C.D::GATEWAY:255.255.255.255:hostname:eth0:none"

One line in /etc/default/grub (with A.B.C.D, GATEWAY and hostname replaced by your own values), then grub-mkconfig -o /boot/grub/grub.cfg, reboot. No more dependency on Hetzner’s DHCP responding to whatever your current kernel sends.
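To avoid typos in the colon-separated fields, the parameter can be assembled programmatically. This is a small sketch under stated assumptions: the function name build_ip_param and the example addresses (taken from the documentation range 203.0.113.0/24) are mine, and the field order follows the kernel's ip=client:server:gateway:netmask:hostname:device:autoconf convention used in the line above, with the server field left empty.

```python
import ipaddress


def build_ip_param(address_cidr: str, gateway: str, hostname: str,
                   device: str = "eth0") -> str:
    """Build a static ip= kernel parameter from an IPv4 address in CIDR form.

    Hypothetical helper for illustration only.
    """
    iface = ipaddress.ip_interface(address_cidr)
    if iface.version != 4:
        raise ValueError("this sketch only handles IPv4")
    # Validate the gateway; raises ValueError if it is not an IP address.
    gw = ipaddress.ip_address(gateway)
    # Server field stays empty, autoconf is 'none' (no DHCP at all).
    return f"ip={iface.ip}::{gw}:{iface.netmask}:{hostname}:{device}:none"


if __name__ == "__main__":
    print(build_ip_param("203.0.113.10/32", "203.0.113.1", "myhost"))
```

With a /32 address, as in the example line above, the netmask field comes out as 255.255.255.255.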

Why it matters for anyone running this stack

If you run Arch on Hetzner Dedicated with full-disk encryption and remote unlock via dropbear, the ip=dhcp shipped by installimage is a latent bug. It can keep working for years and then break overnight, on every machine you have, after a routine pacman -Syyu. The static-IP version is what Hetzner now recommends and removes the entire dependency.

Tooling

While debugging, I turned the whole rescue / chroot / diagnose / fix workflow into a Python CLI (hal) — including hal fix static-ip, which derives the static cmdline directly from your existing systemd-networkd .network file:

github.com/kevinveenbirkenbach/hetzner-arch-luks

Single command, idempotent, reversible (the original /etc/default/grub is backed up to .hal-backup). If you’re on this stack, switch to static IP before the next kernel upgrade catches you.
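For readers who want to see roughly what "deriving the values from a .network file" involves, here is a minimal sketch. It is not hal's actual implementation: the function names are hypothetical, and it assumes the IPv4 address and gateway are given as Address= and Gateway= keys in the [Network] section, which will not match every setup (for example files that use separate [Address] or [Route] sections).

```python
def parse_network_file(text: str) -> dict:
    """Collect Key=Value pairs per section; repeated keys accumulate in lists."""
    sections: dict = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue                      # skip blanks and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, {})
            continue
        if current is None or "=" not in line:
            continue
        key, value = line.split("=", 1)
        sections[current].setdefault(key.strip(), []).append(value.strip())
    return sections


def first_ipv4_settings(text: str) -> tuple:
    """Return (address_with_prefix, gateway) for the first IPv4 entries found."""
    net = parse_network_file(text).get("Network", {})
    # A colon marks an IPv6 value, so keep only the entries without one.
    address = next((a for a in net.get("Address", []) if ":" not in a), None)
    gateway = next((g for g in net.get("Gateway", []) if ":" not in g), None)
    if address is None or gateway is None:
        raise ValueError("no IPv4 Address=/Gateway= found in [Network]")
    return address, gateway


# Example file content with documentation addresses (assumed layout).
SAMPLE = """\
[Match]
Name=eth0

[Network]
Address=203.0.113.10/32
Address=2001:db8::2/64
Gateway=203.0.113.1
Gateway=fe80::1
"""

if __name__ == "__main__":
    print(first_ipv4_settings(SAMPLE))
```

The resulting pair is what you would then plug into the ip= field of GRUB_CMDLINE_LINUX.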

#ArchLinux #bootFailure #debugging #DevOps #DHCP #Dropbear #fullDiskEncryption #GRUB #Hetzner #initramfs #kernelUpgrade #Linux #LUKS #mkinitcpio #pacman #postmortem #PythonCLI #serverOutage #sysadmin #systemdNetworkd

Another day, another IPv6 question. I'm on a Hetzner cloud VM. I have a static IPv6 /64 subnet / prefix¹. The system uses systemd-networkd for network management.

My eth0 does not seem to receive any RA at all but I'd like to delegate the prefix downstream to a bridge interface br0 on the same host. I have an IPv6Prefix section on eth0 with the Prefix=…, Assign=yes and Token=static:::1. It works, eth0 gets the ::1 address for this prefix.

Is there a way for br0 to get this prefix too without eth0 joining the bridge? And to announce the prefix to any interfaces joining the bridge (e.g. LX system containers)?

¹ Is this the same? I have the impression these terms are used interchangeably with IPv6.

#ipv6 #systemd #networkd #systemdNetworkd

How RUTUBE's CDN works: hardware, network, software

Hi, Habr! My name is Dmitry Ivanov. I head the IT infrastructure operations department at RUTUBE, which in our terms translates to SRE team lead. In this article I'll break down the content delivery problem and describe the solutions that help us at RUTUBE. Given: on one side we have 17.7 million daily users, and on the other 400 million pieces of content. Both figures keep growing, and the geographic spread of our users keeps expanding. Required: show all of our users videos from the library quickly, reliably, and efficiently.

https://habr.com/ru/companies/oleg-bunin/articles/919360/

#cdn #rutube #кэширование #балансировка #видео #numa #psi #anycast #geoip #systemdnetworkd

For those who specialize in DHCPv6 and systemd: is there a way to tell the DHCPv6 server "if this IP is available, just give me that one and nothing else", or at least to get systemd to behave that way? I have an Oracle Cloud instance running Arch + systemd-networkd that uses DHCPv6 for IP configuration, and two IPv6 addresses are assigned to it. I want the instance to use only one of them and leave the other unused, so that I can do NDP proxying and route it to my laptop over WireGuard, giving my laptop a public IPv6 address. However, Oracle appears to be forcing my VPS to use both IPv6 addresses, which is not what I want.
Redacted logs, for context:

Jun 18 06:08:27 somewhere systemd-networkd[-1]: eth0: DHCPv6 address 2000::4201/128 (valid for 1d 59min 59s, preferred for 23h 59min 59s)
Jun 18 06:08:27 somewhere systemd-networkd[-1]: eth0: DHCPv6 address 2000::1337/128 (valid for 1d 59min 59s, preferred for 23h 59min 59s)

Feel free to boost this for increased visibility if you wish, and if you know of any mailing lists or IRC channels I should ask on, please let me know.
Relevant tags to try to help people who might know something see this:
#dhcp #ipv6 #systemd #oracle #dhcpv6 #networking #systemdnetworkd #systemd-networkd

Netplan | Canonical Netplan

Backend-agnostic network configuration in YAML.

Transparent traffic tunneling with routing based on IP geolocation

In this article I'll try to explain how to set up an additional default gateway on a home network and configure selective routing on it based on a list of subnets. By using an IP geolocation database as that list, you can redirect traffic depending on the destination country.

https://habr.com/ru/articles/854112/

#vpn #iptables #iproute2 #ipset #systemdnetworkd #маршрутизация


Why is #Debian still using something as archaic as the #ifupdown scripts? There are modern alternatives like #systemdnetworkd, #NetworkManager or #Netplan.
I'd much rather see Debian use one of these as the default.