hey so here's a cool fun new thing to add to your threat model

something that polls your UPS to measure voltage is somehow, inadvertently, causing the network management card in it to puke. when that card reboots or crashes, it takes the UPS down with it (the UPS reboots too), and everything hung off that UPS loses power. which includes your dns server.

so fucktardian windows machines and android devices that think the network is down if they can't resolve dns all disconnect from the lan

@Viss

crap like this was why, when i was the architect for a large ISP's DNS, i mandated that all DNS servers were clusters, were anycasted internally, and had dual power supplies, with each machine's two supplies on separate UPS-backed A/B power feeds.

so many things shit the bed when they couldn't resolve something in DNS. it was easier just to grossly overbuild the DNS infra than try to get that many vendors to fix that many broken things.

@paul_ipv6 @Viss the architect in our ISP did this too, as well as having an air raid siren in the office that went off when DNS resolution was flaky in the network

It didn't help that we were still updating the bind.conf manually with vi over ssh. We also had a small enough fleet that we knew the IP blocks by heart

@webhat @paul_ipv6 so my problem was that the firewall wasn't set up correctly here. it was SUPPOSED to be acting as a caching/forwarding resolver (fuckin unbound on pfsense) - but it wasn't. the idea was that if the main dns server (adguard) went offline for a reboot (or the ups shit itself) the firewall would still have a cache for a while and dns wouldn't go offline

but nooooo
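
for anyone who wants the intended behavior spelled out, here's a rough python sketch of a forwarder that keeps answering from cache while the upstream is dead. this is a toy, not what unbound actually does internally (unbound has a serve-expired option that covers roughly this case). the class name, the ttl/max_stale numbers and the upstream callable are all made up for illustration.

```python
import time


class StaleCache:
    """Toy forwarding cache: serves stale answers when the upstream is down."""

    def __init__(self, upstream, ttl=300, max_stale=86400, clock=time.monotonic):
        self.upstream = upstream      # hypothetical callable: name -> answer, raises OSError when down
        self.ttl = ttl                # seconds an answer counts as fresh
        self.max_stale = max_stale    # extra seconds a stale answer may still be served
        self.clock = clock            # injectable clock, handy for testing
        self.entries = {}             # name -> (answer, time stored)

    def resolve(self, name):
        now = self.clock()
        hit = self.entries.get(name)
        # fresh cache hit: answer without touching the upstream
        if hit and now - hit[1] < self.ttl:
            return hit[0]
        try:
            answer = self.upstream(name)
        except OSError:
            # upstream is down: fall back to the stale answer if it is not too old
            if hit and now - hit[1] < self.ttl + self.max_stale:
                return hit[0]
            raise
        self.entries[name] = (answer, now)
        return answer
```

note this only helps for names that were already looked up before the upstream died. anything never cached still fails.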

@Viss @webhat

most FWs still use dnsmasq as their DNS and don't really let you fully configure it. they expose a small subset to the GUI.

what you want is a DNS server with a small set of hard coded zones for local stuff so that you don't ever actually go to root and down the tree to resolve things inside the firewall.

it's stunningly hard to find any commercial kit that just does that.
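
the lookup order i mean, as a minimal python sketch: check the hard coded local zones first, and only hand a name to the forwarder if it isn't under one of them. the zone name, the records and the forward callable are hypothetical placeholders, not any product's config format.

```python
# hypothetical local zone data: zone -> {fully qualified name -> address}
LOCAL_ZONES = {
    "home.arpa.": {
        "nas.home.arpa.": "192.0.2.10",
        "printer.home.arpa.": "192.0.2.20",
    },
}


def resolve(name, forward):
    """Answer local names from the static table; forward everything else."""
    fqdn = name.lower().rstrip(".") + "."
    for zone, records in LOCAL_ZONES.items():
        if fqdn == zone or fqdn.endswith("." + zone):
            # inside a local zone: answer from the table or return None
            # (think NXDOMAIN); the query never leaves the box
            return records.get(fqdn)
    # not a local name: hand it to the upstream resolver
    return forward(fqdn)
```

the point is that a typo'd or missing local name gets a local "no" instead of leaking out to the roots, and local names keep resolving even when the upstream is unreachable.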