Today in "It was DNS" news.
External connectivity from my Kubernetes cluster wasn't working and I couldn't figure out why.
Something in a random pod would try to resolve www.example.com and it'd get back the IP of my external connection.
Pause here if you want to figure this out yourself.
I have dynamic DNS set up with a wildcard record, so anything.my.domain resolves to the same IP as my.domain.
I also use my.domain as the root of everything internal to my network, so if I have some-service.my.domain set to point to some internal IP, I can use some basic reverse proxying to allow HTTP access externally.
However, the place where those internal names are registered has changed: instead of living on my Samba AD cluster, they're now on the router, since I deleted the Samba AD cluster as it wasn't doing anything useful for me (and wasn't going to).
The router doesn't consider itself authoritative for my.domain, though, so it forwards those queries externally. If I resolve nonexistent-host.my.domain, the query gets passed upstream, matches the wildcard, and comes back with the IP of my external connection.
But this was happening for nearly any domain inside Kubernetes, not just obviously incorrect ones.
Why? Because Kubernetes sets "options ndots:5" in every pod's resolv.conf and appends my.domain to the search list, so any name with fewer than five dots is tried as short.name.my.domain before it's tried as short.name.
This obviously caused a lot of problems, as thanks to the wildcard, short.name.my.domain always resolved to an IP.
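The search-list expansion can be sketched like this (a simplified model of glibc's behaviour; the real Kubernetes search list also includes entries like default.svc.cluster.local, and trailing-dot names skip the search entirely):

```python
def candidate_names(name, search=("my.domain",), ndots=5):
    """Rough order in which a resolver tries names, per resolv.conf
    ndots/search semantics. Not a real resolver, just the ordering."""
    # A name ending in "." is fully qualified and is tried as-is.
    if name.endswith("."):
        return [name]
    # With fewer than ndots dots, search suffixes are tried first.
    if name.count(".") < ndots:
        return [f"{name}.{s}" for s in search] + [name]
    return [name]

# Even a perfectly normal external name gets the suffix treatment:
print(candidate_names("www.example.com"))
# → ['www.example.com.my.domain', 'www.example.com']
```

So with a wildcard answering for *.my.domain, the first candidate succeeds and the real name is never tried.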
I fixed this by blocklisting my.domain on the router. It turns out Unbound on OPNsense resolves names it knows about before applying blocklists, so this works as expected without having to convince the router that it owns the domain.
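In raw Unbound terms, the equivalent setup looks something like this (a sketch; OPNsense builds its config through the UI, and the hostname and IP here are made up):

```
server:
  # Return NXDOMAIN for anything under my.domain by default,
  # so queries never get forwarded upstream to hit the wildcard.
  local-zone: "my.domain." always_nxdomain

  # Known hosts (host overrides / DHCP registrations) are
  # local-data entries, which win over the zone's nxdomain default.
  local-data: "some-service.my.domain. IN A 192.168.1.10"
```

The key property is that local-data takes precedence within the zone, so registered names keep resolving while everything else under my.domain fails fast locally.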
Sigh.
At least things are working now.
#kubernetes #itwasdns #dns #CursedHomelab