Oh, this is #fun.

#Applebot - Apple's web crawler, used for various things - is ignoring robots.txt rules governing crawling of websites.

I have Applebot (and Applebot-Extended, which isn't really a crawler) in my robots.txt files, set to disallow all access. Has been that way for #yonks.
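For anyone who wants to sanity-check their own rules: here's a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content below is hypothetical (mine isn't reproduced here), and `is_allowed` is just a helper name made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, NOT the author's actual file.
ROBOTS_TXT = """\
User-agent: Applebot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Pass the crawler's product token (e.g. `Applebot`), not the full User-Agent header; the standard-library parser only looks at the part before the first `/`.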

And Applebot is consistently the highest-traffic crawler to my sites - at least of the ones that actually bother to fetch robots.txt. Yesterday, for example, Applebot fetched robots.txt from one of my websites almost 800 times.
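If you want to count this on your own server, something like the following would do it. It's a rough sketch that assumes access logs in the common "combined" format; the regex and the `count_robots_fetches` name are illustrative, not taken from any particular tool.

```python
import re
from collections import Counter

# Assumes "combined" log format:
# ... "METHOD /path HTTP/x.y" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_robots_fetches(lines):
    """Count requests for /robots.txt, grouped by User-Agent string."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        path = match.group("path").split("?", 1)[0]  # ignore any query string
        if path == "/robots.txt":
            counts[match.group("agent")] += 1
    return counts
```

Feed it an open log file (it iterates line by line) and call `.most_common()` on the result to see who is hammering the file.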

Yes, it's really Apple, not someone faking the user-agent identifier. It's coming from the networks that Apple says can be used to identify Applebot access. DNS matches, everything.
e.g. https://support.apple.com/en-ca/119829
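The "DNS matches" check is the usual forward-confirmed reverse DNS dance. A minimal sketch, assuming the crawler's hostnames end in `.applebot.apple.com` (check that suffix against Apple's page above; I'm treating it as an assumption here). The function names are made up for this example, and the resolvers are parameters so the logic can be exercised without touching the network.

```python
import socket

# Assumed hostname suffix; verify against Apple's documentation.
APPLEBOT_SUFFIX = ".applebot.apple.com"


def reverse_lookup(ip: str) -> str:
    """PTR lookup: IP address -> hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> set:
    """A/AAAA lookup: hostname -> set of IP address strings."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def is_verified_crawler(ip, suffix=APPLEBOT_SUFFIX,
                        reverse=reverse_lookup, forward=forward_lookup) -> bool:
    """True if ip reverse-resolves under suffix AND that name resolves back to ip."""
    try:
        host = reverse(ip)
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return False
    if not host.lower().rstrip(".").endswith(suffix):
        return False
    try:
        return ip in forward(host)
    except OSError:
        return False
```

The forward step matters: anyone who controls the reverse zone for their own IP range can make a PTR record say whatever they like.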

So: legendary Apple software quality. Documented to do the right thing, but actually doing the wrong thing. And completely failing to cache content, fetching the same file 800 times a day when it hasn't changed in years.
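For the record, the caching half of this is not exotic. Here's a minimal sketch of a time-based cache for robots.txt; the one-day lifetime is my choice for illustration, `RobotsCache` is a made-up name, and I obviously have no idea what Apple's crawler does internally.

```python
import time


class RobotsCache:
    """Keep each fetched robots.txt body for ttl_seconds before refetching."""

    def __init__(self, fetch, ttl_seconds=24 * 60 * 60, clock=time.monotonic):
        self._fetch = fetch      # callable: url -> robots.txt body (str)
        self._ttl = ttl_seconds  # illustrative default of one day
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # url -> (time fetched, body)

    def get(self, url):
        now = self._clock()
        cached = self._entries.get(url)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]     # still fresh: no network request
        body = self._fetch(url)  # missing or expired: fetch exactly once
        self._entries[url] = (now, body)
        return body
```

With this in front of the fetcher, 800 lookups in a day cost one request instead of 800.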

Hey, Apple! Need a software engineer who's actually, you know, good at it? I'm available.

#Apple #AppleInc #TimApple #WebCrawler #RobotsTxt #quality #WeveHeardOfIt #qwality #AppleQwality #legendary #TwoHardThings #caching #fail #engineer #software #SoftwareEngineer

Does anyone know if the `zm` in the parent post is present in "base" (or something else fairly normal)? It turns out that using something like that is probably the "correct" approach. Changing the types of `Algebra` or `Ran` doesn't really make sense.

If not, what the hell should I name it? #TwoHardThings

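At today's reorg Q&A, they said the hardest part is naming the new team (well, when you cram completely unrelated teams together, of course it's hard). But please, nobody let their cache get invalidated just because the name changed; keep doing whatever you were doing. #TwoHardThings
A nice supplement to a classic meme.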

#twohardthings https://social.heldscal.la/attachment/767347