An update on stopping AI scrapers from DDOSing rbxlegacy.wiki because I forgot to earlier:
I got close but could never properly figure out Anubis, nor did I need to.
Instead I followed the advice in @Xkeeper's blog post on how she protected TCRF: go through the IP logs and block the harmful stuff. So I used Cloudflare's Security Rules to block some ASNs and user agents, and I also set a few countries to show a Cloudflare challenge screen.
A few days later, it turned out my VPS was still sometimes freezing at 100% CPU!
When I went to check, one IP had been responsible for 30,000 requests in a single day, and 2 other IPs had thousands each. All that for static documentation that has been mostly the same since 2012.
They had the user agent of Scrapy/2.11.2 (+https://scrapy.org)
and came from Google Cloud Platform, ASN 396982. I didn't block the ASN, but I blocked the specific IPs and the user agent, and so far I haven't noticed any more downtime.
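For anyone wanting to do the same kind of log check, here is a rough Python sketch that counts requests per IP and user agent. It assumes the access log is in the common "combined" format used by nginx and Apache; I'm not claiming that's the exact setup here, and the function name and log path are just placeholders.

```python
import re
import sys
from collections import Counter

# Assumed "combined" access log format (typical for nginx/Apache).
# Adjust the pattern if your server logs differently.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def top_clients(lines, n=10):
    """Return the n busiest (ip, user_agent) pairs with their request counts."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:  # silently skip lines that don't match the assumed format
            counts[(m.group("ip"), m.group("ua"))] += 1
    return counts.most_common(n)

# Hypothetical usage: python top_clients.py /var/log/nginx/access.log
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        for (ip, ua), hits in top_clients(f):
            print(f"{hits:>8}  {ip}  {ua}")
```

Anything with tens of thousands of hits in one day's log is an obvious candidate for an IP or user agent block.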
Now Cloudflare is blocking about half of the daily requests. I suppose 99% of the rest are still bots, but they don't seem to do as much inadvertent DDOSing.
The blog post for anyone interested: https://blog.xkeeper.net/uncategorized/tcrf-has-been-getting-ddosed/