I now have a multi-tiered approach to blocking AI bots on my infrastructure:

1) robots.txt - Ha, they don't fucking care.
2) iocaine -> https://iocaine.madhouse-project.org/ (poisons the bot with never ending HTTP content)
3) HTTP 426 for any HTTP/1* requests (tells legit browsers to upgrade to HTTP/2+)
4) Anubis -> https://anubis.techaro.lol/ (requires javascript proof-of-work)
5) Injecting kill strings as HTTP headers

Next layer is going to be prompt injection attacks into every resource served via comments in all the documents.

This is war.

#fuck_ai #fuck_with_ai #ai

iocaine - the deadliest poison known to AI

@reyjrar 6) Block offenders on a network/ASN level.
@simondassow @reyjrar Sadly, won't work, as they've been coming from an insane amount of domestic IPs. There's probably some malware selling infected machines as proxies or similar.
@ainmosni @reyjrar ASNs limit this to selected offenders. But of course, when legitimate users hide in those, they'll be cut off as well.
@simondassow @reyjrar You got it backwards though, legitimate users aren’t hiding in there, the scrapers are. They’ve been hiding amongst legitimate users for quite a while now, which is why we need things like anubis to begin with.  If they used their own, predicable IP ranges, we could just block those or the ASNs they belonged to.
@ainmosni @reyjrar Even so, it's still a valid form of punishment addressed at the source I think. They slop, we stop.
@simondassow @reyjrar While I don't disagree, but if it also blocks a more than significant amount of legitimate innocent users, it becomes more complicated.