Mastodawn

I now have a multi-tiered approach to blocking AI bots on my infrastructure:

1) robots.txt - Ha, they don't fucking care.
2) iocaine -> https://iocaine.madhouse-project.org/ (poisons the bot with never ending HTTP content)
3) HTTP 426 for any HTTP/1* requests (tells legit browsers to upgrade to HTTP/2+)
4) Anubis -> https://anubis.techaro.lol/ (requires javascript proof-of-work)
5) Injecting kill strings as HTTP headers

Next layer is going to be prompt injection attacks into every resource served via comments in all the documents.

This is war.

#fuck_ai #fuck_with_ai #ai

iocaine - the deadliest poison known to AI

Show thread

Simon Dassow Mar 15

@reyjrar 6) Block offenders on a network/ASN level.

Show thread

Daniël Franke

@simondassow @reyjrar Sadly, won't work, as they've been coming from an insane amount of domestic IPs. There's probably some malware selling infected machines as proxies or similar.

Show thread

Simon Dassow Mar 15

@ainmosni @reyjrar ASNs limit this to selected offenders. But of course, when legitimate users hide in those, they'll be cut off as well.

Show thread

Daniël Franke

Mar 15

@simondassow @reyjrar You got it backwards though, legitimate users aren’t hiding in there, the scrapers are. They’ve been hiding amongst legitimate users for quite a while now, which is why we need things like anubis to begin with. If they used their own, predicable IP ranges, we could just block those or the ASNs they belonged to.

Show thread

Simon Dassow Mar 15

@ainmosni @reyjrar Even so, it's still a valid form of punishment addressed at the source I think. They slop, we stop.

Show thread

Daniël Franke

Mar 15

@simondassow @reyjrar While I don't disagree, but if it also blocks a more than significant amount of legitimate innocent users, it becomes more complicated.