Been fighting off slopbots trying to bring my site down since last Wednesday and man if you thought I was pissed off at AI before, hoo boy, wow
Two o'clock in the god damn eh emm in the terminal watching logfiles scroll up the screen in my terminal, study looked like a bad hackers film, people are not supposed to do this, I should've been asleep, all throughout the next day the sun was shining outside and I should've been doing human shit like making some kinda little structure out of mud for birds while wearing a couple of leaves and instead I'm indoors trying to stop anthropic and facebook and google from bringing down my website, we were put here to be the gardeners of the world but instead I'm trying to outwit nonsense machines made by nonsense people who also should be rolling naked in the fresh spring grass

You know what, you know what the best part of all this is, I read this

https://xaselgio.net/posts/26.poisoning-knowledge/

and I went wow, that sounds like a pain in the ass, good luck with that.

...

I HAVE A WEBSITE

I literally read that whole thing and did I take any preemptive measures, did I have a trap set up for the bots, did I even have a .htaccess that would at least block some of them, did I bollocks

You know when you see nature documentaries and the lion takes down one of the antelopes and all its mates just stand around watching it get eaten like four feet away, going wow, that looks like a pain in the ass, good luck with that

THAT'S ME, THAT IS.

Oh and you know how like eighty percent of y'all have your own websites, hey guys the lion's eating me maybe you might wanna do something about it because it'll probably have you for dessert

Antelopes watching their mate get et and going "Wow, look at that," that's you that is

Poisoning the knowledge

A rant about current state of the internet (LLM crawlers), and some observations & conclusions, along with some techniques to help you protect your own services.

Indigo's den

See everyone going "Wow my websites keep going offline because of AI bots"

Me: oh no that's awful I'm so sorry that's happening to 🌠⭐🎢yooooouuuuuu and not meeeee,🎵💖💋

My own website goes offline because of AI bots: me doing surprised Pikachu face

Ignore the guy half inside the lion pointing at you and dooming "It'll happen to YOUUUU," it won't happen to you, you're special 😌

Me, just a head sticking out of the lion's mouth: 🦝 It happened to me WAY sooner than I thought it would!

You, eyes inches away watching me get et: 🐰 Yeah, I see that! Wow!

🦝 Like it happened REALLY suddenly, I wish I'd prepared

🐰 Yeah, that would've been smart, huh!

🦝 Have you done anything to get ready for this thing that's definitely gonna happen to you?

🐰 I'll get right on that!

🦝 *muffled voice from 🦁's belly* Yeah, like, soon!

🐰 It's on the list!

🦝 Like, now maybe!

🐰 Definitely before the end of the month! Wow, that's rough buddy

🦝 yeah wow

🦁 buuurrrrp

Nah what am I worried about, you're all smarter than me 😁

We're all the main character in our own stories but have you considered that there are many genres of story

Mine is apparently A Cautionary Tale

🐰
🦝
🐰 so um
🦝
🐰 hi
🦝 Yeah. Hi.
🐰
🐰 you'll never guess what happened to me
🦝 I bet I can
🐰 yeah
🦝
🐰
🐰 man, it SUCKS in here
🐰 you got any advice for me
🦝 that thing you were thinking of doing,
🦝 do it last week

@ifixcoinops

I am a slacker and I tail my log, look to see what IP addresses are hitting me, then look up what ASN range that's from and feed it into fail2ban for a while.

One of my sites they like scraping is a moin site.

Moin has a MonthCalendar macro where you can click to a different day, and then to a different day, and I've decided any address indexing the calendar page after 2100 or before 1950 should be blocked.

I just need to write a fail2ban rule for that.
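
Until that rule exists, the same check can be sketched as a small Python log scan. This is only a rough sketch under assumptions that are not from the post: the access log is in the common/combined format with the client IP first, and the calendar day pages have URLs ending in /YYYY-MM-DD. The function name and both regexes are made up for illustration; adjust them to whatever the wiki's URLs really look like.

```python
import re

# Assumption: common/combined log format, client IP is the first field
# and the request line is the first quoted field.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD|POST) (?P<path>\S+)'
)

# Assumption: calendar day pages end in /YYYY-MM-DD (hypothetical URL shape).
CAL_DATE = re.compile(r'/(?P<year>\d{4})-\d{2}-\d{2}(?:[?#]|$)')


def out_of_range_calendar_ips(lines, min_year=1950, max_year=2100):
    """Return the set of client IPs that requested a calendar page
    dated before min_year or after max_year."""
    offenders = set()
    for line in lines:
        request = LOG_LINE.match(line)
        if request is None:
            continue  # not a line in the assumed format
        cal = CAL_DATE.search(request.group("path"))
        if cal is None:
            continue  # not a calendar day page
        year = int(cal.group("year"))
        if not (min_year <= year <= max_year):
            offenders.add(request.group("ip"))
    return offenders


# Usage sketch (the log path is an assumption):
# with open("/var/log/apache2/access.log") as log:
#     for ip in sorted(out_of_range_calendar_ips(log)):
#         print(ip)
```

The printed addresses could then be handed to fail2ban by hand or by script, for example with `fail2ban-client set <jail> banip <ip>`, where the jail name depends on your own setup.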

@alex has gotten much more automated with his blocking scripts

https://alexschroeder.ch/view/2026-01-08-shared-block-lists

@alienghic @ifixcoinops @alex the issue with block lists is it's trivial to spawn more robots (maybe not forever with IPv4 addresses but anyway)

Instead of this “captcha” bullshit where I have to tick boxes with more than 10% motorcycle only, what if we positively identified humans and had a cryptocurrency-like scheme for maintaining the database?

Call it HumanCoin. It's proof of stake. You can spend coins to browse websites, invite people to the system, or dob in someone for being a suspected robot. If a sufficient threshold is reached for a suspect ID, all their coins are burnt. The coins of whoever invited them are also burnt. A bonus is paid to the reporters (but not enough to incentivise abuse by bots).

I've thought about this for far too short a time for it to possibly work.

@ozeng @ifixcoinops @alex

The heavy handed solution is to block by IP address ranges.

For instance 216.73.216.0/22 is anthropic.

They're very aggressive, this is the count of times they're in today's access.log

8436 216.73.216.62
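
A count like that can be produced in a few lines of Python. This is a minimal sketch, assuming the client IP is the first whitespace-separated field on each log line (true for the usual Apache/nginx formats, but check your own); the function names are made up for illustration.

```python
from collections import Counter


def count_hits_per_ip(lines):
    """Count log lines per client IP, taking the first field as the IP."""
    counts = Counter()
    for line in lines:
        fields = line.split(maxsplit=1)
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


def top_talkers(lines, n=10):
    """Return the n busiest client IPs as (ip, hits) pairs, busiest first."""
    return count_hits_per_ip(lines).most_common(n)


# Usage sketch (the log path is an assumption):
# with open("/var/log/apache2/access.log") as log:
#     for ip, hits in top_talkers(log):
#         print(hits, ip)
```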

They've got a couple of IP addresses they use so off with the whole ASN.

The problem is there's a bunch of free apps or VPNs that get money by using residential IPs as proxies so you can also get vast numbers of requests spread all over a Vietnamese telco, like

14.176.135.196 is in the range 14.160.0.0/11 VNPT-VN

and was behaving suspiciously.
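
Checking whether an address falls inside a range like those is easy with Python's standard `ipaddress` module. A small sketch, with the two ranges from this post used purely as sample data: listing the /11 here only illustrates the lookup, since blocking it would also shut out that telco's ordinary residential users. The list and function names are made up for illustration.

```python
import ipaddress

# Sample ranges taken from the post above; illustrative, not a recommended block list.
SUSPECT_NETWORKS = [
    ipaddress.ip_network("216.73.216.0/22"),
    ipaddress.ip_network("14.160.0.0/11"),
]


def matching_network(ip, networks=SUSPECT_NETWORKS):
    """Return the first listed network containing ip, or None if none does."""
    addr = ipaddress.ip_address(ip)
    for net in networks:
        # Compare only like with like (IPv4 vs IPv6).
        if addr.version == net.version and addr in net:
            return net
    return None


# Usage sketch:
# print(matching_network("216.73.216.62"))
# print(matching_network("14.176.135.196"))
```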

@alienghic @ozeng @ifixcoinops my current setup is documented here: https://transjovian.org/view/fight-bots/index – some of these countermeasures are easier to implement than others, all need sysadmin skills and web server access, I think.