This is pretty wild — our friends at "Read The Docs" saw file download traffic drop by 75% after blocking AI bots.
https://about.readthedocs.com/blog/2024/07/ai-crawlers-abuse/
@stefan @paulshryock Not sure about this modern technique of scaling up endlessly. I've always been pretty happy with "when I get effectively DDoSed, I go down; I'm not paying for their abuse."
I think the web would really be better overall if more people did that, possibly with "yeah, the AI bots are trying to screw us again" announcements :(
@stefan I've overridden the robots.txt at my nginx load balancer for over 300 websites.
Within the nginx server block I now have this:
location = /robots.txt {
    add_header Content-Type text/plain;
    return 200 "User-agent: *\nDisallow: /\n";
}
I'm simply returning a disallow to all bots now... and backing that up with about a hundred block rules implemented against ASNs, user agents, specific IPs, and rate limits across the board.
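A minimal sketch of the user-agent blocking described above, assuming nginx's map module; the bot names here are illustrative examples, not the poster's actual list:

```nginx
# At http level: flag known AI-crawler user agents
# (names are examples, not an exhaustive or authoritative list).
map $http_user_agent $is_ai_bot {
    default        0;
    ~*GPTBot       1;
    ~*CCBot        1;
    ~*ClaudeBot    1;
}

server {
    # ... existing server config ...

    # Refuse flagged crawlers outright; a bare 403 keeps the
    # response cheap compared to serving real content.
    if ($is_ai_bot) {
        return 403;
    }
}
```

The map block must sit at the http level; the if/return goes inside each server (or location) you want protected.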
I started redirecting #AI crawlers away based on their user agents, and the 307s are now the bulk of my traffic. I sure do love the AI revolution.
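A 307 redirect by user agent could be sketched in nginx like this; the crawler names and the redirect target are assumptions for illustration, not the poster's actual config:

```nginx
# At http level: flag AI crawlers by user agent (illustrative names).
map $http_user_agent $ai_redirect {
    default      0;
    ~*GPTBot     1;
    ~*Bytespider 1;
}

server {
    # Send flagged crawlers a 307 (temporary redirect) to something
    # cheap to serve, instead of real pages and downloads.
    if ($ai_redirect) {
        return 307 https://example.com/no-ai.txt;
    }
}
```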
@stefan
Cumulative traffic for cromwell-intl.com plus toilet-guru.com running on one FreeBSD host in the Google Cloud has been averaging a little over $1/day in outbound traffic over the past year. On July 2, I added some commonly recommended AI-bot-blocking to robots.txt. A week later, traffic had dropped to a little under 50% of what it had been.
Blue (top of each graph) = traffic, Americas -> Americas
Yellow (#3 in each) = traffic, Americas -> EMEA
Purple (bottom) = offset for the free CPU/RAM tier
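The "commonly recommended AI-bot-blocking" robots.txt additions mentioned above typically look something like this; the agent tokens shown are real, published crawler names, but the poster's exact list isn't shown:

```
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```

Note this only deters crawlers that honor robots.txt, which matches the observed result: traffic dropped by half rather than to zero.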