"We have noticed an increase in abusive site crawling, mainly from AI products and services. These products are recklessly crawling many sites across the web, and we've already had to block several sources of abusive traffic."

"One crawler downloaded 73 TB of zipped HTML files in May 2024, with almost 10 TB in a single day."

"By blocking these crawlers, bandwidth for our downloaded files has decreased by 75%"

https://about.readthedocs.com/blog/2024/07/ai-crawlers-abuse/

via @stefan

AI crawlers need to be more respectful

We talk a bit about the AI crawler abuse we are seeing at Read the Docs, and warn that this behavior is not sustainable.

Read the Docs

@ethanwhite @stefan

"One crawler downloaded 73 TB of zipped HTML files in May 2024, with almost 10 TB in a single day. This cost us over $5,000 in bandwidth charges, and we had to block the crawler. We emailed this company, reporting a bug in their crawler"

Well, their AI is shit then, if they can't even write a crawler properly...