- and let's not forget: Besides all the content theft #AI stands for, people also pay for traffic, which means that random website owners are paying through the fucking nose for traffic to their sites that's just AI crawlers.

I've seen logs for websites showing traffic dropping by 70, 80, even 90% just from blocking those crawlers; these are non-trivial expenses.

Or, in other words, these companies aren't just stealing people's content - they're stealing their actual money.

https://indieweb.social/@dealingwith/113731892010994341

dealingwith (@dealingwith@indieweb.social)

"This is literally a DDoS on the entire internet." "I am so tired." https://pod.geraspora.de/posts/17342163


@jwcph

Do #AI crawlers respect robots.txt?
If not, how do you block them?

@jtgd I don't know - and I understand why you didn't just google it; the search engines are almost certainly downranking that kind of result... I know people are sharing how-tos on this, but I honestly have no idea how to find them 😬

@jwcph

What a concept! Search for it!

Thanks, I did, and it seems robots.txt is supposed to block them, assuming they honor it. My robots.txt already blocks everything for everybody, so that's fine, but I should keep an eye on who's connecting to me.
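For the "keep an eye on who's connecting" part, here is a rough sketch of tallying crawler hits from an access log. It assumes the log uses the common "combined" format, where the user agent is the last double-quoted field on each line. The function name `count_ai_crawler_hits` and the token list are made up for illustration; `GPTBot` and `CCBot` are two crawler names that show up in user-agent strings, but the list is surely incomplete and would need extending.

```python
import re
from collections import Counter

# Illustrative, incomplete list of crawler tokens to look for (an assumption;
# extend it with whatever shows up in your own logs).
AI_CRAWLER_TOKENS = ("GPTBot", "CCBot")

# Assumes the combined log format: the user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_ai_crawler_hits(log_lines, tokens=AI_CRAWLER_TOKENS):
    """Count requests whose user agent contains one of the given tokens."""
    hits = Counter()
    for line in log_lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # line does not look like a combined-format entry
        user_agent = match.group(1).lower()
        for token in tokens:
            if token.lower() in user_agent:
                hits[token] += 1
    return hits
```

Something like `count_ai_crawler_hits(open("access.log"))` would then give a per-crawler tally, where `access.log` stands in for wherever the server actually writes its log. This only catches crawlers that identify themselves honestly in the user-agent header.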

https://www.cyberciti.biz/web-developer/block-openai-bard-bing-ai-crawler-bots-using-robots-txt-file/

#AI #crawler

How to block AI Crawler Bots using robots.txt file - nixCraft

Here is how to block generative AI (OpenAI ChatGPT, Google Bard, CCBot Crawler bots) using robots.txt to protect your content.
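As a quick sanity check on a robots.txt like the one that article describes, Python's standard `urllib.robotparser` can report which user agents a given file turns away. This is only a sketch: the `ROBOTS_TXT` contents and the helper name `blocked_agents` are hypothetical examples, not taken from the linked article, and it tests what the file says rather than whether any crawler actually obeys it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that turns away two crawlers and allows the rest.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def blocked_agents(robots_txt, agents, url="/"):
    """Return the agents that this robots.txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


print(blocked_agents(ROBOTS_TXT, ["GPTBot", "CCBot", "Mozilla"]))
# → ['GPTBot', 'CCBot']
```

A blanket file (`User-agent: *` followed by `Disallow: /`) would list every agent as blocked, which matches the "blocks everything for everybody" setup mentioned in the thread.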
