Mastodawn

Tea Mar 21, 2025

Cloudflare announces AI Labyrinth, which uses AI-generated content to confuse and waste the resources of AI Crawlers and bots that ignore “no crawl” directives.

https://programming.dev/post/27302847

lily33

“…while allowing legitimate users and verified crawlers to browse normally.”

What is a “verified crawler”, though? My worry is that only big companies like Google will be allowed to have them now.

wingiee Mar 21, 2025

I assume a crawler which adheres to robots.txt
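For reference, this is roughly what "adheres to robots.txt" looks like on the crawler side. It is a minimal sketch using Python's standard `urllib.robotparser`; the bot name `ExampleBot`, the rules, and the URLs are made up for illustration, and nothing in the thread confirms this is what Cloudflare means by "verified".

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ExampleBot is kept out of /private/,
# every other crawler may fetch everything.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow:
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("ExampleBot", "https://example.com/private/page"))
    print(allowed("ExampleBot", "https://example.com/public"))
```

A well-behaved crawler runs a check like this before every request. Compliance is voluntary, so the site cannot tell from robots.txt alone who is actually honoring it.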

lily33 Mar 21, 2025

I would love to think so. But the word “verified” suggests more.

killeronthecorner Mar 21, 2025

IP verification is a fairly common method for commercial crawlers.

Googlebot and Other Google Crawler Verification | Google Search Central | Documentation | Google for Developers

You can check if a web crawler really is Googlebot (or another Google user agent). Follow these steps to verify that Googlebot is the crawler.

Google for Developers
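The check described in that documentation is a reverse DNS lookup on the requesting IP, a check that the hostname belongs to a Google domain, and then a forward lookup to confirm the hostname resolves back to the same IP. Here is a rough sketch of that idea. The function name `is_verified_crawler` and the injectable `reverse`/`forward` parameters are my own, added so it can be tried without network access, and `gethostbyname_ex` only covers IPv4.

```python
import socket

# Domains Google's documentation lists for its crawlers' reverse DNS names.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def is_verified_crawler(ip, suffixes=GOOGLE_SUFFIXES,
                        reverse=socket.gethostbyaddr,
                        forward=socket.gethostbyname_ex):
    """Reverse-then-forward DNS check for a crawler's IPv4 address."""
    try:
        hostname = reverse(ip)[0]           # step 1: IP -> hostname
    except OSError:                         # covers socket.herror / gaierror
        return False
    # Step 2: the hostname must end in one of the expected domains.
    if not hostname.lower().rstrip(".").endswith(suffixes):
        return False
    try:
        addresses = forward(hostname)[2]    # step 3: hostname -> IPv4 list
    except OSError:
        return False
    # Step 4: the forward lookup must include the original IP.
    return ip in addresses
```

The forward lookup matters because anyone who controls the reverse DNS zone for their own IP range can make it claim to be a Google hostname; only the domain owner can make the forward record point back.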

melpomenesclevage Mar 21, 2025

I dunno. I don’t find any sympathy with any of these fuckers though. this is not a generally useful technology, it is not something the average person ever needs to see, and honestly, just fuck em. Fuck anyone messing with open source to engorge the garbage dispenser.

lily33 Mar 21, 2025

Any accessibility service will also see the “hidden links”, and while a blind person with a screen reader will notice if they wander off into generated pages, it will waste their time too.

Also, I don’t know about you, but I absolutely have a use for crawling X, Google maps, Reddit, YouTube, and getting information from there without interacting with the service myself.

melpomenesclevage Mar 21, 2025

yeah. it’s pretty fucked. hopefully it’s temporary.

so do we make everything inaccessible to everyone, or just inaccessible to disabled people? we don’t have a way to include them yet. we should work on it, but we are not the ones who fucked accessibility.

yeah. search engine web crawlers are a public service. they are responsible. but we are in a conflict. we must struggle tooth and nail against capital for every nice thing.

killeronthecorner Mar 21, 2025

I’d assume they’re using ARIA attributes to hide the links from screen readers; at least that’s what the article seems to imply.

fuckwit_mcbumcrumble Mar 23, 2025

Cloudflare isn’t the best at blocking things. As long as your crawler isn’t horribly misconfigured, you shouldn’t have many issues.