Turns out that AI bot protection isn't even the big problem anymore. Residential proxies are, and the really aggressive scraping is hard to detect and block. A bit disappointing, then, that @cloudflare doesn't seem to have much in store to help against those. An increasing number of them seem to get past the Managed Challenge easily, too.

And kind of ironic that, with most "normal" ISPs renting out IP blocks to web scraping companies (residential proxies), VPNs and private relays/anonymizers have become more trustworthy traffic sources for website owners. 🤷🏻‍♂️

Either way, "AI" and scraping are the death of the open internet and the free flow of information, because they're clogging all the pipes.

#Webmaster #Firewall #Scraping #AntiWebScraping #AI

The Register: Publishers say no to AI scrapers, block bots at server level. “Online traffic analysis conducted by BuiltWith, a web metrics biz, indicates that the number of publishers trying to prevent AI bots from scraping content for use in model training has surged since July. About 5.6 million websites presently have added OpenAI’s GPTBot to the disallow list in their robots.txt file, up […]

https://rbfirehose.com/2025/12/11/the-register-publishers-say-no-to-ai-scrapers-block-bots-at-server-level/
