Oh, so now scraping data without permission is bad for AI training? 😂 how ironic 😉

Anthropic accuses Alibaba of using thousands of fraudulent accounts to extract Claude AI model capabilities and data. Anthropic urged Congress to penalise the companies behind scraping attacks like this and to ramp up measures to prevent US tech from being stolen. https://www.bbc.com/news/articles/cwyklykn5dwo
How about Anthropic pay first for stolen books, and all content out there for its shitty AI?

Anthropic accuses Chinese rival Alibaba of illicitly extracting AI capabilities

The firm alleged that Alibaba used fraudulent accounts to access data from its Claude AI model.

@nixCraft In my experience it's not enough to merely #block said #AIscrapers; it's literally necessary to fight back by sending them *malicious data* with *EVERY REQUEST* whilst rate limiting to a crawl to combat their literal DDoS attacks!

(max. 1 connection at 75 bit/s per IP & request, max. 1 request per IP, 120s crawl-delay enforced, redirecting them to EICAR "Malware" every time they violate said limits, and blackholing at upstream / IX level)
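
A minimal Python sketch of the "max. 1 request per IP, 120s crawl-delay" part of that policy. The class name, the `check` method and the `"serve"` / `"penalize"` return values are hypothetical names for illustration, not taken from any real server setup; bandwidth throttling and upstream blackholing are out of scope here.

```python
import time


class CrawlDelayLimiter:
    """Allow at most one request per client IP per `delay` seconds."""

    def __init__(self, delay=120.0):
        self.delay = delay
        self.last_seen = {}  # ip -> timestamp of the last allowed request

    def check(self, ip, now=None):
        """Return "serve" if the request respects the delay, else "penalize"."""
        now = time.monotonic() if now is None else now
        last = self.last_seen.get(ip)
        if last is not None and now - last < self.delay:
            # Hypothetical hook: this is where a redirect, tarpit or block would go.
            return "penalize"
        self.last_seen[ip] = now
        return "serve"
```

Note that the dictionary grows with every new IP, so a real deployment would need to expire old entries.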

@[email protected] @[email protected] you can't enforce that policy.
you don't know which IP is a bot and which is a user.
and they only make one request per hour... they just have millions of IPs (proxied through Android apps or something like that).
but for cloud IPs, yes, it's a good policy.

@oldsysops @nixCraft WATCH ME!
Cuz this is my daily doing!
- First of all, I'd block all non-Consumer Networks (i.e. entire ASNs associated with said #AIslop corpos - like #aws & #Azure!)
- Then I do check for #UserAgent and block known #bots like #ByteSpider and dynamically block entire IP allocations (minimum /24) as they get used.
- Whenever I can, I geoblock places that are notorious for #Cybercrime (Russia, "P.R." China, USA)
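
The User-Agent check and the "block the whole /24 as it gets used" step from the list above could be sketched like this in Python. It assumes IPv4 addresses only; `BLOCKED_AGENT_MARKERS`, `covering_slash24` and `should_block` are made-up names for illustration, and the ASN and geo blocking steps are left out because they need external data.

```python
import ipaddress

# "Bytespider" is the bot named in the post; extend the tuple as needed.
BLOCKED_AGENT_MARKERS = ("bytespider",)

blocked_networks = set()  # /24 networks blocked so far


def covering_slash24(ip):
    """Return the /24 network containing an IPv4 address."""
    return ipaddress.ip_network(f"{ip}/24", strict=False)


def should_block(ip, user_agent):
    """Block if the IP is in a blocked /24 or the User-Agent names a known bot."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in blocked_networks):
        return True
    if any(marker in user_agent.lower() for marker in BLOCKED_AGENT_MARKERS):
        blocked_networks.add(covering_slash24(ip))  # block the whole range
        return True
    return False
```

Once one address announces itself as a known bot, every later request from the same /24 is rejected regardless of its User-Agent, which is also where the false positives @oldsysops worries about would come from.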

https://www.youtube.com/watch?v=Hi5sd3WEh0c

The creators of TikTok caused my website to shut down

@oldsysops @nixCraft I already worked with others to get #KiwiFarms banned off #ClownFlare, and if I can get a known #RogueISP (that AFAICS only Cybercriminals use!) to fire a client, then I can get #ISP|s to go after "#ResidentialProxy" setups by #AIslop firms for violating their #ToS and creating a shitton of expensive traffic…
- The best way to get corpos to move their asses, is to make something their [expensive!] problem!

Don't forget to automate abuse reporting!
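
One possible starting point for that automation, as a rough Python sketch: group logged violations by /24 and produce a plain-text summary. The function name, the `(ip, timestamp)` event format and the report wording are all assumptions; finding the right abuse contact and actually sending the report are not covered, and IPv4 is assumed.

```python
import ipaddress
from collections import Counter


def build_abuse_report(events):
    """Summarise (ip, timestamp) violation events per /24 as plain text."""
    per_network = Counter(
        ipaddress.ip_network(f"{ip}/24", strict=False) for ip, _ in events
    )
    lines = ["Abuse report: requests violating our crawl limits", ""]
    for net, count in sorted(per_network.items()):
        lines.append(f"{net}: {count} violating request(s)")
    return "\n".join(lines)
```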

#CloudFlare