New post from Joe Mullin, EFF's Sr. Policy Analyst: Publishers say they’re blocking the Internet Archive because they fear AI scraping. This is a misguided notion: it won’t stop AI, but it will erase the web’s historical record.

@eff has the story ⤵️
https://www.eff.org/deeplinks/2026/03/blocking-internet-archive-wont-stop-ai-it-will-erase-webs-historical-record

Blocking the Internet Archive Won’t Stop AI, But It Will Erase the Web’s Historical Record

Imagine a newspaper publisher announcing it will no longer allow libraries to keep copies of its paper. That’s effectively what’s begun happening online in the last few months. The Internet Archive—the world’s largest digital library—has preserved newspapers since it went online in the mid-1990s....

Electronic Frontier Foundation

@internetarchive That’s complete and absolute BULLSHIT.*

IA is a major content feeder to website scrapers.

* Apologies for the profanity, but strong language is called for in this situation.

@artandtechnic So what do you call BS on? That blocking @internetarchive crawlers won't stop "AI"? Or that it *will* prevent preservation of websites?

I wouldn't be surprised if "AI" companies scrape freely available data on archive.org, but they also hammer the rest of the open web like a constant DDoS attack. Blocking the IA will hurt illicit "AI" scrapers only marginally, while doing far more harm to today's open culture and to future historians.

@fanden @internetarchive Your question is answered by the second sentence in the post you are replying to. Thanks for asking!
@artandtechnic It really isn't, no. But I suspected you didn't have more than empty claims. Bye bye. @internetarchive
@internetarchive So start scanning the physical newspaper every day?

@internetarchive @eff
Most publishers never liked the archive, so I suspect they're just using this as a convenient excuse?

Besides, the digital lending feature requires an account, which AI wouldn't have access to anyway.

@internetarchive @eff They should know that AI is unavoidable at this point.
@internetarchive @eff It’s pretty obvious they’re going to publish stuff that doesn’t bear the touch of daylight.
@internetarchive @eff
There is possible room for compromise. If the publications are worried about losing readers to the Internet Archive, simply agree to embargo new content for a set period like 6 months.
@internetarchive @eff We need the Internet Archive, so yes, such a policy is wrong.