#News publishers limit #InternetArchive access due to #AI scraping concerns | #NiemanJournalismLab

As part of its mission to preserve the web, the Internet #Archive operates #crawlers that capture webpage #snapshots. Many of these are accessible through its public-facing tool, the #WaybackMachine. But as AI #bots scavenge the web for training data to feed their models, the Internet Archive’s commitment to free information access has turned its digital library into a …

https://www.niemanlab.org/2026/01/news-publishers-limit-internet-archive-access-due-to-ai-scraping-concerns/

Outlets like The Guardian and The New York Times are scrutinizing digital archives as potential backdoors for AI crawlers.

Nieman Lab