The New York Times (#NYT) has started blocking the #InternetArchive's web crawlers, going beyond standard robots.txt rules to cut off access entirely. The #Guardian is doing the same. The stated reason: fear of AI companies scraping their content. But the Internet Archive isn't an #AI company; it's a library, and it has been one for nearly 30 years. #IA
#iloveinternetarchive

Blocking the Internet Archive Won’t Stop AI, But It Will Erase the Web’s Historical Record
Imagine a newspaper publisher announcing it will no longer allow libraries to keep copies of its paper. That’s effectively what’s begun happening online in the last few months. The Internet Archive—the world’s largest digital library—has preserved newspapers since it went online in the mid-1990s....


