Among the many things Doctorow gets wrong in That Post is this:

"It's not 'unethical' to scrape the web in order to create and analyze data-sets. That's just 'a search engine.'"

Apart from the fact that AI companies are particularly malicious in the way they scrape the web, I'd say we accept search engine scraping mostly on the premise that it's done for the benefit of the scraped sites. There's no such principle of mutual benefit in AI scraping: the AI company gets the value of the scraped data and you get bupkis at best, and possibly DDoS'd.

@lrhodes

"I'd say we accept search engine scraping mostly on the premise that it's done for the benefit of the scraped sites"

I would qualify this somewhat by pointing out how, independent of AI, this acceptance ultimately led to Google benefiting from scraping websites at the latter's expense. The value proposition of Google indexing your site is that it draws more visitors to your site who may not have known about it otherwise.

@lrhodes This isn't an absolute good - some sites might not want that kind of attention - but it's easy to see why it might appeal to a large number of people. Once Google starts selling ads, though, that value proposition tilts against them; those websites become competitors for ad impressions (clicking through a given result means users spend more time there and less on Google).

@lrhodes Fortunately for Google, they already had an effective monopoly on the search engine business by this point, so it was easy to scrape those sites for data to power the features that kept users on Google.