RE: https://tldr.nettime.org/@tante/116605858023186072

Google Search rests on a social contract: their bots can crawl our sites, they can index our sites, and they can show excerpts of our sites because

and •only because•

they send people to our sites. •Our• sites, our words, with our design, with our links, with our context and our aesthetics, shared the way we want to share them.

Google is announcing — unambiguously and with great fanfare — that they are now fully breaking that already-ragged contract. We should reciprocate.

1/2

Paul Cantrell 1d ago

Quick strategy discussion, for those who understand Google indexing and SEO:

If I want to yank a web site out of Google’s now-fully-extractive search, should I (1) disallow googlebot in robots.txt or (2) add `<meta name="googlebot" content="noindex">` to all the page headers?

The goal here is not just to remove my contributions to the commons from Google’s results, but to •make Google aware• that sites are pulling consent. What will best do that?

2/2

Paul Cantrell 1d ago

Same question as the previous post, except for Wikipedia. What would you like to see them do to send a shot across the bow?

Or…well, it’s Wikipedia. Maybe more like a shot to the hull.

3/2

Paul Cantrell 1d ago

Going with meta noindex for now. My thinking is that this actively tells Google to yank already-crawled content from their index, whereas they might take a robots.txt entry to mean “do not update, but keep showing last fetched.”
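
A minimal sketch of how one might verify this setup, using only the Python standard library. It rests on one documented behavior: Google can only see a `noindex` directive on pages its crawler is allowed to fetch, so a page that carries the meta tag should not also be disallowed for Googlebot in robots.txt. The helper names `has_googlebot_noindex` and `googlebot_may_crawl` are hypothetical, chosen for illustration, and the parsing is deliberately simple.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class NoindexFinder(HTMLParser):
    """Records whether a <meta name="googlebot"> or <meta name="robots"> tag says noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        # The content attribute may hold several comma-separated directives.
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        if name in ("googlebot", "robots") and "noindex" in directives:
            self.noindex = True


def has_googlebot_noindex(html: str) -> bool:
    """Hypothetical helper: True if the page asks Googlebot not to index it."""
    finder = NoindexFinder()
    finder.feed(html)
    return finder.noindex


def googlebot_may_crawl(robots_txt: str, url: str) -> bool:
    """Hypothetical helper: True if this robots.txt text lets Googlebot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)
```

For the noindex approach to take effect, both checks should return True for each page: the tag is present, and robots.txt does not stop Googlebot from fetching the page and reading it.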

Paul Cantrell 1d ago

OK, a •lot• of replies need this response:

Yes, of •course• they will start ignoring robots.txt etc as soon as they think it hurts their business. Of course.

It is important to •force that fight•, rather than just capitulating in advance.

Paul Cantrell 1d ago

Defeatism is a form of surrender. Cynicism is surrender. Despair is surrender. Nihilism is surrender.

Our job is to •care• and to •keep caring• and to •keep doing and keep building• and to •endure• longer than them.

Jed Brown 1d ago

@inthehands It's important to note that search indexing is considered "transformative" and thus fair use *because* it does not supplant the market for the original content. That goes out the window when the product functions to capture traffic that would otherwise go to the sites. They are acting with impunity, but existing copyright law addresses this if courts find it to be not transformative.

LovesTha🥧

@jedbrown @inthehands sure, but I'm pretty sure US law would consider ignoring robots.txt as hacking ;)