RE: https://tldr.nettime.org/@tante/116605858023186072

Google Search rests on a social contract: their bots can crawl our sites, they can index our sites, and they can show excerpts of our sites because

and •only because•

they send people to our sites. •Our• sites, our words, with our design, with our links, with our context and our aesthetics, shared the way we want to share them.

Google is announcing — unambiguously and with great fanfare — that they are now fully breaking that already-ragged contract. We should reciprocate.

1/2

Paul Cantrell 1d ago

Quick strategy discussion, for those who understand Google indexing and SEO:

If I want to yank a web site out of Google’s now-fully-extractive search, should I (1) disallow googlebot in robots.txt or (2) add `<meta name="googlebot" content="noindex">` to all the page headers?

The goal here is not just to remove my contributions to the commons from Google’s results, but to •make Google aware• that sites are pulling consent. What will best do that?

2/2

Paul Cantrell 1d ago

Same question as the previous post, except for Wikipedia. What would you like to see them do to send a shot across the bow?

Or…well, it’s Wikipedia. Maybe more like a shot to the hull.

3/2

Paul Cantrell 1d ago

Going with meta noindex for now. My thinking is that this actively tells Google to yank already-crawled content from their index, whereas they might take a robots.txt entry to mean “do not update, but keep showing last fetched.”
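
For anyone weighing the same two options, here is a minimal Python sketch of what each one looks like in practice. It uses the standard library's `urllib.robotparser` to check whether a given robots.txt blocks Googlebot, plus a small helper that inserts the meta tag quoted above. The names `googlebot_may_crawl` and `add_noindex`, the `example.com` URL, and the sample page are all hypothetical, and the helper assumes the page contains a literal `<head>` tag. This only models the rules as written; it says nothing about how Google actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Option 1: a robots.txt that disallows Googlebot everywhere.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""

# Option 2: the meta tag quoted in the thread.
NOINDEX_TAG = '<meta name="googlebot" content="noindex">'


def googlebot_may_crawl(robots_txt: str, url: str) -> bool:
    """Return True if these robots.txt rules let Googlebot fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


def add_noindex(html: str) -> str:
    """Insert the noindex tag after a literal <head> tag (hypothetical helper)."""
    if NOINDEX_TAG in html:
        return html  # already tagged, leave the page alone
    return html.replace("<head>", "<head>\n" + NOINDEX_TAG, 1)


if __name__ == "__main__":
    page = "<html><head><title>Post</title></head><body>Hi</body></html>"
    print(googlebot_may_crawl(ROBOTS_TXT, "https://example.com/post"))
    print(add_noindex(page))
```

One caveat, as I understand Google's documentation: a crawler has to be able to fetch a page to see its noindex tag, so applying both options to the same pages would likely work against the goal.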

Paul Cantrell 1d ago

OK, a •lot• of replies need this response:

Yes, of •course• they will start ignoring robots.txt etc as soon as they think it hurts their business. Of course.

It is important to •force that fight•, rather than just capitulating in advance.

Jeff

@inthehands in forcing that fight, Google is going to find that the rest of the internet already has sophisticated tools for this fight. My Anubis config should already be blocking Google.