Someone is building a "global fediverse post indexer" that:

* scrapes the public APIs so it can't be blocked via defederation
* uses a bunch of dynamic IPs so it can't be banned at network level (hilariously, the author redacted this part and forgot that the edit history can be viewed by anyone)
* can be blocked by server admins via robots.txt, but they're planning to publish which instances are opting out (right now this is "open for debate")
* can be blocked by users by disabling indexing in the profile settings (!) or adding a specific hashtag to their bio (!!)
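
For anyone wondering what those opt-outs amount to on the crawler side, here's a rough sketch of the checks listed above. It's only an illustration, not their code: the user agent `ExampleFediIndexer` is made up (the thread doesn't name one), and it assumes the account JSON carries a `noindex` boolean and an HTML `note` bio, as Mastodon account objects do.

```python
import re
from urllib.robotparser import RobotFileParser

# Hypothetical user agent; the indexer's real one is not given in the thread.
BOT_UA = "ExampleFediIndexer"


def host_opted_out(robots_txt: str, path: str = "/") -> bool:
    """True if the instance's robots.txt forbids BOT_UA from fetching path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(BOT_UA, path)


def user_opted_out(account: dict) -> bool:
    """True if the profile disables indexing or has #noindex in the bio."""
    # Assumed field: the do-not-index profile setting exposed as a boolean.
    if account.get("noindex"):
        return True
    # The bio arrives as HTML, so strip tags before looking for the hashtag.
    bio = re.sub(r"<[^>]+>", "", account.get("note") or "")
    return re.search(r"(?<!\w)#noindex\b", bio, re.IGNORECASE) is not None
```

Note that both checks default to "index it" when nothing is set, which is exactly the opt-out model being criticised here.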

There's ZERO mention of opt-in, a lot of pushback against anyone who dares to call this thing a scraper ("we're using public APIs, so we're not a scraper"), and the inevitable "we got complaints only from people who have something to hide".

With this attitude, I wonder how they're going to respond to the first GDPR complaint they're inevitably going to receive. It'll be fun 🍿

@rfc1459 oh no, can we have a link to the discussion or something similar related to this?
Matt Cloy (@[email protected])

So I made a "pending review" decision on the fediverse full-text search engine we wrote - uses the public API, which means it can't be defederated, and it [redacted], filling out robots.txt is the solution for hosts, or as a user set your profile to do-not-index on mastodon and/or add #noindex to your bio). Available under login ONLY to *verified* instance moderators (and only searching federated instances of that mod). I.E. if the server is defederated from your instance, their mods can't search the commons for anything. *Constructive* feedback on this welcome (including thoughts on adding watch-phrases for flagging abuse patterns for review, making robots.txt-banned instances public, or anything else that improves moderation), please let me know NOW not later. Would rather a discussion before the cat is out of the bag than afterwards. #fediblock (because I know that hashtag will get me feedback) #flameproofpantstime

TechHub