Why do I live in a world where I have to put this shit:

#nobot #noindex #noarchive #nosearch #nobridge #noai

in my profile, and pray that the techbro's bullshit bot respects it? Whatever happened to asking people? 🙄

@CoolerPseudonym @apps https://discover.holos.social/how-it-works says it all

So basically, if ANY of these applies:
- you have 'indexable' turned off
- you have follow approval required
- you have #nobot or #noindex in your bio
then you're opted out of Holos.
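
The "any one of these" rule above can be sketched in a few lines. This is only an illustration under assumptions, not how Holos actually implements it: the field names `indexable`, `manuallyApprovesFollowers`, and `summary` follow what Mastodon-style actor documents expose, the function name `is_opted_out` is hypothetical, and treating a missing `indexable` field as "off" is a conservative choice made for the sketch.

```python
import re

# Hashtags the how-it-works page is quoted as treating as an opt-out.
OPT_OUT_TAGS = {"nobot", "noindex"}


def is_opted_out(actor: dict) -> bool:
    """Return True if any one of the three opt-out signals is present."""
    # Assumption: a missing "indexable" field is treated as "turned off".
    if not actor.get("indexable", False):
        return True
    # Follow approval required (a locked account).
    if actor.get("manuallyApprovesFollowers", False):
        return True
    # Bios are usually HTML, so drop the tags before looking for hashtags.
    bio = actor.get("summary") or ""
    text = re.sub(r"<[^>]+>", "", bio)
    tags = {tag.lower() for tag in re.findall(r"#(\w+)", text)}
    return bool(tags & OPT_OUT_TAGS)
```

Note that all three signals are things the account owner has to set. An account that has never touched any of them comes back as not opted out.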

Holos Discover - How It Works

Learn how Holos Discover works - A Fediverse search engine that respects your privacy settings.

Holos Discover

#HolosDiscover is up to shenanigans again. Rather than making their scraper / indexer truly opt-in, they're again using @[email protected] to follow accounts that don't have search engine indexing turned off and don't have #NoBot and #NoIndex in their profiles.

I recommend blocking @[email protected] at minimum.

As a sidebar, I would support the existence of an actual opt-in index for reasons I have discussed previously. This is not that.

#Fediblock

Eve Ventually (@[email protected])

Content warning: Some thoughts about opt-in post indexing / not scraping and the realities of the fediverse

Toot.Cat

Apparently they'll delete the data they collected if you block them. Basically, though, it looks like ordinary accounts get their data collected without being asked...

"When you block us, all your content is deleted from our index"

"We only index users who have not opted out. We check these settings before following:"
"User has not disabled the "indexable" option (enabled by default on most instances)"
"Account is not locked (no follow approval required)"
"Bio does not contain #nobot or #noindex hashtags"

Holos Discover - How It Works https://discover.holos.social/how-it-works

@nodebb
I'm trying to figure out why my posts are being repeated on your platform, despite my account having #nobot #noindex, etc.

Care to comment on why you don't appear to respect people's requests for privacy?

JSON Column Search Takes 47 SECONDS Breaking Black Friday?!

JSON SEARCH NIGHTMARE! Can't index JSON columns in older MySQL! Every product search scans 2 MILLION rows! 47-second queries! Server MELTS on Black Friday! $4.7M revenue LOST! Watch the horror!

#sql #sqldisaster #json #performance #noindex #slowquery #sqlfails #blackfriday #sqlshorts #databasedisaster #sqlwtf #jsonsearch

https://www.youtube.com/watch?v=O0jCjaAsNNA

#HolosDiscover is back, from a clean slate.
A #Fediverse search engine that uses standard #ActivityPub federation. It follows you like any account, respects indexable flags, #nobot, #noindex, locked accounts. Deletions, edits, blocks are processed instantly through ActivityPub.
You have full control. Block it, mention it with "unfollow", or disable indexing in your settings.
Source code under AGPL-3.0 on #Codeberg.

Details: https://discover.holos.social/how-it-works

Account: @HolosDiscover
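
A rough sketch of what "processed instantly through ActivityPub" could look like on the indexer side, assuming a toy in-memory index. The `PostIndex` class and its method names are hypothetical and not taken from the Holos source; only the activity types (`Create`, `Update`, `Delete`, `Block`) and the public-addressing URI come from ActivityPub / ActivityStreams.

```python
# ActivityStreams "public" collection; posts addressed to it are public.
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def _is_public(obj: dict) -> bool:
    """True if the object is addressed to the public collection."""
    to = obj.get("to", [])
    if isinstance(to, str):
        to = [to]
    return PUBLIC in to


class PostIndex:
    """Hypothetical in-memory index: actor id -> {post id -> content}."""

    def __init__(self) -> None:
        self._posts: dict[str, dict[str, str]] = {}

    def handle(self, activity: dict) -> None:
        kind = activity.get("type")
        actor = activity.get("actor", "")
        obj = activity.get("object")
        if kind in ("Create", "Update") and isinstance(obj, dict):
            if _is_public(obj):
                # New post or edit: store the latest content.
                self._posts.setdefault(actor, {})[obj["id"]] = obj.get("content", "")
            else:
                # An edit that made the post non-public drops it.
                self._posts.get(actor, {}).pop(obj["id"], None)
        elif kind == "Delete":
            # The object may be a bare id or an object carrying an id.
            post_id = obj.get("id") if isinstance(obj, dict) else obj
            self._posts.get(actor, {}).pop(post_id, None)
        elif kind == "Block":
            # Simplification: assumes the blocked object is the indexer itself.
            self._posts.pop(actor, None)

    def posts_by(self, actor: str) -> dict[str, str]:
        return dict(self._posts.get(actor, {}))
```

Everything here happens after content has already been received and stored, so a block or deletion removes data but does not undo the earlier collection.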

@apps

So this dead horse flinched again

"With #HolosDiscover we checked multiple criteria before indexing: "indexable" enabled, account not locked, no #nobot or #noindex in bio, not in opted-out list, only public posts."

I'll copy-pasta with only slight edits:

My entire point (all the noise notwithstanding) focused on

Default opt-in versus default opt-out

This is an agent --> recipient transaction

Default opt-in: the recipient is opted into (and receives) the agent's action whether the recipient --> knows of <-- the action or not

Default opt-out: the recipient is opted out of (and cannot receive) the action whether the recipient --> knows of <-- the action or not

Neither default opt-in nor default opt-out has any logical meaning if

--> THE RECIPIENT DOES NOT KNOW OF THE AGENT <--

in advance

There was no mechanism for prior notification *before* indexing

People would have had to stumble on what you're doing. How, exactly?

How does the recipient learn of what you've done *before* you do it?

With #HolosDiscover we checked multiple criteria before indexing: "indexable" enabled, account not locked, no #nobot or #noindex in bio, not in opted-out list, only public posts. Every deletion, edit or block was processed instantly via #ActivityPub.
Google uses that same "indexable" flag but ignores everything else, keeps deleted content cached for weeks.
We shut it down after pushback. Was that the right call? Don't hesitate to share, this concerns the whole Fediverse.
Poll results:
- It should have stayed up: 52.4%
- Right call to shut it down: 24.8%
- No opinion: 22.7%
Poll ended.

@apps

[here we go again]

"Only public posts from consenting users are indexed. We respect every signal available."

But there is the entire problem in a nutshell:

What about people who have never heard of Holos Discover and who never will unless they receive a Follow request?

Allegedly only consenting users are indexed

I have #NoIndex #Nobot set on my profile and I have to approve Follows

But during the convo I had earlier that person said:

"Your content has been removed from our index and you won't be contacted again."

To which I replied:

"So you've --> already <-- "scraped my content" without my knowledge or permission..."

I then asked about second-party scraping -- someone I Follow and interact with has been "Discovered" and Indexed -- and thus my content has been indexed too, without my knowledge or consent

At that, they never returned to answer

The core point is, historically a lot of people have not wanted to be found, scraped, and indexed for unknown third-party use

Here we go again...