@zleap
At the center of this discussion needs to be the question 'why?'. Who would define the definitive list, etc.? It could descend into a blockchain discussion. To showcase stuff, thumbnails fetched from #openGraph(?) URI headers should be enough.

We are also interested in censorship resistance, for the above. Think Tor and I2P URI support included in every instance by default.
@humanetech @tommi @bookwyrm @inventaire

@zleap @humanetech @tommi @bookwyrm @inventaire

How about this!

We already have #hashtags to list stuff, but they can get polluted by spammers.

What if (#gameChanger alert) you could #hashSearch an instance, ie. @openworlds.info#populationGrowth

Or search a user [email protected]#bailouts

Then you might get a more curated list of anything?!
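A rough sketch of how such a #hashSearch query might be split into its parts, in Python. The `parse_hash_search` name, the `HashSearch` type and the `user@instance#tag` grammar are assumptions made up for illustration, not an existing Fediverse feature, and `alice@example.social` is a placeholder account.

```python
from typing import NamedTuple, Optional


class HashSearch(NamedTuple):
    user: Optional[str]  # None when the whole instance is searched
    instance: str
    hashtag: str


def parse_hash_search(query: str) -> HashSearch:
    """Split a hypothetical '@instance#tag' or 'user@instance#tag' query."""
    target, sep, hashtag = query.partition("#")
    if not sep or not hashtag:
        raise ValueError("query needs a #hashtag part")
    # The last '@' separates the optional user from the instance name.
    user, at, instance = target.rpartition("@")
    if not at or not instance:
        raise ValueError("query needs an @instance part")
    # Tolerate a leading '@' on the user, as in '@alice@example.social'.
    return HashSearch(user=user.lstrip("@") or None, instance=instance, hashtag=hashtag)


print(parse_hash_search("@openworlds.info#populationGrowth"))
print(parse_hash_search("alice@example.social#bailouts"))
```

The instance would still have to decide what to return for each part; this only covers reading the query.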

(Silently bets that this is already a possible thing)

@dsfgs
> Or search a user [email protected]#bailouts
+1 for this.

Sometimes I have trouble finding my own post about a topic, and I wonder whether something like this is already implemented. Since we're discussing this, it is possible that I'm searching for a plain string too (one I may not have added as a hashtag on the post). Being able to search for anything within just one user's posts would be helpful.

@adnan360
'Amen' to not being able to find your own stuff.

It gets hairy with the 'search anything' idea though. Would it apply to things you boost, etc.? Using #hashtags is more sustainable long-term. Think about 10 ppl searching for stuff at the same time on your instance. It's likely much more efficient (maybe even hardware optimisable?) to search on # only?
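To illustrate why searching on # only can be cheap, here is a toy in-memory index in Python. It is only a sketch under assumptions: the `HashtagIndex` class is hypothetical, post ids are plain integers, and a real instance would keep this in its database rather than in a dict.

```python
import re
from collections import defaultdict

HASHTAG = re.compile(r"#(\w+)")


class HashtagIndex:
    """Toy index mapping a lowercase hashtag to the ids of posts that use it."""

    def __init__(self) -> None:
        self._posts_by_tag = defaultdict(set)

    def add_post(self, post_id: int, text: str) -> None:
        # Only the hashtags are stored; the rest of the text is never indexed.
        for tag in HASHTAG.findall(text):
            self._posts_by_tag[tag.lower()].add(post_id)

    def search(self, tag: str) -> list:
        # One dictionary lookup instead of scanning every post's full text.
        return sorted(self._posts_by_tag.get(tag.lstrip("#").lower(), set()))


index = HashtagIndex()
index.add_post(1, "Talking about #bailouts and #populationGrowth")
index.add_post(2, "More on #Bailouts, nothing else tagged")
print(index.search("#bailouts"))
```

Words that were never hashtagged are simply not in the index, so they stay unsearchable here.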

Also you need to give ppl the #RightToBeForgotten. Some people want to talk in a way that will not be searched, and should be respected.

@dsfgs
> Also you need to give ppl the #RightToBeForgotten. Some people want to talk in a way that will not be searched, and should be respected.
Even if this feature is not implemented, someone will find something someday. Maybe some web searches could be possible with "site:" tag too, who knows?

I thought there are also some privacy options (follower only/direct) if someone wants to stay private.

@dsfgs My goodness! I just found a solution to the problem while replying to you! 😆

Searching with "site:" on searx listed my posts containing a specific string. It even found replies that I've made on someone else's post.

@dsfgs I searched something like this:

"health odysee site:mas.to/@adnan360"

@adnan360
Quite right. Currently #searchEngines don't respect #TheRightToBeForgotten and it's just a shame actually.

Legacy search engines also are good at 'picking and choosing' what they will return. So if you want a more complete #search, you *may* be better off #searching from a #fediInstance

Yes, DM (PM) and 'Do not post to public timeline' are both possible.

@adnan360
We are not entirely right on that perhaps, searchEngines are supposedly respecting metadata in things like robots.txt and other areas as to whether to index a page or not.

@dsfgs Yes. If an instance admin wants, robots.txt can be set up.

Although I want my posts to be searchable. :) Most of my posts are public, so I don't mind search engines indexing them. It would help me later to find those.

@adnan360
Is there an #RFC in the #webStandards about NOINDEX/HASHTAG_INDEX?

That is where we are going here, right?

#hashtagIndex #bots #noBot #crawlers #noIndex

@dsfgs It is possible to restrict the whole site or anything under a directory. e.g. To restrict tag indexes (instance.tld/tags/xyz) from being in the search results, this can be added:

User-agent: *
Disallow: /tags/*

https://webmasters.stackexchange.com/a/16924

But it is still up to the search engine to respect this. A naughty search engine can still index it despite having it listed there.
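For anyone who wants to check how a well-behaved crawler would read such a rule, here is a small sketch using Python's standard `urllib.robotparser`. Two assumptions: the rule is written as `/tags/` without the trailing `*`, because this parser matches paths by plain prefix and does not interpret wildcards, and `ExampleBot` is a made-up user agent.

```python
from urllib.robotparser import RobotFileParser

# Same idea as the rule above, written as a plain path prefix.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tags/
"""


def is_allowed(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if a crawler honouring ROBOTS_TXT may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed("https://instance.tld/tags/xyz"))
print(is_allowed("https://instance.tld/@adnan360/123"))
```

This only shows what a polite crawler would conclude; nothing in it stops a crawler that ignores robots.txt.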

@adnan360
Cheers, yes, there are things like that for 'allow' and 'disallow'. We meant another qualifier(?) between those two extremes that told the indexer to only keep hashtagged words. :)

@dsfgs Do you mean only keep the hashtags indexed and not the rest of the post content?

@dsfgs If so, it will have to be done on the software side.

1. Search engine crawlers (usually) have a special useragent which can be detected and when it is, software can render posts without unwanted bits so that the crawler sees only the allowed parts.

2. Internal search will have to be updated so that unwanted bits are not searched.
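Point 1 could look roughly like this in Python. It is a sketch under assumptions, not how any existing Fediverse software does it: the marker substrings are a guess at what crawler user agents contain, and `hashtags_only` stands in for a hypothetical opt-in flag on the account.

```python
import re

# Hypothetical substrings for spotting crawler user agents; real bots vary.
CRAWLER_MARKERS = ("bot", "crawler", "spider")
HASHTAG = re.compile(r"#\w+")


def looks_like_crawler(user_agent: str) -> bool:
    """Guess whether a request comes from a search engine crawler."""
    ua = user_agent.lower()
    return any(marker in ua for marker in CRAWLER_MARKERS)


def render_post(text: str, user_agent: str, hashtags_only: bool) -> str:
    """Return the full post, or only its hashtags for crawlers when opted in."""
    if hashtags_only and looks_like_crawler(user_agent):
        return " ".join(HASHTAG.findall(text))
    return text


post = "Trying out #Odysee for #health videos"
print(render_post(post, "Mozilla/5.0 (compatible; ExampleBot/1.0)", True))
print(render_post(post, "Mozilla/5.0 Firefox/115.0", True))
```

A crawler that sends a browser-like user agent would still get the full post, so this relies on crawlers identifying themselves.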

@dsfgs If it is universal though, I don't think it would be ok. The moment I forget to hashtag a word, my entire post could miss my audience.

I think if something like this is implemented, there should be a per user setting to enable an option to restrict access.

@adnan360
It's a slight tradeoff, yes. You can always delete/edit a toot if you make a mistake.

But really there is nothing wrong with doing it like that.

Maybe if it's possible to mark a line as #lyric/#lyrical they can be #indexable too?

@dsfgs Hmm. I write a lot about Free Software projects. Some don't have a lot of other pages to describe them, so my toots often get listed. And someone may see that post and get into using the project. So having something like this enabled in my account would possibly not bring in those new users.

As much as I want to, I can't add #every #possible #hashtag #in #my #post #because #it #will #look #ugly. (See what I mean? 😄 ) So I can't help it.

@adnan360
1. Good idea, yes! That's how Fediverse could achieve it. @Gargron @lain @bob ?

2. Yes. As we understand it, #mastodon software is already searching (functionally) like this.

@adnan360
Bingo, and for 50,000 instances hosting 200,000 ppl on average, we can quickly see why that would make things more workable.