I'd like to better understand the reluctance about full #search on #Mastodon. Is there an overview somewhere?

Working at Google on search, and with SEOs, I'm not used to folks publishing things publicly, and then not wanting them to be easily found. Is there something Google could be doing - in search or in the form of guidance - that would help keep folks comfortable, and still be useful?

@johnmu I'm both interested in the answer to this, and I like the way the question is framed ❤️
@johnmu it’s not like publishing publicly here, not exactly. It’s more like talking on the village square, which is also public-ish but ephemeral; on some instances, content is even removed after some time. This is not for archiving, therefore it’s not for searching. It’s also somewhat personal; you might not bother about others overhearing in the village but you wouldn’t want to find it in the newspaper.
@mirabilos @johnmu at what point does something like @signalapp become a better fit?
@danbri @signalapp @johnmu synchronous communication for one; also, I tend to use a messenger if one-to-one (or to-few) is the primary use case, and IRC or this if one-to-many/most is

@johnmu You've already read @Gargron's post?

https://blog.joinmastodon.org/2018/07/cage-the-mastodon/

Another post from my bookmarks:

https://www.hughrundle.net/home-invasion/

Moreover you could follow Hashtags like #FediSearch and #FediCulture and take a look at GitHub.

https://github.com/mastodon/mastodon/issues/21627

@katzentratschen Thanks, very helpful!

One of the things I'm trying to understand better is why the general feeling is the content shouldn't be public to the world, and yet it is published as such.

To me, making that explicit (eg, a login-wall) would set clearer boundaries, but I realize I definitely don't know the whole history.

@johnmu You might want to have a look at the discussions around https://fedsearch.io/ which offered results from a global search across instances and was then stopped due to backlash from the community. https://infosec.exchange/@jerry/109311187050279476

I think part of the problem is that many tools like Mastodon are ill-equipped to provide the privacy options that people want. Some people want their friends and family to follow them, but they don't necessarily want random people and trolls to find them. Platforms like Hubzilla are better for privacy, but if you make your posts private, you limit who can find you.

It winds up being a paradox of wanting like-minded people to find you, but not wanting trolls to find you.
@johnmu So, it's not that people "post things publicly and then don't want them to be found". People post things hoping to reach people they don't know, but want to connect with, whilst not wishing to connect with bad faith actors, bigots, reply guys and the like.

Locking their account down means that they can't connect to the people they are trying to connect with. Complete universal text search means that they are trivially found by bad faith actors. So the goal is to have content public, but with a workable balance between being found by the people you want to connect with, whilst keeping the number of negative agents low enough to be manageable.

And in that context, "public" vs "private" is too binary. The goal is the right balance between them.
@johnmu And another point of context is to remember that the fediverse was initially populated largely by marginalised folk who were trying to escape toxicity on other platforms.

The norms were established in that context, with marginalised folk trying to find and connect with other marginalised folk, whilst avoiding the bad faith actors that were actively seeking to find them and harass them.

With the fediverse's eternal November in full swing now, the expectations of the new arrivals are different from the established norms, but crucially, the already established folk control most of the instances. Newcomers have the numbers, but not the ability to quickly change the norms of the pre-existing admins, despite those numbers.

And that's why there is constant friction.
@ada @johnmu the previous norms were basically:
Be boring, ie, a computer programmer
Talk about incomprehensible computer programming
er, that’s it

I look forward to those old norms being screwed up and thrown in the wastebasket, hurry up and invent the new norms
#27YearMusicTechnologyGap, #YamahaQY700, #MacBookPro, #LogicProX, #spatial #audio, #synths, #synthesisers, #MIDI

@ada @johnmu I joined Mastodon back in early November and I much prefer its setup to the bird site, specifically the (current?) inability to globally search for keywords - using hashtags to open your toot up to the wider community feels much better.

Long may it continue.

Another thing to consider is that Facebook and Twitter normalized using your real name on your social media accounts. This is good, in the sense that your friends and family can find you, but it is also bad in the sense that trolls, employers, and malicious individuals can find you too. You can get fired or harassed over your posts. Before that, everyone used a handle and an avatar, identities were anonymous or semi-anonymous, and communities (typically discussion boards) were small. People want to be seen, but they don't want the negative consequences of the wrong people finding what they post.
@johnmu makes me wonder about structuring URLs and robots.txt, for example to allow things like an indexable profile page but not the individual posts. Maybe even indexable hashtag landing pages with curated descriptions.
@rowan_m all of that is doable & fairly standardized. Maybe it needs to be easier to implement or easier to understand the options?
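
To make the robots.txt idea above concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. Everything specific in it is an assumption for illustration: the host `example.social`, the account `alice`, and the rules themselves are hypothetical, not what any Mastodon instance actually serves. Python's parser applies the first matching rule and does not understand wildcards, so the rules are spelled out for a single account and the `Disallow` line comes first.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the profile page and hashtag pages are
# crawlable, individual posts under the profile are not.
ROBOTS_TXT = """\
User-agent: *
Disallow: /@alice/
Allow: /@alice
Allow: /tags/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


BASE = "https://example.social"  # hypothetical instance
parser = build_parser(ROBOTS_TXT)

# Profile page: allowed by "Allow: /@alice".
profile_ok = parser.can_fetch("*", BASE + "/@alice")
# Individual post: blocked by "Disallow: /@alice/".
post_ok = parser.can_fetch("*", BASE + "/@alice/109620741021800649")
# Hashtag landing page: allowed by "Allow: /tags/".
tag_ok = parser.can_fetch("*", BASE + "/tags/FediSearch")

print(profile_ok, post_ok, tag_ok)
```

A real instance would need a pattern that covers every account, which depends on the wildcard support of each crawler; that part is left out here.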

@johnmu

I feel like most people opining loudly about this do not know what the Internet was like before Alta Vista came about. Unless someone (or Yahoo!) had manually indexed a page, it could not be found - and half the links were broken. You'd be incredibly lucky to run into something valuable. Then search engines started working, and Google made them good.

Sure, ephemerality is a thing - but as long as the toot is still up and its URL allows public access, it should be findable by search.

@osma @johnmu I feel like Mastodon is in a “social layer” of the internet and can’t be compared to blogs and other websites. It’s more like Usenet, and if anything serves as a discovery mechanism itself.
@johnmu @osma …all of the Mastodonians that are anti-search can’t be expected to worry about how their posts contribute to a “web”. They are trying to have conversations in a polite party, and welcome strangers who are also at that party to eavesdrop and join in. But also fear that once Google indexes, it will tear down the walls of the ballroom and the party will be overrun by hooligans passing by, who have no interest in the party and only want to stir shit up for fun.
@johnmu @osma (Which makes me think that authentic social search would require login and respect visibility filtered by federation limits and blocks.)

@johnmu there's probably a fear of some old, off-the-cuff remark or reply being dug up years later to shame, as we've seen in cancellation cases. So people view the space as a public arena, but don't necessarily want to be easily dug up years later.

This is psychology, however, for if it's public then it is recorded somewhere & someone can store it. It's really false security not to have it searchable.

OTOH having only tags searched gives some control over what you feel makes sense to search.

@Setok As someone whose off-the-cuff comments get published in industry news sites & taken vastly out of context, I'm sympathetic :).
@johnmu I guess to a company doing search, everything must be searchable ;) But then whose value are you interested in?
@johnmu @ricmac @timbray has a rather comprehensive recent write-up. See also @bobwyman

@BudGibson @johnmu @ricmac @timbray
People claim to fear that trolls will use search to find threads that they can jump into and disrupt. This is a reasonable thing to fear. But, in order to prevent bad things from happening, we are being asked to give up all the good things that search might provide... It is a debatable trade-off based on feelings not science.

Note to @johnmu I also used to work on Google Search. I created Google's Prospective Search Infrastructure while in the NYC office.

@bobwyman @BudGibson @johnmu @ricmac

Um, Bob, did you read the piece? It has very specific proposals for a path forward on search.

@timbray @BudGibson @johnmu @ricmac
Tim: Yes, I've read it. But, I am concerned that some of your proposals imply legal recognition of license grants embedded in protocol data. At present, the law doesn't really provide for that -- but I am confident that, one day, it will. Today, I worry that your seemingly logical solutions create many slippery slopes that we haven't yet imagined.

Note: I was DEC's lead on software/content licensing and have several IPR patents. These aren't new issues.

@johnmu
1st link: https://fedsearch.io/ - follow the link "See for a discussion on the subject" there & read.
2nd link - a compromise proposal: https://social.tchncs.de/@[email protected]ial/109619582321411514
From the current point of view, a hashtag-based global search seems to find broad acceptance.
My personal point of view is: if you publish something on the web, you agree that you will lose control over who can see it. If you don't want that, it's up to you to prevent it.

@johnmu let’s chat when I am back from bleh!

@johnmu Most people don't understand how to limit search engine indexing. But it's already possible to limit from the Mastodon settings (Preferences, Other, Opt-out of search engine indexing).

"Affects your public profile and post pages"

Sadly people don't adjust the default settings and blame search engines (instead of the default settings of Mastodon software). By default every post on Mastodon is publicly indexable.

@autiomaa Unfortunately, the "opt-out" only applies to new posts that you make. It does not apply to any of your replies. I don't know if that's widely known.

@johnmu Oh, that's a good point. Didn't remember. 😓

That is one of many architecture issues of Mastodon and other similar platforms implementing ActivityPub.

@johnmu Mastodon has a search engine opt out setting for each profile which sets a noindex meta tag (https://vis.social/@Luca/109620741021800649). Honoring that and the robots.txt should be enough in my opinion. But as you have seen, many people see that differently.

The majority of people I reach are fine with hashtags in public posts being indexed: https://vis.social/@Luca/109619582185050938
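
For anyone wondering what "honoring the noindex meta tag" looks like from the crawler side, here is a rough sketch using only Python's standard `html.parser`. The two HTML snippets are made-up stand-ins for an opted-out page and a default page, not the markup Mastodon actually emits, and the helper names (`RobotsMetaParser`, `may_index`) are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # The content attribute is a comma-separated list of directives.
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def may_index(html: str) -> bool:
    """Return False if the page asks not to be indexed ("noindex" or "none")."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return not ({"noindex", "none"} & parser.directives)


# Hypothetical pages: one with the opt-out tag, one without.
OPTED_OUT = '<html><head><meta name="robots" content="noindex"></head><body>post</body></html>'
DEFAULT = '<html><head><title>post</title></head><body>post</body></html>'

opted_out_indexable = may_index(OPTED_OUT)
default_indexable = may_index(DEFAULT)
print(opted_out_indexable, default_indexable)
```

This only covers the meta tag; a crawler that wants to be polite would check robots.txt as well.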

Private and Public Mastodon

ongoing by Tim Bray
@johnmu
Hi John!
"I'm not used to folks publishing things publicly, and then not wanting them to be easily found."
It has to do with "locality".
Real world (aka faulty) comparison: you may want to add your name on your door bell. It can be seen by passersby from the street. It is "publicized publicly" but would you be happy if a fad startup built a worldwide database including your name, address and pics of your home?

@johnmu
The name on the doorbell is there for friends, visitors, delivery. Not for search engines, marketing, ads or surveillance capitalism. :)

I think the best thing Google can do about Mastodon and the Fediverse is to stay away from them.

@medecinelibre So your preference would be that none of the content here is findable on the rest of the web? It sounds very exclusive, if so, but ultimately this is up to each website to decide.
@johnmu It depends who benefits from and is in control of the index. Free software folks, who built and run Mastodon, are on a mission to eliminate the influence of GAFAM over the web.
There was a functional web before search engines and the Google search monopoly. We're waging the 2nd war of independence of cyberspace. Just add "monopolistic corporate behemoths" next to "governments" in JP Barlow's declaration. 😄 Mastodon is a political project.
Perhaps we need to re-introduce the idea of private communities. I am not sure if Mastodon has this capability, but in Hubzilla, you can create private discussion groups, and only members of the group can see posts. Search engines and outsiders can't even see it.

The problem with posting everything publicly is that, even if you make it harder to find, it is still publicly available.