It's strange to me that people are so focused on quote toots (which would be nice but I don't have particularly strong feelings about) when, in my opinion, what Mastodon desperately needs is a functional search bar. There is no better way to find out what is going on with a given topic.

Yeah, I know about hashtags, and I really don't think they are a suitable replacement.

I'd like to add here that the majority of concerns I have heard about implementing searches could be easily addressed by making them something you choose whether to be a part of or not at an individual or instance level. I get that they're not for everyone but removing the option for others bc you personally don't want it (and yet could opt in or out) is not a great look imo
At the very least allow instances to choose whether they are searchable or not and give people the option to join them if that's what they want 🤷🏻‍♂️

@AbandonedAmerica you may already know this, but searching is a much harder problem to solve. You can install a special search apparatus on an instance - it's a whole other application (ElasticSearch) that's pretty resource intensive. the functionality is still pretty limited, not sure if that's because of the resource requirements

I have a feeling this is a problem that will be solved by third party vendors, not instances

@AbandonedAmerica so, like there might be an ad supported site that you can go to search across many instances. or there might be a service you can pay for (monthly or per search?) that would provide full text indexing and searching.

I think many people would get pretty mad, but I don't think they can really DO anything about such services?

@nirak I feel like the lack of official implementation of the feature actually encourages unethical implementation of workarounds that don't respect whether or not people want to be a part of them, whereas an official solution could
@AbandonedAmerica the problem as I see it (being someone who hosts a small instance) is if it was built in AND required as part of the "stack" (the list of software required to run an instance) people who want to host their own instances might be shut out due to cost. More processing power = more $$$. I kind of like the third party idea but someone has to actually do it first :/
@nirak I appreciate your perspective on this. The cost aspect is something I hadn't really considered and am now curious about
@nirak @AbandonedAmerica As instances have scaled up capacity over the past few months, it's been interesting to see how much of the process has been manual, without any concrete set of guidelines. It seems like mastodon needs to move towards a more shrink-wrapped kinda package in a whole bunch of ways. If mastodon continues to grow, the likelihood of something like Red Hat springing up seems like a possibility, for better or worse.
@nirak my knowledge here is admittedly fairly limited so I appreciate the info 😊
@nirak @AbandonedAmerica I believe there was someone setting an instance up specifically for searching, but they stopped because of risk of defederation.

@blt @AbandonedAmerica yeah I could see that. a third party searching site wouldn't actually need to adhere to activitypub protocols at all though, it would just index (like google) and point to posts. it would probably need to adhere to the rules of regular web crawlers, though, not sure if most instances disallow indexing altogether

It's an interesting and quite hard problem!
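
For anyone curious what "adhering to the rules of regular web crawlers" could look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the bot name `ExampleSearchBot`, and the `example.social` URLs are made up for illustration; whether a given instance actually disallows crawling depends on its own robots.txt.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt for an instance that keeps crawlers out of user pages.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /users/
"""

if __name__ == "__main__":
    for url in ("https://example.social/users/alice", "https://example.social/about"):
        print(url, may_crawl(EXAMPLE_ROBOTS, "ExampleSearchBot", url))
```

A well-behaved third-party indexer would run a check like this before fetching anything. Nothing forces a crawler to do so, which is the enforcement gap discussed elsewhere in this thread.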

@nirak @blt yes it is! I think as the fediverse grows this is going to become more of an issue though and I hope it can be addressed now versus when it has either led people to drop the platform or things are too unwieldy to manage. But obv my opinion isn't the only one
@nirak @AbandonedAmerica we've installed ES on our instance but it's too soon to say much about it either performance or cost wise, it's basically a trial rn
@mrcompletely @nirak I'd love to hear your thoughts on how the trial goes!
@AbandonedAmerica @nirak I'll follow up. Interestingly, the person on our server requesting that service is a major content creator and researcher, so is perhaps encountering some of the same use scenarios you are, driving you both to the same request
@mrcompletely @nirak thanks! And in this case it's not so much about BEING searchable, I'm fine where I'm at, but being able to research topics - that's what the key is for me

@mrcompletely
Might be worth noting that, according to the docs, the ES integration specifically doesn't do what I think @AbandonedAmerica is looking for:

"Mastodon supports full-text search when ElasticSearch is available. Mastodon’s full-text search allows logged in users to find results from their own statuses, their mentions, their favourites, and their bookmarks. It deliberately does not allow searching for arbitrary strings in the entire database."

@mhamiltonj @mrcompletely yep, that's worthless to me.
@AbandonedAmerica @mhamiltonj bummer. I get why they limit discoverability; it's all about making the platform hard to weaponize for targeted harassment. If that's one of the fundamental requirements it's naturally going to place a hard limit on some functions
@mrcompletely @mhamiltonj problem is that smaller scale solutions don't work when you suddenly have millions more users and much higher visibility. Bad actors will join, workarounds will be had. Harassers are going to harass, but you can't penalize people by removing a very basic platform utility imo
@AbandonedAmerica @mhamiltonj I think I agree but I am being very cautious and slow in advocating for specific platform changes, not because I don't think anyone should, but bc I'm a professional software design lead by trade and I'm aware of my own tendency to reduce all problems to software solutions, and the fact that I'm still learning about what all the different user-group needs are and how they might conflict. I've only been here a few months, so I am still in observation mode myself
@AbandonedAmerica @mhamiltonj however, as part of that learning process, I hope everyone who does already have well formed opinions and clearly understood needs advocates for them so the conversation advances...I'm just remaining agnostic on major feature changes for a while out of professional courtesy basically

@mrcompletely @AbandonedAmerica

Yeah, I'm sure this is a design choice driven by safety concerns from users who've experienced discoverability as a negative, not on cost/complexity grounds.

I don't experience the harassment those communities do, and I've only been here just over a month, so I don't feel like I'm well placed to voice an opinion, although personally I don't really notice the lack of full text search.

@AbandonedAmerica and what happens if the search chooses to ignore your selected option? Mastodon isn't monolithic.
@0xc0ffea I mean, that would be something that would have to be addressed at an admin level. But just because you could hurt someone with a hammer doesn't mean hammers as a tool should be banned
@AbandonedAmerica For here, analogy is more like "why give people hammers when there are no nails."
Anything here that defers to instance admins to "figure out" on behalf of all their users or "moderate away" isn't really viable.
Federation only works if there is minimal fragmentation.
@0xc0ffea the analogy doesn't work because the nails here are the people that could use search and want it. The utility does exist. I get the concern about making life complex for admins but a simple opt in on both an individual and instance level seems like it would clear that right up

@AbandonedAmerica We kind of already have an opt in search, the problem is one of expectation brought over from twit (which keys off all words because it needs that aggregate data to drive trending and algorithmic content amplification)

We have hashtags, the use of one marks content findable and the user has to deliberately put them in. It's not the same as a monolithic platform search, but we couldn't have that even if we wanted it.

@0xc0ffea ugh hashtags are not great though. Either you add one for every relevant word, which is ugly and cumbersome, or you do the broad themes, which makes the finer points very hard to sift (being in the archaeology tag doesn't help much if I want a specific discovery in Egypt). I am understanding the tech limits a bit better but I do think it's a discussion worth having and a problem worth addressing better
@AbandonedAmerica Also, a streamlined way to follow people from instances as well as see who they follow/who is following them. The current process is clunky, to say the least.
@AbandonedAmerica There are lots of ongoing discussions about this topic, too. Ironically, it's quite difficult to find them. 😬
@katzentratschen A++ reply right there lol

@AbandonedAmerica This thread by @johnmu discusses the broader topic of web search, but it's a good starting point and a rabbit hole as well:

https://mastodon.social/@johnmu/109618966198237280

🦇 Jennie Rigg 🏳️‍🌈 (she/her) (@[email protected])

Content warning: Masto meta, searching

witches.live
@miss_s_b I just skimmed that article and marked it for reading when I am able. Thank you!

@AbandonedAmerica Quite a few people seem to be very opposed to search because bad actors might use it to harass or dox people.

Those fears may be legitimate, but I suspect they underestimate how many people are already sitting on full-text indexes of fediverse posts. It looks to be a one-line change for a server admin to remove the current restrictions on ElasticSearch, and other ActivityPub software like Friendica already has full-text search.

@zak I think those concerns are totally valid too, just not to the degree of negating the utility of a tool that could easily be modified to address those concerns

@AbandonedAmerica I don't think there's a good way to address those concerns while maintaining a federated network that's open to the public. Bad actors will build search indexes regardless of anyone else's preferences and it's probably impossible to stop them by technical means.

It wasn't a problem when the Fediverse wasn't on their radar, but recent DoS attacks suggest that it is now. I don't have a good answer for people concerned they'll be targeted using search.

Meanwhile the rest of us are deprived of a useful search feature, and anyone who attempts to build one transparently gets bullied.

@zak yep, exactly this. 100%.

@zak @AbandonedAmerica

Been on 2 instances so far and both had full text search. FYI

@zak @AbandonedAmerica
As you say, this is an issue with the content being open. What seems to me to be needed is a way to flag content IN A PAGE as non-indexable...& then get Google/et al to honor that. (Which may be infeasible, I don't know how page crawling works at that scope.)

E.g., if you marked your post 'noindex', then your post isn't indexed.

It'd need to extend to quote-boost implementations, too.
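
For reference, the page-level version of this flag already exists on the web as the robots meta tag, for example `<meta name="robots" content="noindex, noarchive">`. Below is a rough sketch of how an indexer might read it, using only Python's standard `html.parser`. The sample HTML and the `should_index` helper are illustrative assumptions, and this only covers whole pages, not individual containers within a page.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, noarchive".
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def should_index(html: str) -> bool:
    """Illustrative helper: treat a page as indexable unless it says noindex."""
    return "noindex" not in robots_directives(html)


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, noarchive"></head></html>'
    print(sorted(robots_directives(page)), should_index(page))
```

This only works if the crawler chooses to honor the tag; it is a request, not an access control.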

@FeralRobots @zak yep. If you could do this, searches that respect it could flourish

@FeralRobots
There's also a noarchive value.

@zak @AbandonedAmerica

Let me know if I should hook you up with SEO wizards here.

@RyunoKi @zak
Just to be clear, in order to work in a social media context, the search spider would have to pass over (or the search engine not retain/index) anything in a container that's flagged noarchive or noindex, respectively.

Do you know if major search spiders work that way now? It's been too long since I had to know this stuff for work. This is only slightly more than curiosity for me, though the same may not be true for @AbandonedAmerica.

@FeralRobots @zak @AbandonedAmerica Deferring the question to @matterne

I haven't had to touch SEO in a few years so I'm not ahead of the curve right now.

@FeralRobots @RyunoKi @AbandonedAmerica
I think those flags are per-page for web search, but it would be easy for an ActivityPub server indexing posts internally to respect a per-post value.
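
A minimal sketch of what "respecting a per-post value" could mean for an internal index, assuming each post carries a boolean `indexable` field. That field name is hypothetical and is not taken from Mastodon or the ActivityPub spec. Posts are only indexed when the flag is explicitly true, so the default is opt-in.

```python
import re
from collections import defaultdict


def build_index(posts):
    """Map each lowercase word to the set of ids of posts containing it.

    Posts whose (hypothetical) 'indexable' field is not exactly True are
    skipped, so a missing flag means "do not index".
    """
    index = defaultdict(set)
    for post in posts:
        if post.get("indexable") is not True:
            continue
        for word in re.findall(r"\w+", post["text"].lower()):
            index[word].add(post["id"])
    return index


def search(index, term):
    """Return the sorted ids of indexed posts containing the term."""
    return sorted(index.get(term.lower(), set()))


if __name__ == "__main__":
    posts = [
        {"id": 1, "text": "New tomb discovery in Egypt", "indexable": True},
        {"id": 2, "text": "Egypt trip photos", "indexable": False},
        {"id": 3, "text": "Archaeology news roundup"},  # no flag: not indexed
    ]
    idx = build_index(posts)
    print(search(idx, "Egypt"), search(idx, "archaeology"))
```

Treating a missing flag as "no" is a design choice here; an opt-out default would flip that one condition.
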
@zak @RyunoKi @AbandonedAmerica
Yeh one of my tacit assumptions is that I'm thinking about non-fediverse search. I have trouble thinking about how in-client search scales past an instance. I'm sure it's not as bad as I'm imagining it, but at a certain size it kind of does have to get hairy.

@FeralRobots
Well, on the web client the timeline appears to be loaded with AJAX.

So unless there is a special treatment for crawlers they wouldn't see the toots.

Assuming no one queries the API and renders them server-side somewhere.

@zak @AbandonedAmerica

@RyunoKi @zak @AbandonedAmerica
you know honestly i hadn't thought about that, thanks.
@RyunoKi @FeralRobots @AbandonedAmerica
I'm pretty sure modern web crawlers run JS, but someone trying to make a dedicated Fediverse search would probably just pull from the ActivityPub outbox.
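
To make "pull from the ActivityPub outbox" a bit more concrete: an outbox is an ActivityStreams `OrderedCollection`, usually paged via `first` and `next` links, with each page listing activities in `orderedItems`. The sketch below walks those pages. `fetch_json` is a hypothetical caller-supplied function (a real one would do an HTTP GET with an `Accept: application/activity+json` header, and some servers may require signed requests), and the `max_pages` cap is just an assumption to keep the walk bounded.

```python
from typing import Callable, Iterator


def iter_outbox(
    outbox_url: str,
    fetch_json: Callable[[str], dict],
    max_pages: int = 10,
) -> Iterator[dict]:
    """Yield activities from a paged OrderedCollection, oldest page last.

    fetch_json is a hypothetical helper that returns the parsed JSON
    document for a URL; it is injected so this sketch needs no network.
    """
    collection = fetch_json(outbox_url)
    page = collection.get("first")
    pages_seen = 0
    while page is not None and pages_seen < max_pages:
        # 'first' and 'next' may be a URL or an embedded page object.
        if isinstance(page, str):
            page = fetch_json(page)
        yield from page.get("orderedItems", [])
        pages_seen += 1
        page = page.get("next")


if __name__ == "__main__":
    base = "https://example.social/users/alice/outbox"  # made-up URL
    fake_server = {
        base: {"type": "OrderedCollection", "first": base + "?page=1"},
        base + "?page=1": {"orderedItems": [{"id": "a"}, {"id": "b"}], "next": base + "?page=2"},
        base + "?page=2": {"orderedItems": [{"id": "c"}]},
    }
    print([item["id"] for item in iter_outbox(base, fake_server.__getitem__)])
```

An indexer built this way only sees what the server chooses to publish in the outbox, and it would still need to honor whatever opt-out signals posters or instances provide.
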
@zak
Ultimately part of the solution may be to allow people to mark individual posts as search indexable or not search indexable (like they're talking about for quotes) so there's an element of consent even if it's easy for bad actors to get around. That lets people have a usable search and lets people opt out of it. And it will be clear if someone is violating a poster's wishes by including something in an index. But it might be a while before it all gets sorted.
@AbandonedAmerica @[email protected]
@zak @AbandonedAmerica The biggest issue with search is the bot harassment. Just go tweet the word crypto and see what mentions you end up with.
@AMS @AbandonedAmerica Crypto spam bots are another kind of bad actor. If I was going to write one of those (which I am not), my first thought would be to poll the federated timelines of big servers and filter them myself.
@AMS @zak oh, I encountered many of them. Still doesn't warrant not having a search function though