If you've been waiting for full-text #search on #Mastodon, please go to #TootFinder and sign in. The more, the merrier.
https://www.tootfinder.ch/

Thanks to @buercher for building it.

It's opt-in, not opt-out. It respects Mastodon culture and doesn't index accounts that don't sign in.

That means the index might be small if we don't spread the word.

#mastotips #feditips

@petersuber it's safe? 😅
@gubi @petersuber it’s not pulling anything that’s not already public

@ezekiel
That is not how safety and privacy work, I fear. Privacy is a friction landscape (as put by Luciano Floridi): even if something is out in a publicly accessible space, making it much easier to access (changing the friction) affects its privacy profile, and can be dangerous for someone.

Classic example was that data dump from okcupid.
@gubi @petersuber

@gvdr @ezekiel @gubi @petersuber

Interesting reference to "friction". In a social network, any additional search facility impacts on all users, not just those opting in. It does so through the out-of-context visibility of all posts that are referenced by accounts using the facility. And through the associated change in culture: people will look at posts differently.

Mastodon has an agreed search standard: each author decides how their posts are found, via hashtags. Stay with the standard.

@petersuber @buercher sounds like a great tool for trolls who otherwise wouldn't know who you were...
@SnapHappyFox @buercher
That's a risk I'll take — while supporting the default in which others don't take that risk. That's why I like opt-in.
@petersuber
Does the search also capture mentions in a toot? If so, isn't that partially forcing an opt-in for accounts you mentioned?
@SnapHappyFox @buercher
@gvdr @petersuber @SnapHappyFox As long as they are public posts, yes.
@buercher
So it's not an opt-in, and there is no way to opt-out. Cool coercive technology.
@petersuber @SnapHappyFox

@petersuber @buercher Hey folks! I did do this b/c I believe the fediverse should be searchable.

I've been asked if this means "can read DMs" and I honestly don't know (I'll never use DMs for truly private comms so I don't care, but others do).

@hrbrmstr @petersuber We are working on narrowing the scope. We use the grant only to get consent and then forget the authorization.
@buercher @petersuber pls accept that all of us have zero reason to “trust” in this climate. Mebbe try a bit harder to just get what is absolutely needed?

@hrbrmstr @buercher @petersuber

It would be easier to trust the claims if the software had an open source license so it could be inspected, tested, and improved upon. Just like Mastodon itself.

@buercher @downey @petersuber it’s all kinds of adorable that folks think anything at all on mastodon is safe or private rn.
@hrbrmstr @downey @petersuber I respect Downey's concern about access to the source code. This will happen. But I agree that even if it is open source, private data gets leaked when someone wants it to be leaked. In a social network, you do not control the people you follow.
@buercher @hrbrmstr @petersuber Yeah, I have no reason to distrust this indexer, but in general any indexer that requests the read scope has access to a lot of private-ish information. I don't see a scope that authenticates the user without granting some private information (mastodon should add one sometime), but maybe "read:search" is the most innocuous?
@gray17 @hrbrmstr @petersuber I understand it is problematic (Datensparsamkeit)

@hrbrmstr @petersuber @buercher Dial it back to `read:statuses` permission. I know the #Mastodon docs are crap. Live and learn.

(Saying “I don’t care about #privacy because I don’t use it” is a pretty crappy stance too. You’re throwing a service on the Internet with others’ information. Time to start caring.)

/cc @todb
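
For readers following the scope discussion, here is a minimal sketch of what requesting the narrower `read:statuses` scope could look like when building the authorization URL against Mastodon's `/oauth/authorize` endpoint. The instance name, client id, and redirect URI are hypothetical placeholders, and this is not Tootfinder's actual code. Whether an instance accepts the request also depends on the scopes the app declared when it was registered, which may or may not be related to the errors reported in this thread.

```python
from urllib.parse import urlencode


def build_authorize_url(instance, client_id, redirect_uri, scope="read:statuses"):
    """Build an OAuth authorization URL that asks only for the given scope."""
    query = urlencode({
        "response_type": "code",      # authorization-code flow
        "client_id": client_id,       # hypothetical value, issued at app registration
        "redirect_uri": redirect_uri, # hypothetical callback of the indexer
        "scope": scope,               # narrow scope instead of the broad "read"
    })
    return f"https://{instance}/oauth/authorize?{query}"


# All three arguments below are placeholders for illustration only.
url = build_authorize_url(
    "mastodon.example",
    "HYPOTHETICAL_CLIENT_ID",
    "https://indexer.example/callback",
)
print(url)
```

The user would then see a consent dialogue listing only the requested scope rather than read access to "everything".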

@mjgardner @petersuber @buercher @todb Um. DMs are far, far from private (i.e., your stance that they are is pretty crappy)
@hrbrmstr @petersuber @buercher @todb I said nothing about DMs. I’m talking about having read access to all account data when all the app needs is statuses.
@petersuber what's ethically wrong with indexing public information? I feel like this whole "we respect your data" thing is merely marketing when not applied to private data. I wouldn't care, but this pattern of not indexing public data has demonstrably and arguably needlessly set back the abilities of the fediverse
@petersuber @buercher Neat, but a point of design: the gray-on-slightly-darker-gray hint text in the input boxes is not great for my eyes
@mjgardner @petersuber I'm working on that. I have seen that the contrast depends on the browser.

@petersuber @buercher nice idea. I do not understand why it needs access to everything though?

This includes private messages and all those sent to only a specific audience. I only want the public ones to be searchable...

Why not just verify it's me and then use the public accessible timeline for the index?

@petersuber @buercher okay I did some reading, it says it is using the RSS feed for the index - I still don't understand why the request for everything is done.
@Fripi @petersuber I need some method for consent. Actually even scope "" would work for me but I think the OAuth standard does not allow it. I will try to narrow, but until now I only got errors from the instance.
@petersuber @buercher or just use #hashtags as per design? 🤷‍♂️

@thejasonhearne @petersuber @buercher Maybe I like to run much more specific searches than you, Jason, but hashtags are nowhere near precise enough for me even IF people were using them in every single toot.

Even when it comes to just searching my *own* toots--"where is the one specific dog photo I posted where the story I told included X?" is impossible unless I happened to hashtag that word, which would've been super weird.

@sstoneb @petersuber @buercher I get that. Totally. However that’s not #Mastodon. It’s not built to do that.
Great that it’s #opensource and folk have found a way to get it to do that. But it’s basically turning it into #Twitter which it isn’t.
@thejasonhearne @petersuber @buercher Really starting to get tired of this attitude on #mastodon of shaming users for wanting basic features. It's why most rational people just throw up their hands and leave instead of putting up with this bullshit.
@thesuperpapagai @petersuber @buercher then use #Twitter I guess 🤷‍♂️

@thejasonhearne @petersuber @buercher I don't want people to use proprietary, centralized social media, that's the problem. The problem is that the most popular federated social media platform is basically taken hostage by a small minority of condescending idiots, which makes it harder and harder to get people to understand the benefits of decentralized social media.

Driving people off the platform will just embolden Elon Musk more, and shitty social media practices more.

@thesuperpapagai @petersuber @buercher I’m not sure I’d call the #Mastodon creators “condescending idiots”.
I get your point. I agree with the principle. But #Mastodon wasn’t built that way so it’s an #opportunity for a new #decentralised #platform to rise and then we can all let #Mastodon be #Mastodon.
@thejasonhearne @thesuperpapagai @petersuber So do we have a federated system with a central instance deciding what can be done with it?

@thejasonhearne @petersuber @buercher

Your one short sentence sums up what is wrong with these additional search tools.

Mastodon has a search standard that is a deliberate design feature, supporting a particular communication dynamic. The standard is to let each author decide for each post how the post can be found, via hashtags.

One can't just introduce alternative mechanisms that deliberately violate those search standards and pretend it doesn't alter the system.

Use hashtags.

@the_roamer @thejasonhearne @petersuber Mastodon is published under the GNU license which explicitly allows modification.
Tootfinder allows each user to decide if the posts are searchable by full text (opt-in) or by hashtag only.
@petersuber @buercher @geneadons @genealogy @histodons Being able to search text is a useful research tool. Note that it is opt-in.

@petersuber @buercher great that search is opt in. Now I think about it, I won't.

As searchers we like to be able to search, so to help that is one reason to opt in.

But what about as being searched? What's the benefit - that anyone can find me or things I toot about quickly 🤔

I'm not sure I want that. I'm good with people finding me more organically and having to follow me and read my stuff if they see value in it.

That mirrors offline community better IMO and that's worth considering.

@petersuber @buercher immediately after tooting that reply I remembered I routinely use hashtags, which is a toot by toot opt-in search!

It's different, I control what is discoverable and how, but it goes against my reasoning somewhat so this needs more... 🤔

@petersuber @buercher The permissions request is for read access to "Everything" which feels overkill? Sounds like it might include DMs and stuff. I think this is just for authentication to the opt-in but perhaps a narrower permission is possible?
@DanielRThomas @petersuber we are working to narrow that, but get only errors for the moment. In the meantime, as a workaround you can immediately revoke the grant in your account settings after opt-in
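
As an alternative to asking each user to revoke the grant by hand, an app could in principle revoke its own token right after recording consent. The sketch below only builds a request for Mastodon's `/oauth/revoke` endpoint and does not send it. The instance name and credentials are hypothetical placeholders, and nothing in the thread says Tootfinder works this way.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_revoke_request(instance, client_id, client_secret, token):
    """Prepare (but do not send) a POST request that revokes an access token."""
    body = urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "token": token,
    }).encode("ascii")
    return Request(f"https://{instance}/oauth/revoke", data=body, method="POST")


# Placeholder values for illustration only.
req = build_revoke_request(
    "mastodon.example",
    "HYPOTHETICAL_CLIENT_ID",
    "HYPOTHETICAL_CLIENT_SECRET",
    "HYPOTHETICAL_TOKEN",
)
# Passing req to urllib.request.urlopen would send it; that step is left out here.
print(req.get_method(), req.full_url)
```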

@petersuber @buercher Question: just saw the authorisation dialogue and it says read-only access to “everything”. Does that include followers-only and direct messages?

If so, is this a limitation in the granularity of the Mastodon API?

CC @Gargron

@aral @petersuber @Gargron We are trying to find out. The API doc of Mastodon mentions finer granularity, but the proposed subscopes produce errors.
@buercher @petersuber @Gargron Thanks, Matthias. Looking forward to seeing what you discover :)

@petersuber @buercher I love to see this kind of organic and ground-up experimentation with how to build out this platform.

It's such an empowering ecosystem, as opposed to being at the whim of behind-closed-doors, profit-oriented architects. #fediverse #mastodonmigration

@petersuber @buercher @stargazersmith

i know it’s a proof-of-concept so some features are missing; the one that leaps out at me is that CW’d content is not masked — maybe you should mention that in a message near the search box for now so people are prepared to see potentially personally disturbing stuff in the results

@alexch @petersuber @stargazersmith Good point. I will check if I can identify it from the feed.

@buercher

to be clear on what I suggest: until/unless you implement CW masking for real, add an explanation, e.g. inside the existing “Privacy Note” you could add:

* CWs are not masked
* DMs are (or are not) indexed

(DMs are also a hack in Mastodon, right? iirc they’re not really secret, just marked with a “direct” flag, so it’s important to clarify whether they are or aren’t going to appear in results)

@alexch It seems that content warnings are loosely formatted in tweets as beginning with the text "Content warning:" in a strong tag. I can try to build a mask with that.

As for DMs: I have the impression they are not in the RSS feed at all, but I cannot find anything in the documentation.
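
A rough sketch of the mask described above, assuming (as the post reports, not as confirmed from documentation) that a CW'd item's description starts with "Content warning:" inside a strong tag. The function name, the sample markup, and the placeholder text are all made up for illustration.

```python
import re

# Assumed shape of a CW'd description: an optional <p>, then
# <strong>Content warning:</strong>, then the warning text.
CW_PREFIX = re.compile(
    r"\s*(?:<p>\s*)?<strong>\s*Content warning:\s*</strong>\s*([^<]*)",
    re.IGNORECASE,
)


def mask_content_warning(description_html):
    """Return (warning, text_to_show); the body is hidden when a CW is found."""
    match = CW_PREFIX.match(description_html)
    if match is None:
        return None, description_html
    return match.group(1).strip(), "[content hidden]"


# Hypothetical feed snippet, modelled on the description in the post.
sample = "<p><strong>Content warning:</strong> spoilers</p><hr /><p>The ending is sad.</p>"
print(mask_content_warning(sample))
print(mask_content_warning("<p>Just a dog photo</p>"))
```

A search page could show the warning text with a reveal button and keep the original body out of the result snippet until the user asks for it.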

@buercher
ah, so you’re using the RSS feed not the ActivityPub feed, which means that there are two layers of jank on top of the AP spec (Mastodon overloads/misuses AP’s “subject” field as a “cw” field; then it apparently renders it into a mere HTML snippet inside its RSS output)

i bet you’re right that Masto omits DMs from the RSS feed already. i could look through the Mastodon source code later to confirm that, if you like
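
To make the RSS-based approach concrete, here is a small sketch of pulling indexable text out of a feed. It parses an inline sample rather than fetching anything; the sample feed, the function name, and the output shape are assumptions for illustration, not Tootfinder's implementation. Only what the feed already exposes is read, which is why the question of whether DMs appear in the feed matters.

```python
import html
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed RSS document for illustration.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example account</title>
    <item>
      <link>https://mastodon.example/@someone/1</link>
      <description>&lt;p&gt;Hello fediverse&lt;/p&gt;</description>
    </item>
  </channel>
</rss>"""


def extract_items(feed_xml):
    """Return a list of {link, text} dicts, with HTML tags stripped from the text."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        markup = item.findtext("description", default="")
        plain = html.unescape(re.sub(r"<[^>]+>", " ", markup))
        items.append({
            "link": item.findtext("link", default=""),
            "text": " ".join(plain.split()),  # collapse runs of whitespace
        })
    return items


items = extract_items(SAMPLE_FEED)
print(items)
```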