@timbray I think you're on the right track.
Reminds me of early flickr days where I was creating content and wanted it to be publicly accessible but constrained in how it could be used and really loved how creative commons licensing was part of their publishing workflow. If I could do something similar for my posts here that would be very interesting.
@timbray excellent - these are the conversations we should be having about Mastodon. Because frankly, here we can. It's not ultimately decided behind a corporate facade based on A/B testing and engagement metrics.
(I wanted to write something similar about QTs)
@szbalint @timbray I think there’s a big picture set of conversations that will need to happen this year. Search, QTs, algorithms, scale (multiple axes), moderation, and long term sustainability are on the list.
The big challenge is that we have none of the structure it took forever to establish for the open Web, and it’s not clear where those conversations will happen.
@timbray Indeed, an excellent article! Very good job sketching the various positions and Mastodon's current privacy issues. I'll have to think about content licensing as the model but it's certainly an interesting approach.
@szbalint I touch on somewhat similar points related to QTs in https://privacy.thenexus.today/black-twitter-quoting-and-white-toxicity-on-mastodon
@timbray thoughts on authorized fetch? ( https://docs.joinmastodon.org/admin/config/#authorized_fetch ) It seems like it shares some commonalities with your proposed step one particularly when used in combination with disallow unauthenticated access ( https://docs.joinmastodon.org/admin/config/#disallow_unauthenticated_api_access )
Of course we still have a long way to go with building on top of that for your other suggestions re: federation & data handling contracts but
@timbray Great thoughts. I do not know what the correct answer is, but I always appreciate your writing.
The biggest issue I see here is that because the network is federated, because there's hundreds of admins all over, this decision will have to keep being made by the community over and over.
I have questions about the practicalities of various search strategies.
1. How big will the search repository become? Twitter uses two big data centers and still outsources some of its activity to AWS and GCP, if various reports are correct. That is well beyond the capability of instance admins or even instance organizations to manage.
2. What is the time window of search. Right now, AFAIK, the hashtag search basically covers whatever the home instance has seen lately. That is a sort of ephemeral search that answers the question: "what's been said recently in my extended network?"
Twitter keeps everything from inception. That answers the question of "What has ever been said on this topic by anyone anywhere?" Those are two different search motivations and strategies, with huge cost differences.
3. If 3rd parties set up an expansive search engine, how will they pay for it?
If I have misunderstood the mechanics, please put me on the right path.
Sensible questions. But we have to get the policy issues right before we can even start worrying about them.
@timbray @vicuzumeri @BudGibson Tim, have you played around with Pixelfed? It’s not obvious on the app, but posting allows you to set the license and on the browser it shows you exactly what license is associated with a post.
A story of grappa: I visited Venice with some friends right at the end of the season. One of my friends was a regular, and her favorite restaurant set us up for their last night of the year. The food was spectacular, as was the wine. Then the house started pouring grappa. I wandered around empty Venice that night with friends, a camera, a tripod, and a brain full of grappa. This was one of the photos from that night. #travel #Venice #Italy #Night #canal #world
I am confused about where one can look for any policy decision in the Fediverse. If there's no centre of gravity for policy, the next best decision criteria is usually somewhere in the economics.
We will have to wait and see how this movie turns out.
FWIW, my expectation is that the Fediverse will evolve differently from previous Internet technology cycles.
To me, the big new factor is the growing maturity of open source technologists and developers around the world.
IMO, the hegemony of the Silicon Valley VC billionaire bros is up for grabs.
Mastodon is the work product of a German developer. The next killer app design iteration may come from Kazakhstan, Mexico or Nigeria.
If this occurs, the legal issues will be more complex ... maybe impossible to enforce.
A kick ass Kazakh server technology could become a huge factor if it were quickly embraced by India, Phillipines and South America. Or any combo like that.
Then who sets legal policy?
Right now, the monolithic SV companies are driven by US law and, increasingly, EU law.
But that could change.
@BudGibson @timbray I feel the same way, and feel somewhat similarly even on a technological front. implementing full text search for an instance is trivial, and I feel it’s inevitable that eventually a large instance will run software that supports full text search, and thus it will naturally have indexed a large amount of federated content.
a policy would have to be overlaid on top of the recognition that once the data is federated to another server, control has been lost.
Tim’s essay provokes a lot of thought, but is limited in scope to current (vanilla) Mastodon; I think that is this discussion’s undoing. much like the retoot argument, Mastodon may be the de facto featureset, but it’s hardly the only implementation.
It's not clear to me why the desires of people who have been here for years should be dismissed by someone who joined last week.
Perhaps, it's the people who want full search that ought to be on another platform.
I know precisely what protections that web can and can not provide. Or do you also think I have access to all your credit cards and passwords?
@timbray coming from the dev world I the licensing idea resonates. We’ve come far enough that BSD / MIT / Apache / CC are a widely understood shorthand .
Then an inner voice pops up; “there’s little chance that the general public could become conversant to the same extent”
And a left-over idealist whispers “but what if they could? I mean, we teach children consent these days…”
Not of the admins block it with robots.txt
@timbray @codeyarns not so fast! If my post federates to another instance that doesn't block search with robots.txt, then it 's still likely to wind up in a search engine (even if I've selected the "opt out of search engines" option). I've even found federated versions of deleted posts via search engines.
As you say, Mastodon's privacy story has problems.
The frustrating thing is, local-only toots help a lot here ... but they've been blocked from the main branch since 2017
If your post carries clear licensing restrictions, there's then a legal club available to beat anyone who misuses it. That's my piece's central point.
@jdp23 @timbray @codeyarns
Robots.txt is a blunt instrument; rel="" attributes & meta-tags are more precise.
That said I'm not sure they currently have what's needed, either; I expect Tim would know, though, if anybody does.
@jdp23 @timbray @codeyarns
The analogy I would draw is that robot exclusion say who's not welcome & where (which may be 'everyone' & 'everywhere'), & meta tags say what content is off limits.
Again, correction welcome. It's 9 yrs since this has been something I needed to understand for professional reasons.
@mc @timbray @codeyarns yeah. i think of it as basically a prototype at scale of a decentralized system, so there are a lot of issues related to decentralized moderation that aren't yet addressed.
It's hard to know about the privacy model in general. For search in particular, opt-in could have worked if carried through the whole code, and I think people would have been okay with it. But that's not how things played out and it seems hard to retrofit.
Even better, don't write posts that are so long that you need it.
@timbray - "People should be able to converse without their every word landing on a permanent global un-erasable indexed public record."
I wholeheartedly agree with this, which is why I find it surprising the people who desire that would use a service that packages up those words and fires them off to hundreds of servers beyond their control. There are a multitude of options for folks who want to have that kind of insular community, and Mastodon/ActivityPub is *fundamentally* not that.
@timbray Maybe surprised isn't the word, because I'm not actually surprised most users don't understand what the Fediverse is and react with anger when someone shows them.
The word is probably disappointed. I'm disappointed because I thought the ideas of interoperability, composability, and decentralization were gaining momentum, but they weren't. To most users, Mastodon is just the new "app"
@timbray Good essay. The bit about having lots of content licensing options seems confusing. Maybe if there are just 2 choices, "private" or "public" or "free" vs "commercial"... but any more would be a nightmare.
Also, the weird sentence stands out: "Missing the point...": I have never seen anyone miss that point from any side of this debate.
@timbray Commented directly, awaiting moderation.
As an aside on commenting, I know you rolled your system a while back but there’s a lovely @w3c Recommendation called Webmention that would’ve let me post a reply on my site immediately while awaiting moderation to appear on yours: https://indieweb.org/Webmention
It’s like WordPress-style pingbacks but better spec’d and with room for things like a WoT-style antispam extension to shift the burden of trustworthiness to the sender: https://indieweb.org/Vouch#Shift_Burden_To_Sender
@timbray @jnm re: “The Fediverse needs to get its content-licensing shit together.”
This is actually a nice @pixelfed feature. You can set a license for each post. Screenshot attached.
I’m not sure what enforcement there is, if any. It’s fairly buried with no obvious way to set a default. It doesn’t appear in the main post view.
I still use it every time 😃
Here’s a PixelFed post I added a license to - can you find it? https://pixelfed.social/p/peterbronez/514067269392443567
@timbray @jnm @pixelfed related:
Megaface was a facial recognition dataset created from CC licensed Flickr images. It was ultimately decommissioned due to licensing objections: https://exposing.ai/megaface/
Discovered via this HN discussion https://news.ycombinator.com/item?id=34213036
Attached: 2 images While we’re talking about interesting ideas for approaching harassment, rather than forcing users to use hashtags to opt into search I’d love for Mastodon to implement an idea to I proposed at Twitter. People should be able to opt in and out of having their public posts be searchable on an as needed basis, no hashtags needed. https://fabisevi.ch/2022/04/01/goodbye-fellow-tweeps/#fnref-1