Mastodawn

David Kobia Jan 25, 2023

Arvind Narayanan

Algorithms aren't the enemy. Chronological feeds don't scale and the signal-to-noise ratio will plummet if this ever gets popular. The real problems with today's algorithmic feeds are non-transparency, lack of choice, and optimizing for engagement instead of healthy discourse.

Open-source is a perfect opportunity to fix all this. Have there been any efforts to create a Mastodon instance with a (community governed) ranking algorithm? Is that technically feasible? Or is the idea simply anathema?

Arvind Narayanan Nov 5, 2022

Update: it turns out that lots of people have similar views and Simon Willison is exploring building something along these lines.
https://fedi.simonwillison.net/@simon/109289663684761988

Arvind Narayanan Nov 20, 2022

Responses to some frequent comments:

* I'm certainly not suggesting that algorithmic feeds should be imposed on everyone! Choice is great. I recognize that many, perhaps most current Mastodon users like chronological feeds.

* "Reverse chronological" is an algorithm, albeit a simple one. It's currently the only option. Chronological feeds are not normatively neutral. There is, unfortunately, no neutral way to design social media. https://mastodon.social/@randomwalker/109308664849924122

Arvind Narayanan Nov 20, 2022

* "Mastodon doesn't need to become popular." Sure. But like it or not, it's getting more popular, and many of the newcomers have a different culture and expectations. Eugen Rochko: "People who are arriving now have as much right to be here and bring their own culture as the ones who came before them." https://mastodon.social/@Gargron/109323118267580967

Again, all I'm suggesting is choice, and I thought Mastodon is all about choice.

Arvind Narayanan Nov 20, 2022

* What do I mean by chronological feeds don't scale? A few things.

1. There's a lot of social pressure to follow people (especially people you know). Old-timers here are comfortable with following a small set of people, but most newcomers aren't. For those people, pretty soon the feed becomes a firehose.

2. Even in a mostly-chronological feed, some ranking would be really nice. *Not* necessarily by popularity, but if there are 15 posts by the same person I don't want those to be the first 15.

Arvind Narayanan Nov 20, 2022

3. As an academic, I use(d) Twitter to keep track of new research that's relevant to my interests. I found this much easier to do after I (reluctantly) switched to the algorithmic feed. Again, I recognize that this might not be everyone's experience, but I know I'm not the only one.
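Point 2 above (not letting 15 posts by the same person fill the first 15 slots) can be sketched as a small re-ranking step on top of a reverse-chronological feed. This is only an illustration, not anything Mastodon implements: the function name `spread_authors`, the `penalty` parameter, and the `(author, timestamp)` post shape are all assumptions.

```python
from collections import defaultdict


def spread_authors(posts, penalty=0.5):
    """Reorder posts so one prolific author does not fill the top of the feed.

    posts:   list of (author, timestamp) tuples; a larger timestamp is newer.
    penalty: hypothetical multiplier applied once for each earlier post
             by the same author (1.0 would reproduce plain reverse-chron).
    """
    newest_first = sorted(posts, key=lambda p: p[1], reverse=True)
    seen = defaultdict(int)  # posts already counted per author
    scored = []
    for rank, (author, ts) in enumerate(newest_first):
        base = 1.0 / (rank + 1)                  # newer posts start higher
        score = base * (penalty ** seen[author])  # repeated authors decay
        seen[author] += 1
        scored.append((score, rank, author, ts))
    # Highest score first; ties keep their chronological order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(author, ts) for _, _, author, ts in scored]
```

With `penalty=1.0` the output is ordinary reverse-chronological order, so the knob moves smoothly between "strictly chronological" and "spread authors out".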

Arvind Narayanan Nov 20, 2022

Amusingly, this thread is getting boosted a lot today, a couple of weeks after I first posted it. That's another difference between chronological feeds and Twitter's algorithm (which heavily emphasizes immediacy).

But—and this is the point of the thread—imagine all the interesting things you could do with a tunable algorithm. You could even let the recency preference depend on the type of content! You could customize it to show you news only if recent, but educational content regardless of age.
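As a rough sketch of a recency preference that depends on the type of content, each content type could get its own half-life, with some types exempt from decay entirely. The type labels, the half-life values, and the function name `recency_weight` are hypothetical, and the sketch assumes posts have already been labelled by type somehow.

```python
# Hypothetical half-lives in hours; None means age is ignored for that type.
HALF_LIFE_HOURS = {"news": 6.0, "educational": None}


def recency_weight(kind, age_hours, half_lives=HALF_LIFE_HOURS, default=24.0):
    """Return a multiplier in (0, 1] that shrinks as a post gets older.

    kind:      content type label (assumed to be assigned elsewhere).
    age_hours: how old the post is, in hours.
    default:   half-life used for types not listed in half_lives.
    """
    half_life = half_lives.get(kind, default)
    if half_life is None:
        return 1.0  # age does not matter for this type
    # Exponential decay: the weight halves every `half_life` hours.
    return 0.5 ** (age_hours / half_life)
```

A user who wants news only when it is fresh would shorten the `"news"` half-life; one who never wants old posts of any kind would replace `None` with a number.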

Aranjedeath Nov 20, 2022

@randomwalker some kind of "code blob" that allows you to send your filterset/tunables to somebody else for temporary use/revision/adoption/etc would be extremely cool

Swen Nov 20, 2022

@Aranjedeath @randomwalker A highly configurable algorithm that could be shared to others with similar interests has been on my mind for a while. Configurations could be forked and refined by the community to improve them over time. Feels like it would give users much more control over what they are shown.
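One way to picture the shareable "code blob" is a settings dictionary serialized to JSON and packed into a URL-safe string that can be pasted into a post. This is a minimal sketch under that assumption; the function names and the setting keys are made up for illustration.

```python
import base64
import json


def export_settings(settings):
    """Pack a settings dict into a compact, URL-safe text blob."""
    raw = json.dumps(settings, sort_keys=True, separators=(",", ":"))
    return base64.urlsafe_b64encode(raw.encode("utf-8")).decode("ascii")


def import_settings(blob):
    """Unpack a blob produced by export_settings back into a dict."""
    raw = base64.urlsafe_b64decode(blob.encode("ascii"))
    return json.loads(raw.decode("utf-8"))


# Hypothetical tunables; forking is just importing and overriding a key.
mine = {"author_penalty": 0.5, "news_half_life_hours": 6}
blob = export_settings(mine)
forked = {**import_settings(blob), "news_half_life_hours": 12}
```

Because the blob is plain data rather than executable code, adopting someone else's configuration does not require trusting them to run anything on your account.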

@randomwalker I wish algorithms were better at that. Instagram in particular loves showing me time-sensitive posts 2-3 days after they stopped being relevant.

Dr. Angus Andrea Grieve-Smith Nov 21, 2022

@randomwalker These are important points, and I agree with most of them. My Twitter engagement was better under reverse-chronological sorting, but you rightly point out that that's still suboptimal.

When I say "the algorithm" for Twitter, I mean the machine learning recommendation engine. Reverse-chronological is an algorithm, but it's deterministic.

Having worked in various AI-related fields for years, I think it's important for the algorithm we use here to be predictable and understandable.

politiconomics Nov 21, 2022

@randomwalker hi! Reverse timeline fan here -- not bc it's the best, but bc the alternative was much much worse in my case.

You're suggesting users should tune the algorithm. Fine.

On twitter, the algo tunes the user. Unacceptable.

uhmmm Nov 21, 2022

The solution is getting back to personal web sites

Aggregation and curation would be done by hand, by people in telegram chats

Toby Phillips Nov 21, 2022

@randomwalker not sure if you’ve seen it, but the renewed interest could also be because of this useful thread and discussion from @jon, an ex-Twitter designer

https://social.lot23.com/@jon/109372292696300424

Jon Bell (@jon@social.lot23.com)

4/ I remember when we moved from strict reverse-chron (which 97% struggle with) to "You might like" prompts (which 3% struggle with) and hearing from the VERY LOUD minority that we were destroying Twitter. But we saw as the 97% had a much better time. We saw that every step forward we took (I have a whole presentation on this) was helping people more and more. The data told us we were making a better product. And that reverse-chron kinda sucks. For most people.

rival Nov 22, 2022

@randomwalker Ain't that starting to look more like an internal search engine?
Interesting.
Maybe an implementation of YaCy over Mastodon...

Darnell Clayton

@randomwalker It will be interesting to see what @simon builds, but I think most of the Fediverse will avoid using it due to the current tech culture.

Jesse Nov 20, 2022

@randomwalker I am a proponent of understandable sorting/filtering based on likes and reblogs. I am not interested in algorithms which learn about you. Users should actively control the feeds and easily understand how they work.

D. G. Fitch Nov 4, 2022

@randomwalker We can certainly imagine a user-tunable choice algorithm, and try to make its decisions interpretable.

Have no idea if current architecture allows a feed algorithm, or if it would be extra work.

But I don't think any idea is anathema in federated land, things are just more or less work for devs doing this mostly-unpaid labor. (Thanks devs!)

Seven Dec 18, 2022

@dgfitch @randomwalker This would be preferable. I have avoided following large numbers of people and stuck to a chronological feed where it's made available, knowing an algorithm's tendency to bury the wrong posts. If I knew how the feed order was being manipulated and was able to tune it as needed, I would be less hesitant to follow more people. It will be interesting to see what user-friendly means of controlling it are suggested - and how they are defined.

Michael Ekstrand Nov 4, 2022

@randomwalker I have been toying with the idea of building an opt-in Mastodon recommender (starting with who-to-follow, but hopefully timeline ranking too), then having my students build algorithms for it when I teach recommender systems in the spring. But unfortunately I don't know that I actually have the bandwidth for the engineering work to make it possible.

Michael Ekstrand Nov 4, 2022

@randomwalker It might be easier if an instance's admins got on board (maybe hci.social would be up for it?).

Giovanni Beltrame Nov 5, 2022

@mdekstrand @randomwalker I recently discovered the grouping mechanism, and I speculate it might do the job if one can reasonably make a dynamic group that acts as the ranker.

Nathan TeBlunthuis Nov 4, 2022

@randomwalker One challenge will be shifting how we imagine the relationship between the platform/instance and the user. Agendas for the design and mgmt of commercial platforms serve business strategies. Users are used to acquiescing to these agendas. Open source social media communities will have to settle on new agendas and goals and this is likely to be a messy process, because discussions often revolve around individual tastes and preferences and not collective goals.

Nathan TeBlunthuis Nov 4, 2022

@randomwalker Possible collective goals could be building the visibility and reputation of community members, disseminating important ideas, coordinating and mobilizing a social movement, hanging out and messing around. These goals don't have to be compatible, but we need to design algorithms and feeds to serve some such goals.

Taylor Beauvais Nov 5, 2022

@groceryheist @randomwalker
This "agenda" problem is central to all info dissemination/ aggregation #algorithms. Sorting and organizing #misinformation and conspiracy theories becomes too easy if a model goal is a healthy community. That's not a tech problem that's a psychosocial one. Healthy technologically mediated communities aren't profitable ones.

Nathan TeBlunthuis Nov 5, 2022

@taylorbeauvais

Do you mind elaborating on why you think "sorting and organizing misinfo if a model goal is a healthy community"?

Taylor Beauvais Nov 5, 2022

@groceryheist
It's a clickbait problem. Conspiracy theories are more "sticky" as ideas than boring truth. Salacious #misinformation gets lots of engagement because it's not designed to be factual, it's designed to be attention grabbing.

Nathan TeBlunthuis Nov 5, 2022

@taylorbeauvais Yeah, so a lot of the known challenges with moderation and information quality seem easier when growth, engagement, and ad revenue aren't design goals, but these were the goals of the social media environments we are accustomed to. Communities on different platforms with different goals are likely to face different versions of governance problems. For example, social movement organizations, Wikipedia, and open source projects all govern speech, quality, and conflict.

Brian C. Keegan, Ph.D. Nov 5, 2022

@groceryheist @taylorbeauvais Agreed:

Mark Riedl Nov 4, 2022

@randomwalker there are instances that have implemented quote retoots (or boosts or whatever quote tweets might get called here), so I think as long as the instance members are cool with it

@riedl @randomwalker sure, but this post is why they weren't added in the first place..

Matt Krause Nov 5, 2022

@goabiaryan @riedl @randomwalker

I can sorta see that, but I also did like having a way to add context to a recommendation. A lot of science-y QTs added background information, like explaining why a thread might be interesting to a broader audience.

@prokraustinator @riedl @randomwalker yeah you can copy link for that and make ur own tweet/toot/twoot/post (whatever its called 🤣) .. ur audience can see it but theirs doesn't need to

Barry Devereux Nov 6, 2022

@riedl @randomwalker

It could be user-configurable, i.e.

◻️ Allow my posts to be quote-boosted
◻️ Don't allow my posts to be quote-boosted
◻️ Let me decide for each post

While quote-tweeting is sometimes toxic it also often drives discussion/engagement and it doesn't seem optimal to not have the functionality in some form.

Leena Nov 4, 2022

@randomwalker #transparent community based ranking - interesting :)

le Pétomane Ancien Nov 20, 2022

@leenamurgai @randomwalker Isn't that more or less the original reddit model? If so, I think there are some lessons to be learned.

Leena Nov 5, 2022

@randomwalker Mmm,... it depends on what you mean by scale. One thing I think is any algorithm that amplifies by rank is going to mean that people who just have less to say will be heard even less? Is that compatible with healthy discourse?

🚲 Nov 5, 2022

@randomwalker chronological feed worked fine on Twitter 🤷‍♀️

josh buermann Nov 20, 2022

@dx @randomwalker Still works fine on Twitter, where the algo has never stopped trying to send me down a right wing rabbit hole any time I turn it back on.

Dan Nov 5, 2022

@randomwalker why do you think signal to noise will rise as usage does if you're choosing who to follow?

Leena Nov 5, 2022

@mil @randomwalker Does noise to signal ratio improvement scale without personalisation? I think scale suggests bigger is better, maybe not?

Mastodon allows some #control around its features at the #community level like maximum message length I hear. I often wondered if the length of the messages on twitter was a problem. Lots of the best stuff I read was in long and annoyingly formatted threads.

Dan Nov 5, 2022

@leenamurgai @randomwalker yes, I think that's a really interesting point. the user experience of reading and interacting is such a fundamental part of the experience that's entirely separate from algorithmic content surfacing. a chronological timeline can still be rich and conversational if the experience is well designed

Erik Moeller Nov 5, 2022

It's a free world :). There are interesting forks and implementations with different behaviors, such as Hometown, which emphasizes local posting:

https://github.com/hometown-fork/hometown

Personally, I'd love the ability to temporarily turn the volume down for some frequent posters, or to have their stuff collapsed when the post velocity gets very high - that'd be enough for me at least at current scale.

GitHub - hometown-fork/hometown: A supported fork of Mastodon that provides local posting and a wider range of content types.

Uncomics Nov 5, 2022

@eloquence @randomwalker For that you can mute people temporarily, though.

Sean Blakey Nov 5, 2022

@randomwalker Good question. On Twitter, I tended to follow a LOT, gave up on seeing everything in the timeline, relied on the algorithmic timeline, and aggressively used filters.

I'm not sure how to adapt that approach here, or even if I should.

Sean Blakey Nov 5, 2022

@randomwalker I'm tempted to relax for a while, get a better feel for how this will work, but eventually I may need to find/build a hackable client that has some sort of configurable tuning parameters for boosting and pruning a timeline view.

Sean Blakey Nov 5, 2022

@randomwalker I'm imagining an alt-timeline view that amounts to "show me toots from my home timeline (or local server or other?) over the last $TIME_INTERVAL, ranked by $CRITERIA", where the criteria may be evaluated over local context (boosts, keyword/hashtag-assigned bonuses or penalties, etc.).

Unsure if I would need deeper context (e.g. level of prior engagement with authors of those toots).
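The alt-timeline view described above could look roughly like the following, using only local context (boost counts and hashtag bonuses or penalties). The post fields (`created_at`, `boosts`, `tags`), the function name `rank_window`, and the weights are assumptions for illustration; they are not taken from the Mastodon API.

```python
from datetime import datetime, timedelta, timezone


def rank_window(posts, now, window, tag_bonus=None, boost_weight=1.0):
    """Rank posts from the last `window` by boosts plus per-hashtag bonuses.

    posts:     list of dicts with 'created_at' (aware datetime),
               'boosts' (int) and 'tags' (list of str).
    window:    timedelta playing the role of $TIME_INTERVAL.
    tag_bonus: dict mapping a hashtag to a bonus (positive) or
               penalty (negative); this stands in for $CRITERIA.
    """
    tag_bonus = tag_bonus or {}
    cutoff = now - window
    recent = [p for p in posts if p["created_at"] >= cutoff]

    def score(post):
        tags = sum(tag_bonus.get(t, 0.0) for t in post["tags"])
        return boost_weight * post["boosts"] + tags

    # sorted() is stable, so equal scores keep their incoming order.
    return sorted(recent, key=score, reverse=True)


now = datetime(2022, 11, 5, 12, 0, tzinfo=timezone.utc)
timeline = [
    {"id": "A", "created_at": now - timedelta(hours=1), "boosts": 2, "tags": ["python"]},
    {"id": "B", "created_at": now - timedelta(hours=2), "boosts": 5, "tags": ["sports"]},
    {"id": "C", "created_at": now - timedelta(hours=30), "boosts": 100, "tags": []},
]
ranked = rank_window(timeline, now, timedelta(hours=24),
                     tag_bonus={"python": 10.0, "sports": -10.0})
```

Nothing here needs prior-engagement history, which matches the "local context" framing; adding author-level signals would mean storing more state per user.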

Denny Vrandečić Nov 7, 2022

@seanb @randomwalker the most difficult question is whether you want to store every post as seen or not

Sean Blakey Nov 7, 2022

@vrandecic @randomwalker Fuck it, let's just aggregate toots into a local Maildir and let notmuch sort it out.

Denny Vrandečić Nov 7, 2022

@seanb @randomwalker :D

Sven Slootweg Nov 5, 2022

@randomwalker Half the point of the fediverse is that it's not *supposed* to scale. It focuses on small-scale community building, with people you actually want to be with. The total usercount doesn't really change that, because your view into the network is (deliberately) very limited.

Sven Slootweg Nov 5, 2022

@randomwalker (In other words: the discovery mechanism is your social graph, and there's not *supposed* to be some magical global discovery mechanism)

Sven Slootweg Nov 5, 2022

@randomwalker Also, more generally: please spend some time in the fediverse and get to know the local culture first, before suggesting sweeping changes that directly go against what people are here for...

If people here wanted things to work like on Twitter, we'd be on Twitter

orko Nov 5, 2022

@randomwalker do we really think endless content delivery to each user is good? Fine with sorting but my Mastodon experience has been surprisingly not bad so far

SLaparle Nov 5, 2022

@randomwalker But wouldn't the loudest & most frequent speakers then control the WHOLE conversation? If someone is on here all the time, always tooting and always ranking, they would have significantly more impact on what everyone's feed looks like than the person that comes on once a day or so.

Just create some lists with your favorite posters. Mute over-posters. There are plenty of ways to customize your individual feed's signal-to-noise ratio without pushing algorithms on everyone.

Darnell Clayton

@randomwalker Algorithmic timelines are only evil when it’s the only choice or the default choice. People should be able to choose which one they prefer.

Both Twitter & Tumblr give me the option to choose between an algorithmic timeline or a chronological one.

Tumblr goes several steps further by suggesting sites based on their recommendations, hashtags I follow, blogs I follow, trending & popular reblogs.

But both offer a simple chronological feed which is awesome.

Tim Panton Nov 20, 2022

@darnell @randomwalker I think you both might be missing the fact that the other folks on your server act as moderators for your local and federated feeds.

So if you join the right instance, 'local' becomes a human curated equivalent of the algorithmic feed.

That starts to break if everyone just uses the AI.

Darnell Clayton

@steely_glint @randomwalker I am the moderator for my server so that is not a problem. 😉
Algorithmic feeds are useful, but without an option for a reverse chronological feed they are in my honest opinion evil.
We have become too reliant on artificial intelligence to tell us what is interesting, instead of using the brain God gave us to figure that out for ourselves.

Atz Nov 20, 2022

@darnell @steely_glint @randomwalker The thing about OSS is having the choice.

I like to follow lots of people with diverse views. A reverse chronological algorithm means people who post infrequently will likely be lost in the noise. I wouldn't mind an algorithm that added some weight to their posts so I see them.

Andrew Hundt Nov 5, 2022

@randomwalker I think there are tight limits to how much #algorithm #choice will help because the vast majority of people rarely configure or change any settings on the apps they use, never mind algorithms. In fact a choice approach could become a way to individualize the problem and skirt #accountability.

Perhaps the algorithms aren’t the enemy, but in that case, we need to address the people who choose to engineer and deploy default algorithms implementing harmful but profitable #incentives.

Philipp Leitner Nov 5, 2022

@randomwalker it's not clear to me that there really is a scale problem with chronological timelines. At the end of the day an overwhelmed user can always choose to follow less people. That said, I can see that some ranked post view of a user's direct neighborhood may be an attractive addition / counterpart to the current options.