@thibaultmol @jwildeboer Yeah, I would be very interested in doing this for matrix but so far I have not found a dataset that would enable me to do so - and I don't know matrix well enough myself to try to collect one.
I am also interested in doing this for email, but that is thornier - in part because the data is valuable to marketers and therefore the sources I've found that *might* work are expensive, and I'd have to pay to even find out for sure.
@kingmoth @jwildeboer @ricci I don't like federated systems whose apps (at least the ones on F-Droid) recommend a single server; it tends to cause this effect.
That being said, if a dev wants to add a server recommendation system so that new users aren't burdened with having to choose a server, it could pick a general server at random from a list that excludes both too-popular servers (to avoid the centralisation issue) and too-small ones. And what if server owners could opt in to such a list?
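To make the idea concrete, here is a rough sketch of that kind of picker. Everything in it is an assumption for illustration only: the `Server` fields, the `min_users` / `max_users` thresholds, and the example domains are made up, and no existing app is known to work this way.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Server:
    domain: str          # hypothetical example domain
    active_users: int    # e.g. self-reported monthly active users
    opted_in: bool = True  # owner has asked to be on the recommendation list


def recommend_server(
    servers: Sequence[Server],
    min_users: int = 100,      # assumed cutoff for "too small"
    max_users: int = 10_000,   # assumed cutoff for "too popular"
    rng: Optional[random.Random] = None,
) -> Optional[Server]:
    """Pick one opted-in server uniformly at random from the mid-sized ones."""
    rng = rng or random.Random()
    candidates = [
        s for s in servers
        if s.opted_in and min_users <= s.active_users <= max_users
    ]
    # No suitable server: let the caller fall back to asking the user.
    return rng.choice(candidates) if candidates else None


servers = [
    Server("huge.example", 900_000),
    Server("mid-one.example", 2_500),
    Server("mid-two.example", 800),
    Server("tiny.example", 3),
    Server("mid-but-private.example", 1_200, opted_in=False),
]
pick = recommend_server(servers)
```

Picking uniformly among the mid-sized, opted-in servers is the simplest choice; weighting the draw towards smaller servers would be another option.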
@Commander1024 what's your source for that? https://fedidb.com/stats suggests Hetzner is responsible for about 9.3% of the fediverse.
I would say AUTOMATTIC is overrepresented because WordPress.com blogs aren't as important for the fediverse as their number would suggest, but it would still be nowhere near 70% even if you just removed AUTOMATTIC. DigitalOcean seems to be ahead of Hetzner too.
@Commander1024 @jwildeboer @ricci
Hetzner has 9% of all instances and 36% of all accounts.
ref: https://fedi.wrm.sr/
@Flo_Rian @jwildeboer That's an interesting suggestion, and I'm divided on it. On the one hand, I don't think it's 100% clear that the breakpoints used in antitrust are necessarily the right ones to use in this context, so staying fuzzy on the actual gauge might avoid suggesting sharp cutoffs where they don't really exist. (I did feel the need to include them in the text to give *some* context.)
On the other hand, I do think it would be good to show that there is still a benchmark for the fediverse to hit, that it can still get better.
@ricci
I just had a hard time reading the graphs at first. Gauges (in reality) usually have a scale or zones, so we expect to be able to tell if we are in a certain range or hit "the red zone". In this case, one seems to have zones (green area, with the needle just on the edge), while the other one is all red.
To avoid sharp cutoffs, it could also be one large gradient. Alternatively, not using a gauge graph would be an option, maybe a partially filled bar chart.
If Train AP is travelling to Destination D and left the station July 18, 2017, and Train AT is also travelling to Destination D and left the station February 22, 2024, is Train AT allowed to even call itself a train?
Calling AT out for not being decentralised enough yet distracts from the shared mission, IMO.
The AT Protocol has left the station. It could easily overtake ActivityPub on a parallel track. I think we have more to gain by focussing on the common good.
@jwildeboer I think it's important to note that this debate has alienated the entire ActivityPub community from one of the most successful digital communities for marginalised people we've seen in quite some time.
Blacksky is accomplishing tremendous things on AT, and AP has a lot to learn from that endeavour.
Rudy deserves better from the ActivityPub community.
Do I read this right that Bluesky is about 12x bigger? And I assume we're counting active users?
This is complicated.
What I'm counting is the PDSes that the bluesky relay reports knowing about, and how many repos (user data) they report having.
The number this returns is different from what you may see in other places. You'll see 38M for the number of total accounts. You'll see a few million listed for Monthly Active Users. My understanding is that what the relay reports is *some* measure of active users (it's the number of accounts the relay is currently maintaining data for) but it's not exactly MAU.
The Fediverse numbers are MAU, but it's also notorious that you can't get a fully accurate number for this, because not all instances opt in to such data collection, and those that opt out often do so by misreporting their numbers.
Thank you for explaining this to me. As might be evident from my profile, I'm not more than passingly literate, technically.
@jwildeboer i agree with the sentiment 100%, but to be clear Bluesky isn't any more or less decentralized than mastodon.social. Bluesky is a service that runs on atproto. Just as mastodon.social is a service that runs on ActivityPub.
atproto is theoretically decentralized in a similar way to ActivityPub, but there's just no one else of any size building on it. atproto is decentralized in theory, but not in practice. Hence the terrible graph you posted.
@jwildeboer I agree with that 100%. At this point all we can hope for is good interoperability between the protocols.
@xerz @jwildeboer Oops! Thanks for pointing this out!
Fixed
@can @soop @jwildeboer One of the interesting federating aspects of git is that it can be (and is :) ) in both places at once - I set it up on github because I figured that'd maximize the chance of getting pull requests, but I've now moved my primary copy to codeberg. I can accept PRs from either place and push to both, which is cool.
git is also a good cautionary tale, though, of the fact that just because the protocol itself is radically decentralized doesn't mean the deployments will inevitably be so. A lot of us are (rightly, imo) worried about the dominance of github even though we can take our repos anywhere we want with relative ease.
I don't really think PDSes are the ultimate measure of decentralisation, the benefits of decentralisation show at the appView layer more.
PDSes are just dumb data stores. Admins of them can't do much bar removing posts or controlling your identity.
@irelephant @jwildeboer In a recent interview with Rudy of blacksky, he described the PDS as the most complicated part (and they should know, they have re-implemented or forked basically all the parts now). Yeah this surprised me too.
The anecdotal information I've found on the appview itself suggests that it follows a similar distribution to the PDS, but with far fewer of them so far. I'm looking for a data source, though.
Above the appview layer, I do think feeds and labelers are likely to show interesting decentralization; labelers are probably possible to get actual usage data on, but AFAICT only bluesky can provide data about the use of feeds, and they don't (all that's available is the number of likes, which is not the same thing.)
bsky is a lot newer and has 10 times the users of fedi.
Atproto is decentralised, and if you tell me it isn't, please explain wafrn.
There is a difference between the protocol enabling decentralization and decentralization actually happening in practice. atproto is designed to be decentralized (well, with the exception of did:plc); the question of course naturally arises whether it is getting used that way or not. Clearly, *at the level of where user data is stored*, that is not currently how the vast majority of its deployment has been done in practice. This very well could change, and the protocol is (again, with the exception of did:plc) designed to facilitate that change. Will it change? I don't know, let's watch and see; that's what the site is for.
But I also want to say that being new and big is actually a potential impediment to atproto becoming more decentralized. Most systems *tend* to trend towards more centralization, and network effects *tend* to keep people in the places where lots of other people are. So, while atproto is *designed* to permit decentralization, that does not guarantee that it will *happen*, and starting as one big ecosystem controlled by a single entity makes it *less likely* that it will achieve meaningful decentralization. There are people trying, though - blacksky being the clearest example. Is this the start of an inflection point where more communities spring up and migrate off the centralized infrastructure? Maybe! Maybe not! Let's watch - that's what the site is for.
I was gonna say something smug about matrix, but actually matrix has the same problem.
yet everyone calls it fucking decentralized.
I mean, matrix proves that what you say can become a problem.
@hadronized @jwildeboer In this context, the question I'm asking is whether the *ecosystem* is decentralized with respect to where data is stored.
So, in this case, sourcehut is one participant, as are github, codeberg, gitlab, and all the forgejo instances that my data source (Software Heritage) knows about. The question being looked at here is whether user data (in the git world, repos) is concentrated on a few participants or spread out across many.
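For anyone who wants to play with the "concentrated or spread out" question themselves, here is a small sketch. It assumes the measure is something like the Herfindahl-Hirschman Index (the sum of squared percentage shares, as used in antitrust); that is an assumption about the method, and the host names and repo counts below are invented purely for illustration, not real data.

```python
from typing import Mapping


def shares(counts: Mapping[str, int]) -> dict[str, float]:
    """Each participant's share of the total, in percent."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("need at least one repo/account to compute shares")
    return {name: 100.0 * n / total for name, n in counts.items()}


def hhi(counts: Mapping[str, int]) -> float:
    """Herfindahl-Hirschman Index: sum of squared percentage shares.

    Ranges from near 0 (many tiny participants) to 10000 (one participant
    holds everything).
    """
    return sum(s * s for s in shares(counts).values())


# Made-up numbers: one dominant host versus an even spread over four hosts.
concentrated = {"host-a": 970, "host-b": 20, "host-c": 10}
spread_out = {"host-a": 250, "host-b": 250, "host-c": 250, "host-d": 250}

print(f"concentrated: {hhi(concentrated):.0f}")
print(f"spread out:   {hhi(spread_out):.0f}")
```

The same function works whether the counts are repos per forge, accounts per instance, or repos per PDS; only the meaning of "participant" changes.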