This seems very important and worth ongoing study:

“Once again, results suggest a rise in diversity as the 10 biggest server contribution to the Fediverse is reduced by more than 10%. So, even if the biggest servers are accumulating more users, it seems that the Fediverse is becoming more decentralized.”

@fediversereport @spreadmastodon @fediversenews

https://socialhub.activitypub.rocks/t/analysis-of-fediverse-diversity-in-terms-of-decentralization/3252

Analysis of Fediverse Diversity in terms of Decentralization

Hello! Following a previous analysis (!it’s in catalan language!) and recovering the interest on the Fediverse, I’ve extended my analysis focusing in software diversity in that case. The two time points analyzed are from September 2022 and May 2023. In the initial analysis above there is a mistake. The dataset used is obtained using script based on this code written by @spla. Before getting into the analysis itself, I want to state that the active users measure is somehow confusing. Some ser...

SocialHub
@tchambers @fediversereport @spreadmastodon @fediversenews Interesting! Although there are some quirks in the data, with joindiaspora and diasp.org (neither of which are Mastodon) in last March's accounts and not the current list, and with mastodon.cloud and gc2.jp going from over 10% of MAU to not appearing at all in the latest statistics.
@jdp23 @tchambers @fediversereport @spreadmastodon @[email protected]

Also, it'd be nice to know something about what comprises all the "others", how many accounts do those instances have, how many of them are there?

Otherwise, it'll be interesting to track this going forward because mastodon.social right now is growing faster than it did between March and 17 May ... the picture could very well look different when comparing May to August.
@maegul @tchambers @fediversereport @jdp23 @fediversenews @spreadmastodon

>Also, it’d be nice to know something about what comprises all the “others”, how many accounts do those instances have, how many of them are there?

Others means all the rest! Which means 21089 in May (as shown in the first table).

>Otherwise, it’ll be interesting to track this going forward because mastodon.social right now is growing faster than it did between March and 17 May … the picture could very well look different when comparing May to August.

Totally true! I would like to take monthly snapshots (with the help of @spla, who is the author of the API query script).
@[email protected] @tchambers @fediversereport @jdp23 @[email protected] @spreadmastodon @spla

Thanks for the reply!

Any chance others can get their hands on the data set?
@[email protected] @tchambers @fediversereport @jdp23 @[email protected] @spreadmastodon @spla also, another question … any insights on how your data set and its creation would differ from any of the others out there like fedidb or instances.social?
@maegul @tchambers @fediversereport @jdp23 @spla @fediversenews @spreadmastodon Hey! That is interesting... I hadn't thought of using fedidb (the other one I didn't know). The truth is that @spla collected the data on their own and, as I had the chance to look at it, I performed the analysis.

It will be interesting to do the analysis with the fedidb dataset. From what I see right now, it seems to differ from the dataset I used. I can see an increase in servers in Oct 22 that results in a decrease in users per server, and then it stays more or less stable.

I would like to apply the Shannon and Simpson indices and the top-10 server distribution, as they give a broader view of diversity.
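
For readers unfamiliar with these ecology measures, here is a minimal sketch of how they could be computed from a list of per-server user counts. It assumes the common textbook definitions (Shannon index as -Σ p·ln p, Simpson index in its Gini-Simpson form 1 - Σ p²); the linked analysis may use different variants. The function names and the sample counts are hypothetical, not taken from the dataset discussed here.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over servers with at least one user."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_index(counts):
    """Gini-Simpson diversity 1 - sum(p^2); values near 1 mean users are spread out."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def top_n_share(counts, n=10):
    """Fraction of all users hosted on the n largest servers."""
    total = sum(counts)
    return sum(sorted(counts, reverse=True)[:n]) / total


# Hypothetical user counts per server, for illustration only.
users_per_server = [1000, 500, 250, 250]
print(shannon_index(users_per_server))
print(simpson_index(users_per_server))
print(top_n_share(users_per_server, n=1))
```

Higher Shannon and Simpson values together with a lower top-10 share would point in the same direction as the quoted conclusion, that is, towards more decentralization.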
@maegul @fediversenews @fediversereport @jdp23 @spla @spreadmastodon @tchambers I am playing with fedidb API and I think I could get all the data I need (first time using APIs myself!).
@[email protected] @[email protected] @fediversereport @jdp23 @spla @spreadmastodon @tchambers

Yea the API works well ... I've used it myself. Last time I used it though I think there was an issue in the data not having many of the small (1-10 user) instances. But from the dashboard that seems to have been fixed now.
@maegul @tchambers @fediversereport @jdp23 @spla @fediversenews @spreadmastodon When I find some time, I will try to recover this global data. I find it particularly interesting to apply ecological measures of diversity to user distribution and software distribution.
@[email protected] @tchambers @fediversereport @jdp23 @spla @[email protected] @spreadmastodon

What do you mean by "global data" ... what are you intending to recover?
@maegul @tchambers @fediversereport @jdp23 @spla @fediversenews @spreadmastodon The same as spla did: software, user and MAU data from all servers included in fedidb.

@[email protected] @tchambers @fediversereport @jdp23 @spla @[email protected] @spreadmastodon

Ahh ... right, collect the data yourself.

It does strike me though that it's the sort of task that we'd be better off doing more collectively. We could pool the algorithms to get the best one and collect multiple datasets from multiple origins to maximise coverage which can then be merged.
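
As a rough illustration of the merging idea, here is a minimal sketch that combines several server lists keyed by domain. The field names (`domain`, `users`, `mau`, `software`) and the rule that later datasets override earlier ones are assumptions made for the example; real sources such as fedidb or instances.social may name and structure their fields differently.

```python
def merge_server_records(*datasets):
    """Merge lists of server dicts into one dict keyed by normalized domain.

    Later datasets overwrite earlier ones field by field, but a missing
    (None) value never overwrites a value that is already known.
    """
    merged = {}
    for dataset in datasets:
        for record in dataset:
            # Normalize the domain so "Example.social" and "example.social" match.
            domain = record["domain"].strip().lower()
            known = merged.setdefault(domain, {})
            known.update({k: v for k, v in record.items() if v is not None})
            known["domain"] = domain
    return merged


# Hypothetical records from two different sources.
source_a = [{"domain": "Example.social", "users": 10, "mau": None}]
source_b = [{"domain": "example.social", "mau": 4, "software": "mastodon"}]
print(merge_server_records(source_a, source_b))
```

Keying on the domain means a server seen by only one crawler still ends up in the merged set, which is where the coverage gain would come from.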

Also, just in case it's useful, here's my quickly hacked together python code for getting all the data from the fedidb API:

    # +
    from collections import deque
    import requests as req

    base_url = 'https://api.fedidb.org/v1/'
    servers_url = f'{base_url}servers'
    # -

    # +
    server_data = deque()
    params = {'limit': 40}
    n = 0

    while True:
        if n % 5 == 0:
            print('Loop', n, 'servers', n*40)
        r = req.get(servers_url, params=params)
        if r.status_code == 200:
            d = r.json()
            server_data.extend(d['data'])
            next_cursor = d['meta']['next_cursor']
            if next_cursor is None:
                print('Cursor is None ... FINISHED')
                break
            params['cursor'] = next_cursor
        else:
            print(f'request broke and returned {r.status_code}')
            break
        n += 1
    # -

    # +
    len(server_data)
    server_data[0]
    # -

@maegul @tchambers @fediversereport @jdp23 @spla @fediversenews @spreadmastodon Thanks! I'll take a look at that!

>We could pool the algorithms to get the best one and collect multiple datasets from multiple origins to maximise coverage which can then be merged.

That would be cool for sure, although I'm not sure I can add much on the technical side (I am a biologist with some data analysis skills).

@[email protected] @tchambers @fediversereport @jdp23 @spla @[email protected] @spreadmastodon

(I am a biologist with some data analysis skills)... me too!

I recently did some analysis of my own using data from instances.social, but didn't have any historical data to compare to. I'm intending to make a comparison after a month or two.

You can see my analysis here:
https://hachyderm.io/@maegul/110331433071884694

maegul (@maegul@hachyderm.io)

Attached: 1 image Graphs of the sizes of fediverse instances, how common they are, and where the most people are! 🧵 Data pulled from https://instances.social/ (by @[email protected]) and excludes pawoo and baraag as they're heavily blocked for good reasons (it seems) Breaking down instances by the number of users into bins (that are quasi human friendly logarithmic), we see that the majority (55%) have 2-50 users, ~33% have 1 user, and almost all instances have less than 5,000 users. @[email protected] 1/

Hachyderm.io