With Bluesky (mostly) going down for a few hours today, I got to wondering how decentralized the fediverse really is in terms of where its servers are hosted. I grabbed a server list from FediDB and looked up network information for each server on ipinfo.io.

[EDIT: I did a better analysis on a dataset of 10x as many servers, see https://discuss.systems/@ricci/114400324446169152 ]

These stats are by the number of *servers*, not the number of *users* (maybe I'll run those stats later).

FediDB currently tracks 2,650 servers of various types (Mastodon, Pixelfed, Lemmy, Misskey, PeerTube, etc.).

The fediverse is most vulnerable to disruptions at Cloudflare: 24% of fediverse servers are behind it. Note that this also means I don't have real data on where this 24% are located or hosted, since Cloudflare obscures this by design.

Beyond Cloudflare, the fediverse is not too concentrated on any one network. The most popular host, Hetzner, only hosts 14% of fediverse servers, and it falls off fast from there.

Here are the top networks where fediverse servers are hosted:

504 Cloudflare, Inc.
356 Hetzner Online GmbH
130 DigitalOcean, LLC
114 OVH SAS
56 netcup GmbH
55 Amazon.com, Inc.
55 Akamai Connected Cloud
36 Contabo GmbH
33 SAKURA Internet Inc.
32 The Constant Company, LLC
31 Xserver Inc.
28 SCALEWAY S.A.S.
24 Google LLC
23 Oracle Corporation
16 GMO Internet Group, Inc.
14 IONOS SE
14 FranTech Solutions
11 Hostinger International Limited
10 Nubes, LLC

Half of fediverse servers are on networks that host 50 or fewer servers - that's pretty good for resiliency.
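
For anyone who wants to compute this kind of figure themselves, here is a minimal sketch. It assumes you already have one network (organization) name per server, e.g. from an IP lookup; the function name and the sample data are made up for illustration and are not the real dataset.

```python
from collections import Counter


def share_on_small_networks(server_orgs, max_servers):
    """Fraction of servers hosted on networks with at most max_servers servers.

    server_orgs holds one network/organization name per server.
    """
    if not server_orgs:
        return 0.0
    per_network = Counter(server_orgs)
    on_small = sum(n for n in per_network.values() if n <= max_servers)
    return on_small / len(server_orgs)


# Made-up sample: 10 servers spread over 4 hypothetical networks.
sample_orgs = ["BigCDN"] * 5 + ["BigHost"] * 3 + ["TinyVPS", "HomeLab"]

# Networks ranked by number of servers, like the list above.
for org, count in Counter(sample_orgs).most_common():
    print(count, org)

# Share of servers on networks that host 3 or fewer servers.
print(share_on_small_networks(sample_orgs, max_servers=3))
```

The same function works for any grouping key, so passing BGP prefixes instead of organization names gives the per-prefix version of the statistic.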

There is even more diversity when it comes to BGP prefixes, which is good for resiliency: for example, the cloud providers that have multiple availability zones will generally have them on different prefixes, so this gets closer to giving us a picture of the specific bits of infrastructure the fediverse relies on.

The top BGP prefixes:

55 104.21.48.0/20
50 104.21.16.0/20
48 104.21.64.0/20
41 104.21.32.0/20
41 104.21.0.0/20
38 104.21.80.0/20
32 172.67.128.0/20
31 172.67.144.0/20
28 172.67.208.0/20
28 162.43.0.0/17
27 104.26.0.0/20
26 172.67.192.0/20
26 172.67.176.0/20
23 172.67.160.0/20
19 116.203.0.0/16
17 172.67.64.0/20
17 159.69.0.0/16
16 65.109.0.0/16
14 88.99.0.0/16
14 49.13.0.0/16
13 78.46.0.0/15
13 167.235.0.0/16
13 138.201.0.0/16
11 95.217.0.0/16
11 95.216.0.0/16
11 49.12.0.0/16
11 135.181.0.0/16
10 37.27.0.0/16
10 157.90.0.0/16

75% of fediverse servers are behind BGP prefixes that host 10 or fewer servers, meaning that the fediverse is *very* resilient to large network outages.
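
A rough sketch of how server IPs could be bucketed by prefix using a longest-prefix match, which is how routers pick among overlapping announcements. This is not necessarily how the numbers above were produced (a lookup service may simply return the prefix). Three of the prefixes below are copied from the list above; the broader 104.16.0.0/12 entry and all of the sample IPs are made up purely to show the matching.

```python
import ipaddress
from collections import Counter

# Three prefixes from the list above, plus one made-up covering prefix
# (104.16.0.0/12) so that the longest-match step has something to do.
PREFIXES = [
    ipaddress.ip_network(p)
    for p in ("104.21.48.0/20", "78.46.0.0/15", "116.203.0.0/16", "104.16.0.0/12")
]


def longest_prefix(ip, prefixes):
    """Return the most specific prefix containing ip, or None if none match."""
    addr = ipaddress.ip_address(ip)
    matches = [p for p in prefixes if addr in p]
    return max(matches, key=lambda p: p.prefixlen, default=None)


# Made-up server addresses; 203.0.113.7 is in a documentation-only range.
sample_ips = ["104.21.50.1", "104.21.60.9", "78.47.1.2", "203.0.113.7"]

prefix_counts = Counter(
    str(longest_prefix(ip, PREFIXES)) for ip in sample_ips
)
for prefix, count in prefix_counts.most_common():
    print(count, prefix)
```

A linear scan like this is fine for a few thousand servers; a full routing table would call for a trie or a dedicated lookup library.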

Top countries where fediverse servers are hosted:

871 United States
439 Germany
156 France
148 Japan
75 Finland
57 Canada
49 Netherlands
38 United Kingdom
26 Switzerland
26 South Korea
21 Spain
19 Sweden
18 Austria
17 Australia
15 Russia
12 Czech Republic
10 Singapore
10 Italy

And finally, a map of the locations of fediverse servers:
https://ipinfo.io/tools/map/91960023-e8c6-4bee-9b07-721f2c8febab

One thing that's interesting to me in this data is that there is *much* more consolidation on a few cloud providers in Europe than there is in the US, which is somewhat concerning. 81% of fediverse servers in Germany are at Hetzner, and 73% of the servers in France are on OVH. By comparison, the most popular American provider, DigitalOcean, hosts only 14% of servers in the US. I don't know how representative this is of cloud usage overall (e.g. DigitalOcean is definitely not the top cloud in the US), but it does certainly suggest much more centralization in Europe than in the US.
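
The per-country comparison boils down to finding the largest provider in each country and its share of that country's servers. A minimal sketch, assuming one (country, provider) pair per server; the function name, country codes, and provider names are placeholders, not the real data.

```python
from collections import Counter, defaultdict


def top_provider_share(records):
    """Map each country to (largest provider, its share of that country's servers).

    records is an iterable of (country, provider) pairs, one per server.
    """
    by_country = defaultdict(Counter)
    for country, provider in records:
        by_country[country][provider] += 1

    result = {}
    for country, counts in by_country.items():
        provider, n = counts.most_common(1)[0]
        result[country] = (provider, n / sum(counts.values()))
    return result


# Made-up sample: one concentrated country, one more spread out.
sample_records = (
    [("DE", "HostA")] * 4
    + [("DE", "HostB")]
    + [("US", "HostC"), ("US", "HostD")]
    + [("US", "HostE")] * 2
)

for country, (provider, share) in top_provider_share(sample_records).items():
    print(f"{country}: {provider} hosts {share:.0%} of servers")
```

Servers behind a proxy that hides the real host would need to be dropped or counted separately before a comparison like this means much.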
@ricci it's worth noting that the fedidb dataset categorically excludes gotosocial due to political differences
@ricci since gotosocial is specifically designed for the needs of small and individual-user servers, that may skew the data
@ireneista yeah I think the other dataset ended up being better - though it seems not useful for things relating to user counts, as there are many instances listed there with user counts that seem highly improbable
@ricci ah yep, makes sense
@ricci there is not consensus that data collection of this kind is a good thing for society, and given that it relies in part on self-reporting by servers, it's not surprising that there are quality issues. still, the analysis is good!
@ireneista yeah and I have no problem with instances and users who don't wish to be analyzed, that's something we should all have a right to