So... There seems to be a common misconception where people assume all of Mastodon has 180,000 users because mastodon.social says it's home to that number of users.

The real number is closer to 1.5 million. Those 180k are only mastodon.social users and there are like... 3,000 other servers?

I wonder what I should do. 🤔

@Gargron pull account number?

Add?

@thegibson Pull from where tho. Some central aggregating service, one of the few that exist right now. What if it's down? What if it's being slow?

Using total row count of accounts table might be more reliable and more in spirit.
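A minimal sketch of the row-count idea, using an in-memory SQLite database as a stand-in. The two-column `accounts` schema and the convention that local accounts have a NULL `domain` are simplifying assumptions made for illustration, not Mastodon's actual schema. The point is only that a server already stores rows for remote accounts it knows about, so counting all rows gives that server's view of the wider network.

```python
import sqlite3

def count_accounts(conn):
    """Return (all accounts this server knows about, local accounts only)."""
    total = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
    # Assumption: local accounts are stored with a NULL domain.
    local = conn.execute(
        "SELECT COUNT(*) FROM accounts WHERE domain IS NULL"
    ).fetchone()[0]
    return total, local

# Hypothetical, simplified table used only for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (username TEXT, domain TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("alice", None), ("bob", None), ("carol", "example.social")],
)

total, local = count_accounts(conn)
print(f"{total} known accounts, {local} of them local")
```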

@Gargron @thegibson Maybe have instances whisper their user counts, and the user counts they know about, to one another once a day or so? Two degrees of separation ought to be enough for m.s to estimate a rough head count, no?

Smaller instances, meanwhile, would see a much smaller observable aggregate count, but that'd be interesting too.
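The whisper idea could be sketched roughly as follows. The instance names, counts, and neighbor lists are made up, and the `whisper` function is hypothetical rather than anything Mastodon implements. Each round, every instance passes everything it currently knows to its direct neighbors. Counts are keyed by domain, so hearing about the same instance over two routes does not double-count it.

```python
def whisper(own_counts, neighbors, rounds=2):
    """Simulate `rounds` of gossip and return each instance's observed total."""
    # Every instance starts out knowing only its own user count.
    known = {domain: {domain: count} for domain, count in own_counts.items()}
    for _ in range(rounds):
        # Snapshot first, so information moves exactly one hop per round.
        snapshot = {domain: dict(view) for domain, view in known.items()}
        for domain, peers in neighbors.items():
            for peer in peers:
                known[peer].update(snapshot[domain])
    return {domain: sum(view.values()) for domain, view in known.items()}

# Hypothetical chain of instances: big - mid - small - tiny
own_counts = {"big": 180_000, "mid": 5_000, "small": 200, "tiny": 10}
neighbors = {
    "big": ["mid"],
    "mid": ["big", "small"],
    "small": ["mid", "tiny"],
    "tiny": ["small"],
}

totals = whisper(own_counts, neighbors)
print(totals)
```

In this toy topology "big" never hears about "tiny", and "tiny" never hears about "big", which is the smaller observable aggregate described above.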

@beadsland @thegibson It could be done... The question is, *should* it be done that way. Hmph.

@Gargron @beadsland @thegibson Nice way to produce an unmanageable mess.

@saper @Gargron @thegibson It would avoid reliance on a single centralized service.

Not any more unmanageable than a game of whisper-down-the-alley. (Hence my use of the term.) Thus, it's not going to provide a painstakingly accurate sum, certainly not as observed from less connected instances, but when you're dealing with a ~1.5 million horizon, the long tail can fall off without losing significant digits.

"Here's my count, and counts of every server that polled me for toots today" hardly implies mess.

@saper @Gargron @thegibson Alternatively, a store-and-forward network a la Usenet could be used to share small instance-reported statistical packets, without the worry of losing the long tail.

This would eliminate the need for any given instance to reconcile the counts coming in over alternate routes, as the Usenet-esque messages would self-reconcile on GUID. The trade-off is the storage consumed by holding those stats-packet messages for forwarding.
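A rough sketch of the store-and-forward variant, assuming a hypothetical `StatsNode` class and packet layout. Each packet carries a GUID; a node stores a packet the first time it sees it, forwards it to its peers, and silently drops repeats. That drop is what reconciles copies arriving over alternate routes, and it also keeps the flood from looping.

```python
import uuid

class StatsNode:
    """Hypothetical instance that floods small stats packets, deduplicated by GUID."""

    def __init__(self, domain):
        self.domain = domain
        self.peers = []
        self.seen = {}  # guid -> packet; this is the storage trade-off

    def publish(self, user_count):
        packet = {
            "guid": str(uuid.uuid4()),
            "domain": self.domain,
            "user_count": user_count,
        }
        self.receive(packet)
        return packet

    def receive(self, packet):
        if packet["guid"] in self.seen:
            return  # already stored and forwarded once; drop the duplicate
        self.seen[packet["guid"]] = packet
        for peer in self.peers:
            peer.receive(packet)

    def total(self):
        # Later-arriving packets from the same domain overwrite earlier ones.
        latest = {}
        for packet in self.seen.values():
            latest[packet["domain"]] = packet["user_count"]
        return sum(latest.values())

# Three hypothetical instances, fully connected, so every packet has two routes.
a, b, c = StatsNode("a.example"), StatsNode("b.example"), StatsNode("c.example")
a.peers, b.peers, c.peers = [b, c], [a, c], [a, b]

a.publish(100)
b.publish(20)
c.publish(3)
print(a.total(), b.total(), c.total())
```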

@beadsland @Gargron @thegibson oh yes and we should add spanning tree to prevent loops!

@saper @beadsland @thegibson No need for recursion; you can just fetch stuff from each known instance in a flat manner.

I don't like the approach for another reason: managing these counts and fetching this data is like a side quest to Mastodon's core functionality, and I don't feel good about mixing them in one codebase.
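The flat fetch mentioned above might look something like this sketch. `aggregate_user_counts` and the `fetch_count` callable are hypothetical names. A real fetcher could, for example, read a user count from each instance's public instance-information endpoint, but the sketch uses a fake one so it runs offline. Instances that are down or slow are skipped and reported rather than blocking the total.

```python
def aggregate_user_counts(domains, fetch_count):
    """Sum user counts over a flat list of known instances.

    `fetch_count` is any callable that takes a domain and returns an int,
    raising an exception if the instance cannot be reached.
    """
    total = 0
    failed = []
    for domain in domains:
        try:
            total += fetch_count(domain)
        except Exception:
            failed.append(domain)  # down or slow: skip it, but remember it
    return total, failed

# Fake fetcher standing in for a network call (made-up data).
fake_counts = {"a.example": 100, "b.example": 50}

def fake_fetch(domain):
    return fake_counts[domain]  # raises KeyError for an unreachable instance

total, failed = aggregate_user_counts(
    ["a.example", "b.example", "down.example"], fake_fetch
)
print(total, failed)
```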

@Gargron
Maybe it's a more general admin tool that should be part of Tootsuite, but not part of Mastodon.

@thegibson @beadsland @saper

@Gargron @saper @thegibson Ah, well, if you have a known list of instances, then that works. I'm working off the assumption that there isn't a master list, so the only way to glean that information is via federated neighbors.

I'm also working off the assumption that you don't want a central point of failure. Hence suggesting alternatives that don't rely on polling from (or pushing to) a single point that could be offline or slow, as you note.

@saper @beadsland @Gargron

I think a simple count of the lines in the DB would suffice...

No need to get complex for something so simple.

@thegibson @saper @Gargron Right, but the question is how to obtain the counts from enough of the 3,000ish servers to produce an aggregate total.

It was suggested that having that done from/to a single point (either polling or pushing) could be problematic. Hence my spitballing.

@beadsland @thegibson @saper Obtaining counts was just one of the suggested solutions... Improving the wording on existing stats, moving/removing the stats, adding some kind of link, adding stats to joinmastodon.org instead and hoping that solves the issue: there are many alternative ways.

@Gargron @thegibson @saper Well, if you really are dealing with numbers near 1.5 million, that number won't need to be updated all that often. (Unless growth is really pronounced.)

Easy enough to just have an old McDonald's style banner proclaiming "1.5 Million Accounts Federated" and update the static number therein by hand as needed.

@beadsland @thegibson @Gargron I think multicast subscription and routing for hashtag messages is a much more interesting issue.

@saper @beadsland @thegibson Hashtags are probably solved by the relays, depending on how sophisticated a solution you're thinking of. Relays are a brute force approach.

@Gargron @beadsland @thegibson There are interesting lessons to be learned from the evolution of IP multicast routing protocols at the IETF; I hope to find some pointers.

@saper @Gargron @thegibson This is why I suggested only two degrees of separation. Anything beyond that becomes computationally expensive.

Granted, two degrees of separation only works if there's enough interconnectivity across the fediverse as seen from m.s for that to be representative. If there's another sizeable Mastodon cluster that can only be reached from three or more degrees of follow linkages, the whisper approach would fail.
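The failure case can be illustrated with a small depth-limited breadth-first search. The graph below is made up, and `reachable_within` is a hypothetical helper: it returns every instance within `max_hops` follow linkages of a starting instance, so anything three or more hops away simply never appears.

```python
from collections import deque

def reachable_within(graph, start, max_hops=2):
    """Return the set of nodes at most `max_hops` edges away from `start`."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return set(hops)

# Hypothetical follow graph: "far" sits three hops away from "m.s".
graph = {
    "m.s": ["a", "b"],
    "a": ["c"],
    "c": ["far"],
    "far": ["far2"],
}

visible = reachable_within(graph, "m.s")
print(sorted(visible))
```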

@beadsland @Gargron @thegibson I am quite happy with the status quo; there are more pressing issues to work on, I think.

@saper @beadsland @thegibson Right, I am more interested in cosmetic improvements to the way this information is conveyed, rather than solutions that require new standards etc.

@beadsland @saper @Gargron @thegibson Brian Reid of DEC used to produce Usenet usage reports in the late 1980s / early 1990s.