So... There seems to be a common misconception where people assume all of Mastodon has 180,000 users because mastodon.social says it's home to that number of users.

The real number is closer to 1.5 million. Those 180k are only mastodon.social users and there are like... 3,000 other servers?

I wonder what I should do. 🤔

@Gargron pull account number?

Add?

@thegibson Pull from where tho. Some central aggregating service, one of the few that exist right now. What if it's down? What if it's being slow?

Using the total row count of the accounts table might be more reliable and more in spirit.

@Gargron @thegibson Maybe have instances whisper their user counts (and the user counts they know about) to one another once a day or so? Two degrees of separation ought to be enough for m.s to estimate a rough head count, no?

Smaller instances, meanwhile, would see a much smaller observable aggregate count, but that'd be interesting too.
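
A rough sketch of that whisper idea, as a toy simulation rather than anything Mastodon actually does. The instance names, the counts, and the `observed_total` helper are all made up for illustration. Each instance sums the counts of every peer it can reach within two hops, keyed by name so nothing is counted twice when it arrives over two routes.

```python
# Toy data: hypothetical instances and who federates with whom.
local_counts = {
    "big.example": 180000,
    "mid.example": 5000,
    "small.example": 40,
    "tiny.example": 3,
    "far.example": 7,
}
neighbors = {
    "big.example": ["mid.example", "small.example"],
    "mid.example": ["big.example", "small.example"],
    "small.example": ["big.example", "mid.example", "tiny.example"],
    "tiny.example": ["small.example", "far.example"],
    "far.example": ["tiny.example"],
}


def observed_total(start, local_counts, neighbors, max_hops=2):
    """Sum the user counts of every instance within max_hops of start."""
    seen = {start}
    frontier = [start]
    for _ in range(max_hops):
        next_frontier = []
        for instance in frontier:
            for peer in neighbors.get(instance, ()):
                if peer not in seen:  # skip peers already counted via another route
                    seen.add(peer)
                    next_frontier.append(peer)
        frontier = next_frontier
    return sum(local_counts[name] for name in seen)


print(observed_total("big.example", local_counts, neighbors))
print(observed_total("far.example", local_counts, neighbors))
```

In this toy graph the well-connected instance sees almost everything, while the one at the edge sees only its own corner, which is the "smaller observable aggregate" effect described above.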

@beadsland @thegibson It could be done... The question is, *should* it be done that way. Hmph.

@Gargron @beadsland @thegibson nice way to produce unmanageable mess

@saper @Gargron @thegibson It would avoid reliance on a single centralized service.

Not any more unmanageable than a game of whisper-down-the-alley. (Hence my use of the term.) Thus, it's not going to provide a painstakingly accurate sum, certainly not as observed from less connected instances, but when you're dealing with a ~1.5 million horizon, the long tail can fall off without losing significant digits.

"Here's my count, and counts of every server that polled me for toots today" hardly implies mess.

@saper @Gargron @thegibson Alternatively, a store-and-forward network a la Usenet could be used to share small instance-reported statistical packets, without the worry of losing the long tail.

This would eliminate the need for any given instance to reconcile the counts coming in over alternate routes, as the Usenet-esque messages would self-reconcile on GUID. The trade-off is the storage consumed by holding those stats-packet messages for forwarding.
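
A minimal sketch of the self-reconciling part, assuming each stats packet carries a GUID, the reporting instance, a report date, and a user count. `StatsPacket`, `StatsStore`, and the field names are hypothetical, not an existing protocol. A packet that arrives a second time over another route is recognised by its GUID and dropped, so it is neither counted twice nor forwarded again.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatsPacket:
    guid: str         # globally unique ID assigned by the reporting instance
    instance: str     # domain of the reporting instance
    reported_on: str  # ISO date, e.g. "2017-04-01" (sorts correctly as text)
    user_count: int


class StatsStore:
    """Keeps every packet seen so far, keyed by GUID."""

    def __init__(self):
        self._by_guid = {}

    def receive(self, packet):
        """Store a packet; return True if it is new and should be forwarded."""
        if packet.guid in self._by_guid:
            return False  # duplicate that arrived over an alternate route
        self._by_guid[packet.guid] = packet
        return True

    def total_users(self):
        """Sum the most recent report from each instance."""
        latest = {}
        for packet in self._by_guid.values():
            current = latest.get(packet.instance)
            if current is None or packet.reported_on > current.reported_on:
                latest[packet.instance] = packet
        return sum(packet.user_count for packet in latest.values())
```

The storage trade-off shows up in `_by_guid`: it grows with every packet until old ones are expired, which this sketch does not attempt.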

@beadsland @Gargron @thegibson oh yes and we should add spanning tree to prevent loops!

@saper @beadsland @thegibson No need for recursion; you can just fetch stuff from each known instance in a flat manner.

I don't like the approach for another reason: managing these counts and fetching this data is like a side quest next to Mastodon's core functionality, and I don't feel good about mixing them in one codebase.
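
For comparison, the flat approach might look roughly like this. It assumes the list of known instances already exists and that some `fetch_count(domain)` function returns one instance's user count; both are placeholders here, and no real API call is shown. An instance that is down or slow is skipped and reported rather than blocking the whole tally.

```python
def flat_total(known_instances, fetch_count):
    """Ask each known instance once; return (total, domains that failed)."""
    total = 0
    failed = []
    for domain in known_instances:
        try:
            total += fetch_count(domain)
        except OSError:  # covers timeouts and connection errors
            failed.append(domain)
    return total, failed


# Stand-in for a real network call, with hypothetical domains and counts.
fake_counts = {"a.example": 100, "b.example": 20}


def fake_fetch(domain):
    if domain not in fake_counts:
        raise TimeoutError(f"{domain} did not answer")
    return fake_counts[domain]


print(flat_total(["a.example", "down.example", "b.example"], fake_fetch))
```

Because `fetch_count` is passed in, the tallying could live in a separate tool outside the main codebase, which speaks to the side-quest concern.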

@Gargron @saper @thegibson Ah, well, if you have a known list of instances, then that works. I'm working off the assumption that there isn't a master list, so the only way to glean that information is via federated neighbors.

I'm also working off the assumption that you don't want a central point of failure. Hence suggesting alternatives that don't rely on polling from (or pushing to) a single point that could be offline or slow, as you note.