this is an excellent summary of the real-life problems - moderation, discoverability, searchability - of a future federated Bluesky AT Protocol network from @jonny

https://neuromatch.social/@jonny/110552684614320107

see also

https://github.com/bluesky-social/proposals/issues/18
https://github.com/bluesky-social/proposals/issues/19

i particularly like the observation that the functions people *want* from social media - moderation, discoverability, search - just straight-up require centralisation.

Decentralisation has its virtues, such as the fediverse ticking along mostly fine while Twitter and Bluesky pooped themselves on Saturday. But for usability for non-nerds, decentralisation is a harsh antifeature - see Mastodon. You can't search your fuckin' friends, I mean wtf, FUNCTION NUMBER ONE on a new network!

Any eventual atproto network will naturally centralise on a big graph server, 'cos otherwise you don't get search or discoverability.

there isn't as yet a central repository of critiques. also the protocol isn't finished yet, there's a lotta vaporware and handwaving.

actual Trust & Safety people of considerable experience, e.g. Yoel Roth from Twitter and Denise from Dreamwidth and formerly of LiveJournal, spent many futile hours posting at length to the company CEO and devs on how Bluesky's plans would make essential moderation functions literally impossible.

even if Bluesky gave a hoot about doing moderation properly, which doesn't seem to occur to them. they seem literally incapable of understanding the question.

some of the devs are getting to understand the problems. because Bluesky pressed them into service to do moderation personally. they understood there was a problem here once they saw some shit.

but basically Bluesky wrote a moderation white paper and fell in love with it, and they are impervious to any idea they didn't think of themselves, or the history of thirty years of internet social media.

like, when you get to "let's make block lists public!" why the fuck are you doing something that obviously stupid? "well the white paper requires it" i mean.

there is no one weird trick to technically scale moderation. you have to do the fucking moderation. with people.

that's *fine* for now - there is no network. bsky.app is a fun single-node server to be on. 200k users, high quality queer shitposters, great userbase!

but it's important to keep in mind that it's *run* by rationalist neoreactionary-leaning blockchain bros who have shown an unfortunate tendency in practice to defend their neo-nazi friends from being kicked off for death threats against minorities.

the technical details are secondary, even if you approach them with an unwarranted assumption of good faith. because atproto was designed with bad assumptions by idiots. it's a historical fact that Jack Dorsey's driving motivation was to make a network nazis couldn't be permanently banned from. that's what he funded these people to do, and the tech is just details at that point.

on Mastodon, Bluesky would have been fediblocked by now just for its nazi coddling.

btw i will definitely be calling Bluesky's wizard white paper idea "compostable moderation" from now on

jonny (good kind) (@jonny@neuromatch.social)

so far, #BlueSky / #ATProtocol seems like a federated system the same way Google Alerts is a federated system.
- you can self-host your website or use Google Sites
- Google crawls you
- People subscribe to algos/alerts
- Google Alerts emails you the matches

Neuromatch Social

@davidgerard @jonny What I would like, eventually, is a system where mod/discover/search are potentially centralized, but I have my choice of which center to select without having to pick a fundamentally different set of content. (Which opens the door to a "center" which is, like, a tree of centers.)

I doubt BlueSky is that system because I subscribe to the conspiracy theory that BlueSky was specifically designed, before it split off from Twitter, to cement the Twitter servers at the center.

@mcc @davidgerard @jonny discoverability and search are not necessarily centralized by any means. It just requires a different design to enable them in a consensual way across decentralized instances.

@anildash @mcc @davidgerard @jonny
Every time I read «discoverability and search require centralization» I have to wonder if people failed to learn from Kademlia or are purposely ignoring that distributed search & discovery has already been solved in the past, on even more distributed networks than the Fediverse.

@oblomov
I have no idea what Kademlia is or the story behind it. Can you explain or link me to a reasonably-sized digestible version, so I can educate myself?

@anildash @mcc @davidgerard @jonny

@chargrille

a very tl;dr oversimplification is that every person or thing gets a big long binary sequence as an address. the bitwise XOR between two binary strings, read as a number, serves as a distance function, so the distance between

011001
and
010011
is 001010, i.e. 10 (the two addresses differ in 2 bit positions), etc.
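
a minimal sketch of that distance function in python, assuming addresses are written as equal-length bit strings. Kademlia reads the XOR result as an unsigned integer; the count of differing bits is a separate measure, shown only for comparison. the function names are made up for illustration.

```python
def xor_distance(a: str, b: str) -> int:
    """XOR two equal-length bit strings and read the result as an integer."""
    if len(a) != len(b):
        raise ValueError("addresses must have the same bit length")
    return int(a, 2) ^ int(b, 2)


def differing_bits(a: str, b: str) -> int:
    """Count the bit positions where the two addresses differ."""
    return bin(xor_distance(a, b)).count("1")


print(format(xor_distance("011001", "010011"), "06b"))  # 001010
print(xor_distance("011001", "010011"))                 # 10
print(differing_bits("011001", "010011"))               # 2
```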

to find where an ID is (e.g. finding how to connect to a peer, but it's a general location-finding algorithm), you keep asking different peers for some ID that is closer (has a smaller XOR distance) to the one you want. peers keep a list of peers at various distances from themselves so they can say "I don't know exactly where that address is, but I do know one that is definitely closer than I am"
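
a toy sketch of that "ask for someone closer" loop, assuming 4-bit integer IDs and a hypothetical `network` dict that maps each peer to the peers it knows about. real Kademlia queries several peers in parallel and keeps a shortlist of the closest ones; this follows a single path just to show the idea.

```python
def lookup(network, start, target):
    """Greedy walk: hop to the known peer whose XOR distance to target is smallest."""
    path = [start]
    current = start
    while current != target:
        known = network.get(current, set())
        candidate = min(known, key=lambda p: p ^ target, default=None)
        # Stop if this peer knows nobody strictly closer than itself.
        if candidate is None or (candidate ^ target) >= (current ^ target):
            break
        current = candidate
        path.append(current)
    return path


# Hypothetical toy network: each peer only knows a couple of others.
network = {
    0b0001: {0b0011, 0b1000},
    0b0011: {0b0001},
    0b1000: {0b1100, 0b0001},
    0b1100: {0b1110, 0b1000},
    0b1110: {0b1111, 0b1100},
    0b1111: {0b1110},
}

found = lookup(network, 0b0001, 0b1111)
print([format(n, "04b") for n in found])    # ['0001', '1000', '1100', '1110', '1111']

stalled = lookup(network, 0b0011, 0b1111)
print([format(n, "04b") for n in stalled])  # ['0011']
```

the second call stalls immediately because 0011 only knows a peer that is further from the target than it is. the distance shrinks on every hop, so the loop always ends, but it doesn't always end at the target.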

that's a very simple explanation, but you can also store additional information at locations in kademlia space to make more interesting things happen like search, etc. as alluded above
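
one way to picture "storing information at locations": hash a search term into the same address space and keep the matching entries on the peers closest to that address. this is a sketch under assumptions, not how any particular network does it. `ToyDHT`, the 16-bit IDs, the example URLs, and keeping everything in one process are all made up for illustration.

```python
import hashlib


def key_id(key: str, bits: int = 16) -> int:
    """Map a search term into the ID space by truncating its SHA-1 hash."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (160 - bits)


class ToyDHT:
    def __init__(self, node_ids):
        # Each node gets its own little key -> set-of-values store.
        self.storage = {n: {} for n in node_ids}

    def _closest(self, target, k):
        return sorted(self.storage, key=lambda n: n ^ target)[:k]

    def put(self, key, value, k=2):
        """Store value on the k nodes whose IDs are closest to the key's ID."""
        kid = key_id(key)
        for node in self._closest(kid, k):
            self.storage[node].setdefault(kid, set()).add(value)

    def get(self, key, k=2):
        """Ask the same k closest nodes and merge whatever they hold."""
        kid = key_id(key)
        results = set()
        for node in self._closest(kid, k):
            results |= self.storage[node].get(kid, set())
        return results


dht = ToyDHT([0x1A2B, 0x3C4D, 0x7E8F, 0x9A0B, 0xC1D2, 0xF3E4])
dht.put("#kademlia", "https://example.social/post/1")
dht.put("#kademlia", "https://example.social/post/2")
print(sorted(dht.get("#kademlia")))
# ['https://example.social/post/1', 'https://example.social/post/2']
```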

@jonny Thank you. Oh dear. I'm out of my depth. I will ask my other half to explain to me. But I guess I am wondering: don't you need a central authority to dispense the binary sequences/address to be sure that the addresses actually correspond to distance/location? If I'm just completely misunderstanding, that's fine, I'll ask him to try to bring me up to speed.

@jonny

If the addresses are just generated according to some agreed-upon convention, how do you prevent creation of duplicate addresses for different peers?
And what happens if you contact a peer ID, are given a "closer" peer ID from their list, you contact that peer next, but their list doesn't return any closer peer IDs (even though one does exist, it's just not kept on their list). Does your search end there?

@chargrille
a search can certainly fail - the search is only guaranteed to succeed in ideal circumstances where all peers are online, reachable, and properly behaving. otherwise, the odds are still pretty high because of a lot of redundancy built in at different scales. if it fails, you can still get pretty close and maybe resume the search at another time when the peer might be back online. again lots of active research on how to make this robust. DHTs are typically just used for peer discovery/addressing as part of a larger system that might have different incentives to maintain uptime/discoverability.

@chargrille
it depends on what you are addressing! if you are addressing some file, then you could use the hash of that file as the address, and then you would assume there is only some very small chance of an overlapping address from a different file (a hash collision)
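
a small sketch of the hash-as-address idea, assuming SHA-256 as the hash (the original Kademlia design uses 160-bit IDs, so treat the choice of hash here as an assumption). `content_address` is a made-up name.

```python
import hashlib


def content_address(data: bytes) -> int:
    """Use the SHA-256 digest of the content as its address in the ID space."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


same_a = content_address(b"hello fediverse")
same_b = content_address(b"hello fediverse")
other = content_address(b"hello fediverse!")

print(same_a == same_b)  # True
print(same_a == other)   # False
```

the same bytes always land at the same address, and nobody has to hand addresses out. two different files sharing an address would need a hash collision.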

otherwise, generating very large random numbers works reasonably well, and you can handle duplicates at a higher "layer" - see if this address is already taken, if not we're good. there are lots of other techniques for handling malicious behavior like purposefully impersonating an address or sending garbage routing data, some based on distributed trust (auto-banning peers that give bad data), others based on not trusting the DHT (it's just an address, but if the thing at the address doesn't match what's expected, we treat it as a fake).
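
a rough sketch of two of those ideas: picking a large random ID and checking for duplicates one layer up, and treating data that doesn't hash to the address it was fetched from as fake. the `taken` set stands in for "ask the network whether this address is in use", and both function names are made up for illustration.

```python
import hashlib
import secrets


def new_node_id(taken: set, bits: int = 160) -> int:
    """Draw random IDs until one is unused; the duplicate check lives outside the DHT."""
    while True:
        candidate = secrets.randbits(bits)
        if candidate not in taken:
            taken.add(candidate)
            return candidate


def matches_address(address_hex: str, data: bytes) -> bool:
    """Don't trust the DHT: accept data only if it hashes to the address we asked for."""
    return hashlib.sha256(data).hexdigest() == address_hex


taken = set()
first = new_node_id(taken)
second = new_node_id(taken)
print(first != second)  # True

address = hashlib.sha256(b"post body").hexdigest()
print(matches_address(address, b"post body"))      # True
print(matches_address(address, b"tampered body"))  # False
```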

the thing with a lot of distributed technologies is that making good enough promises works in a lot of cases, often sort of akin to the "trust but verify" idea. the addressing layer doesn't need to guarantee uniqueness because uniqueness can be handled at a different level, etc. rough consensus and quorum sensing, rather than strict guarantees of functionality, is sort of a hallmark of distributed tech.

they are not perfect! DHTs have a lot of problems, weak spots, misaligned incentives, etc. it's an active field of research :)