there are two main protocols used for relaying in the fediverse. both have overlapping areas of risk, though I tried to make mine as safe as I reasonably could.
the overlap is metadata leakage: if you connect to a relay, in all cases, you are advertising the existence of an activitypub object at a specific URI. this is how relays relay: they induce ingestion of external posts as a side effect.
first, let's talk about my protocol for this, the litepub relay extension.
the way that it works is that a server wishing to connect to a relay simply follows an actor on the relay service. the actor then redistributes incoming Announce activities (mastodon calls these boosts) to those following it.
this causes posts to be ingested by resolving the URI in the Announce activity.
the only thing that is signed is the Announce activity, by way of an HTTP signature header covering the body's digest.
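a rough sketch of what "covering the body's digest" means, assuming the draft-cavage style of HTTP signatures that fediverse servers commonly use. the hostnames, the date, and the covered header list are made up for illustration, and the actual signing step with the actor's private key is left out.

```python
import base64
import hashlib

def digest_header(body: bytes) -> str:
    # Digest header value in the "SHA-256=<base64>" form
    hashed = hashlib.sha256(body).digest()
    return "SHA-256=" + base64.b64encode(hashed).decode("ascii")

def signing_string(method: str, path: str, headers: dict, covered: list) -> str:
    # one "name: value" line per covered header, joined by newlines;
    # "(request-target)" is the pseudo-header for method and path
    lines = []
    for name in covered:
        if name == "(request-target)":
            lines.append(f"(request-target): {method.lower()} {path}")
        else:
            lines.append(f"{name}: {headers[name]}")
    return "\n".join(lines)

# hypothetical Announce body: it names a URI and nothing else
body = b'{"type": "Announce", "object": "https://example.social/objects/123"}'
headers = {
    "host": "relay.example",
    "date": "Tue, 07 Jan 2025 12:00:00 GMT",
    "digest": digest_header(body),
}
to_sign = signing_string(
    "POST", "/inbox", headers, ["(request-target)", "host", "date", "digest"]
)
print(to_sign)
```

the signature is over this string, which is tied to one request at one point in time. it travels in a header, not inside the activity, so it is not something a relay can hand to a third party as a standalone signed document.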
this is, in my opinion, the right way to implement relaying: you are advertising content at a given location, and then interested servers can go import the posts if they want to.
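to make the flow concrete, here is a stripped-down sketch of the follow-an-actor model described above. the actor URIs, inbox URLs, the followers addressing, and the helper names are all assumptions for illustration; a real activity also carries an id and other fields.

```python
def make_announce(actor: str, object_uri: str) -> dict:
    # the activity names the post by URI only; no post content is included
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Announce",
        "actor": actor,
        "object": object_uri,
        "to": [actor + "/followers"],
    }

def relay_fanout(incoming: dict, relay_actor: str, follower_inboxes: list) -> list:
    # the relay actor re-announces the same URI to every inbox following it
    outgoing = make_announce(relay_actor, incoming["object"])
    return [(inbox, outgoing) for inbox in follower_inboxes]

# hypothetical instance announcing one of its posts to a relay
incoming = make_announce(
    "https://example.social/relay", "https://example.social/objects/123"
)
deliveries = relay_fanout(
    incoming,
    "https://relay.example/actor",
    ["https://a.example/inbox", "https://b.example/inbox"],
)
for inbox, activity in deliveries:
    print(inbox, "<-", activity["object"])
```

each receiving server then decides for itself whether to resolve the URI and import the post.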
by comparison, let's talk about the Mastodon relay protocol.
this protocol is basically the opposite of mine: instead of subscribing to an *actor*, you subscribe to an inbox.
Mastodon then sends raw Create and Announce activities to that remote inbox, and the relay server forwards the raw activities along.
instead of HTTP signatures, the raw activities are signed with LDSigs, and Mastodon has no key rotation or expiry model. in other words, it sends activities that are signed by a unique key belonging to your account to these relays, and the signatures are forever.
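for contrast, a sketch of the rough shape of an activity carrying an embedded signature. the URIs, the timestamp, and the post text are invented, the signature value is elided, and the helper function is hypothetical; treat the field layout as an approximation of what Mastodon emits, not a spec.

```python
# approximate shape of a Create activity with a Linked Data Signature
signed_create = {
    "@context": [
        "https://www.w3.org/ns/activitystreams",
        "https://w3id.org/security/v1",
    ],
    "type": "Create",
    "actor": "https://example.social/users/alice",
    "object": {
        "type": "Note",
        "id": "https://example.social/objects/123",
        "attributedTo": "https://example.social/users/alice",
        "content": "<p>the actual post text</p>",
    },
    "signature": {
        "type": "RsaSignature2017",
        "creator": "https://example.social/users/alice#main-key",
        "created": "2025-01-07T12:00:00Z",
        "signatureValue": "<base64 RSA signature, elided>",
    },
}

def is_self_authenticating(activity: dict) -> bool:
    # True when the payload carries both the post body and an embedded
    # signature, i.e. it can be checked later without asking the origin server
    obj = activity.get("object")
    return isinstance(obj, dict) and "content" in obj and "signature" in activity

print(is_self_authenticating(signed_create))
print(is_self_authenticating({"type": "Announce", "object": signed_create["object"]["id"]}))
```

the signature sits inside the document and names the account's key, so whoever stores the JSON stores the proof along with it.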
I and many other security engineers, as well as actual cryptographers, have criticized this model for years, and still sometimes do, like now.
the question we should be asking is: how well do these relay protocols resist disclosure to adversarial parties (e.g. Palantir, harassment instances like poast, etc.)?
and to make things interesting, let's make it a poll: which protocol do you think is safer?
alright. threat modeling time.
let's say that you are having a bad day, and you post something that could be considered a threat towards somebody more powerful than you: it could be a boss, it could be a politician, it could be the cop that wrote you a ticket earlier, whatever.
what happens with each approach to relaying?
in litepub relaying, you post your post, and then your instance forwards it to relays.
the only thing forwarded was a URI to a post object.
if you delete the post object, then it's replaced with a tombstone*.
* this isn't perfect because deletes do not federate with perfect reach
the relay only ever held a URI, and that URI now resolves to a tombstone. this quality is called deniability: if you can make the argument that the post never existed, or wasn't signed by you, or whatever, then they have to work to prove otherwise.
well... in the mastodon scheme, your instance just gave the relays a signed copy of the post object itself. in other words, forensic confirmation that you made the post you are trying to deny.
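a toy model of the difference, under the assumption that the relay operator archived exactly what it was sent and never fetched the post before the delete. the function name, the URIs, and the dict standing in for the origin server are all hypothetical.

```python
TOMBSTONE = {"type": "Tombstone"}

def evidence_after_delete(archived: dict, origin_store: dict) -> dict:
    # what a relay operator can produce once the origin has deleted the post
    obj = archived["object"]
    if isinstance(obj, str):
        # URI-only archive: all they can do is re-fetch what is there now
        current = origin_store.get(obj, TOMBSTONE)
        return {"content": current.get("content"), "signed_by_author": False}
    # embedded object: the archive itself holds the text and the signature
    return {"content": obj.get("content"), "signed_by_author": "signature" in archived}

uri = "https://example.social/objects/123"
origin_store = {uri: TOMBSTONE}  # state of the origin server after the delete

litepub_archive = {"type": "Announce", "object": uri}
mastodon_archive = {
    "type": "Create",
    "object": {"id": uri, "content": "<p>something regrettable</p>"},
    "signature": {"type": "RsaSignature2017", "signatureValue": "<elided>"},
}

print(evidence_after_delete(litepub_archive, origin_store))
print(evidence_after_delete(mastodon_archive, origin_store))
```

even if a server did fetch and cache the post before the delete, that cached copy carries no signature from the author, so it is still only that server's word.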
a lot of people talk about authorized_fetch. for whatever reason, mastodon disables relaying when authorized_fetch is enabled, which is largely pointless.
what does authorized_fetch do? it is a mitigation for the broken security model of activitypub: it forces all requests to be signed by the signing keys of whatever actor is requesting remote content. if the actor is unauthorized to request the content, the fetch is denied.
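a minimal sketch of that decision, assuming signature verification has already happened elsewhere and that "unauthorized" means the requesting actor lives on a blocked domain. the function name, the status codes chosen, and the domain names are illustrative, not taken from any particular server.

```python
from typing import Optional
from urllib.parse import urlparse

def authorize_fetch(signed_by: Optional[str], blocked_domains: set) -> int:
    # signed_by is the actor URI whose key verified the request's HTTP
    # signature, or None if the request was unsigned or failed verification
    if signed_by is None:
        return 401  # no valid signature: refuse to serve the object
    domain = urlparse(signed_by).hostname
    if domain is None or domain in blocked_domains:
        return 403  # identified requester, but not allowed to see the post
    return 200

blocked = {"scraper.example"}
print(authorize_fetch(None, blocked))
print(authorize_fetch("https://scraper.example/actor", blocked))
print(authorize_fetch("https://friendly.example/users/bob", blocked))
```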
this works to protect your posts from being relayed to adversaries, either through relaying itself or through adversarial accounts manually boosting the post.
it is vitally important to make clear that authorized_fetch is a mitigation. it is not perfect, and there are other methods, like screenshotting, that authorized_fetch won't protect from. but it does pretty well.
mastodon default-disables this mitigation while other servers largely have it enabled by default.
if I were Palantir, I would set up an activitypub relay and then convince as many instances as possible to sign up for it.
I would then scrape all the posts, and all the signatures.
thankfully, I'm not Palantir.
in conclusion from the admin pov:
1. don't use relays if your instance uses the Mastodon relay protocol.
2. *do* use authorized_fetch.
3. use relays at your own risk, even with the litepub protocol. while authorized_fetch will prevent posts from successfully federating to adversarial nodes that have been blocked, it will not prevent those nodes from discovering metadata about your posts.
@ariadne This deniability guarantee is at odds with the principle of replication of data in a federation. With your model, can you ever copy/cache data? If your server dies, are all the posts it hosts inaccessible?
This isn't necessarily a bad choice, but it is a choice and the behaviour needs to be clear.
@ska the instance you are on has been configured this way from the beginning. yes, we cache remote content, but the thing about authorized fetch is
if you exist in a world where all activities have to be fetched with a signature, then you can work out exactly which peers have a copy of any given post.
in other words, it opens the door to perfect deletes.
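a sketch of the bookkeeping that idea implies, assuming every fetch is signed and the origin records who asked. the class and method names are hypothetical, and this is not how any particular server implements it.

```python
from collections import defaultdict
from urllib.parse import urlparse

class FetchLedger:
    """Tracks which peer domains fetched each object via signed requests."""

    def __init__(self):
        self._holders = defaultdict(set)

    def record_fetch(self, object_uri: str, signed_by: str) -> None:
        # signed_by is the actor URI that signed the fetch request
        self._holders[object_uri].add(urlparse(signed_by).hostname)

    def delete_targets(self, object_uri: str) -> list:
        # the peers known to hold a copy, i.e. who needs to receive the Delete
        return sorted(self._holders.get(object_uri, set()))

ledger = FetchLedger()
post = "https://example.social/objects/123"
ledger.record_fetch(post, "https://a.example/actor")
ledger.record_fetch(post, "https://b.example/users/bob")
ledger.record_fetch(post, "https://a.example/users/carol")
print(ledger.delete_targets(post))
```

with unsigned fetches there is no requester identity to record, so the list of holders could never be complete.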
@ska @ariadne My understanding is that auth_fetch disables outbound data signing, but instances can still safely cache what they fetch? By relying on the "implicit trust" of "I obtained this directly from the authoritative server, the connection to which was verified with TLS."
The point being to prevent third parties from proving authorship to other third parties using the signature, as instance 3 can't know if instance 2 is lying to it unless it goes and fetches (with auth_fetch) content from instance 1 directly. (And any instance that forwards unsigned activities is almost certainly "malicious" by the community standards here, or at the very least "suspicious")
@stag again this is not about the typical use case, but rather about edge cases that well-behaved social platforms are expected to solve for, like stalking mitigation.
I'm not interested in debating this.
doesn't P2P make discoverability rather hard tho?
@xarvos @argv_minus_one @ariadne The server-to-server status quo also has a different kind of discoverability problem. It's the small / solo servers that suffer.
I think DHT (distributed hash table) like what's used in BitTorrent is part of the answer.