Mastodawn

mcc May 1, 2024

There is an interesting article titled "Please Don’t Share Our Links on Mastodon: Here’s Why!" about the startling load that Mastodon's mass-distributed link preview generation puts on small independent webservers. But I cannot link it to you, because of a reason

Mark T. Tomczak May 1, 2024

@mcc Is it worse than getting Slashdotted?

@mark The problem is it's automated, because the servers all contact the site to ask for the link preview at the same time

Mark T. Tomczak May 1, 2024

@mcc Oof, that's an interesting challenge.

Also feels like a hole in either Mastodon's use of Fediverse or Fediverse itself. If node A is cloning posts to node B, it's already generated a preview and should clone that too!

robryk May 1, 2024

It's a terrible idea to trust that preview though.

Mark T. Tomczak May 1, 2024

@robryk @mcc In what sense? The preview my personal node generates can also be a lie because the server can inspect the source requester and change the output depending on who's asking.

robryk May 1, 2024

In the sense that someone other than your client, your own instance (both of which you kind of need to trust anyway), and the actual site that's linked to (who's the source of the content, so the preview must trust it) can manipulate it.

The site showing different contents to different users is another issue that I agree exists and can cause similar problems _for malicious linked-to sites_. For nonmalicious ones consider e.g. a post expressing outrage at something the BBC published, with a link to the "article" on the BBC site and a helpful "preview".

Mark T. Tomczak May 1, 2024

@robryk It may be just personal preference, but it seems an odd place to draw the line of trust at "I trust this other node to tell me what posts its users made and the images they uploaded but not the link previews it generated and cached."

robryk May 1, 2024

Huh, I'm very surprised that you find this line odd (I don't think I've seen this opinion in the past). I would appreciate it if you answered a question or two so that I can understand it better (but I do understand if you don't wish to).

The reason I find this line very natural is that I think in terms of which node is intended to be able to speak for which entities, especially since those entities are named in a way that reminds us of that relation (domain in URLs, domain/instance part of a fedi ID). Do you think that it makes more sense to keep track of a more vague trust (as in, "that node is rather trustworthy") in general, that the mapping between nodes and entities is insufficiently natural, or something else I can't easily see?

Mark T. Tomczak May 1, 2024

@robryk Not in general, no. I think there's a very practical special-case reason to bend the simple model of trust in this case: too many nodes hammering a site can result in that site deciding that Mastodon is a threat to quality of service and doing their best to block every node.

That's bad for Mastodon as a Fediverse project (and, indirectly, good for the Twitters of the world... "Hey, we may have lax moderation, but we'll only tap your server once to build a preview link").

In terms of cleanest-model, I agree with your assessment of what should be authoritative. In terms of a cost-benefit tradeoff of most-damage-a-modified-link-preview-could-do vs. most-damage-distributing-the-build-of-the-preview-could-do however...

(I'm reminded of DNS, and the fact that while people don't like caching and what it does to the cleanliness of the domain-ip mapping, we put up with it because the alternative would be an untenable noise-mess of popular services' DNS authorities getting hammered. No caching would be cleaner, but there's a reason DNS entries are cached.)
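
The caching tradeoff described above can be sketched in a few lines. This is only an illustration, not how Mastodon or any DNS resolver is implemented: `PreviewCache`, its `fetch` callback, and the TTL value are all hypothetical names chosen for the example. The point is that a cached answer may be stale until it expires, and in exchange the origin is asked once per TTL window instead of once per request.

```python
import time
from typing import Callable, Dict, Tuple


class PreviewCache:
    """Hypothetical TTL cache: fetch a preview once, reuse it until it expires."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # the expensive call that hits the origin site
        self._ttl = ttl_seconds      # how long a cached answer is considered fresh
        self._clock = clock          # injectable so the cache can be tested offline
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            # Still fresh: answer from the cache, the origin is not contacted.
            return entry[1]
        # Missing or expired: contact the origin once and remember the answer.
        preview = self._fetch(url)
        self._entries[url] = (now + self._ttl, preview)
        return preview
```

With a 60 second TTL, any number of `get` calls for the same URL inside that window result in a single call to `fetch`.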

mcc May 1, 2024

@robryk @mark You could imagine manually configured chains of trust, or for example creating three independently administered preview servers and only accepting previews if they are identical between all three. It is a solvable problem
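
A minimal sketch of the "three independent preview servers" idea, assuming each server's answer has already been collected as a string (or `None` if the server did not answer). The function name `agreed_preview` and the `required` parameter are made up for this example; nothing like this exists in Mastodon as far as the thread says.

```python
from typing import Optional, Sequence


def agreed_preview(previews: Sequence[Optional[str]], required: int = 3) -> Optional[str]:
    """Return the preview only if enough servers answered and every answer is identical."""
    # Drop servers that failed to answer (represented here as None).
    answers = [p for p in previews if p is not None]
    if len(answers) < required:
        # Not enough independent answers to trust any of them.
        return None
    if len(set(answers)) != 1:
        # At least one server disagrees, so reject the preview entirely.
        return None
    return answers[0]
```

A node using this would accept a preview when all three servers return the same text, and fall back to showing a bare link (or fetching the page itself) when they disagree or one is missing.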

Mark T. Tomczak May 1, 2024

@mcc @robryk I think I'm also going to look into my server config and see if I can just kill the feature.

I've never actually needed or wanted link previews, in any social network. I have a browser and middle-click for that.

Greg May 1, 2024

@mark @mcc you cannot (by default) trust the link preview provided by your peer, as they may alter it without your knowledge. yes, the destination site itself may alter output based on requester, but that's a different problem than the "malicious relay" one.

there are some solutions - a trust system where you take some servers' previews as gospel, or maybe the preview comes with a hash that HTTP HEAD can be used to verify (much cheaper than getting the whole page and preview), or pooling a cache for mastodon users e.g. what https://jort.link/ does

jort.link - a solution to fediverse request floods

A URL redirector and shield to solve fediverse request floods.
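
The "hash that HTTP HEAD can verify" idea could look roughly like the sketch below. It rests on a loud assumption: the linked site would have to publish a digest of its own preview text in a response header. The header name `X-Preview-Digest` is invented for this example and is not a real standard, and `preview_digest` is just one possible way to hash the preview. The receiving node hashes the preview its peer forwarded and compares it with what the site reports on a cheap HEAD request.

```python
import hashlib
import urllib.request
from typing import Callable, Mapping

# Hypothetical header name; no such standard header exists.
PREVIEW_DIGEST_HEADER = "X-Preview-Digest"


def head_headers(url: str, timeout: float = 5.0) -> Mapping[str, str]:
    """Issue an HTTP HEAD request (no body is downloaded) and return the headers."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())


def preview_digest(preview_text: str) -> str:
    """Hex SHA-256 of the preview text, as the site is assumed to publish it."""
    return hashlib.sha256(preview_text.encode("utf-8")).hexdigest()


def preview_is_vouched_for(
    url: str,
    preview_text: str,
    fetch_headers: Callable[[str], Mapping[str, str]] = head_headers,
) -> bool:
    """True only if the site's published digest matches the preview a peer sent us."""
    headers = fetch_headers(url)
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    published = lowered.get(PREVIEW_DIGEST_HEADER.lower())
    if published is None:
        # The site does not publish a digest, so the preview cannot be verified.
        return False
    return published.strip().lower() == preview_digest(preview_text)
```

Note that this still sends one request per node to the site. It only makes each request cheaper, which is the tradeoff the post describes.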

Mark T. Tomczak May 1, 2024

@greg @mcc If a peer starts effing with the datastream, I defederate them.

That's the tool for the job. "Mucking the previews" ought to be considered modulo-equivalent-malicious to "hosting Nazis" (assuming we had the feature).

I mean, I'm already trusting them not to muck other people's posts, right? To not slip ads in? To not do all manner of nasty things when they forward data to my node?

Greg May 1, 2024

@mark @mcc I guess that requires you to know that the malicious peer is doing it - and how do you know that, without visiting the original site to check...

EDIT: a peer can't alter someone else's post in transit due to HTTP signatures incl. message digest, so you have a reasonable expectation that the message you got is as the originating server wrote it - whether THAT server is playing games or not is, again, beside the point and solvable easily with blocklisting.
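
To illustrate only the message digest half of that: the sending server puts a `Digest` header of the form `SHA-256=<base64 hash of the body>` on the request, and the receiver recomputes it. The sketch below checks just that header against the body. It deliberately leaves out the HTTP signature itself (which is what ties the digest to the sender's key), and the function names are made up for the example.

```python
import base64
import hashlib
import hmac


def sha256_digest_header(body: bytes) -> str:
    """Build a Digest header value of the form 'SHA-256=<base64>' for a request body."""
    encoded = base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")
    return "SHA-256=" + encoded


def body_matches_digest(body: bytes, digest_header: str) -> bool:
    """Check that the body received is the body the Digest header describes."""
    # Split on the first '=' only: the base64 value may itself end in '=' padding.
    algorithm, separator, value = digest_header.strip().partition("=")
    if not separator or algorithm.upper() != "SHA-256":
        # Unknown or malformed algorithm: refuse rather than guess.
        return False
    expected = base64.b64encode(hashlib.sha256(body).digest())
    # Constant-time comparison of the two base64 strings as bytes.
    return hmac.compare_digest(value.encode("utf-8"), expected)
```

If a relay changed even one byte of the post body, `body_matches_digest` would return `False`, and since the digest header is covered by the signature the relay cannot simply recompute it.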

I guess link previews could be considered part of the original message and covered by the first Mastodon server to put a link up, which basically shifts the burden onto the mastodon operator instead of the website owner. This would require some extra changes to ActivityStreams or at least the fields most Fedi systems use in it. (iirc mastodon has only attachments, urls, bold and paragraph support)

mcc May 1, 2024

@greg @mark I would simply introduce social and technical systems to prevent this

Greg May 1, 2024

@mcc @mark hey now, let me armchair dev a bit cmon

Daniel O'Connor May 1, 2024

@greg @mark @mcc if the peer is mucking with the preview couldn’t it just as easily muck with the link itself?

DJ Sundog May 1, 2024

@mark @mcc the argument against forwarding a pre-fetched and rendered preview card is trust - can you trust every server in the fediverse to fetch and render a true and accurate representation of the preview card for every link? if you cannot, and you are not willing to accept the risk of forgery or misrepresentation, then you have to fetch and render at each node (so the argument goes - I am not a proponent of this argument).

Mark T. Tomczak May 1, 2024

@djsundog I just don't think that argument holds water, especially when the alternative is "every node independently queries the source machine."

I'm trusting peers to give an authoritative accounting of posts from users. Is trusting their preview computation a bridge too far?

I hope not, because consolidated social media doesn't have this problem from a technical standpoint, and that makes it a lot friendlier to web hosts than the Fediverse.

DJ Sundog May 1, 2024

@mark I concur, and just made that argument in another post ha - https://toot-lab.reclaim.technology/@djsundog/112367639796872157 - I rambled a lot more than you did haha
