https://jwz.org/b/yj6w
Mastodon stampede

"Federation" now apparently means "DDoS yourself." Every time I do a new blog post, within a second I have over a thousand simultaneous hits of that URL on my web server from unique IPs. Load goes over 100, and mariadb stops responding. The server is basically unusable for 30 to 60 seconds until the stampede of Mastodons slows down. Presumably each of those IPs is an instance, none of which ...

@jwz MariaDB/MySQL cope so badly under high load, it's insane. Maybe having some sort of "staticizing" mechanism to snapshot the dynamic content and then serve it through nginx with some fine tuning would help? (compression? connection reuse? cache-instructing headers?)
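For readers unfamiliar with the "staticizing" idea, here is a minimal sketch of it in Python, not anything jwz or nginx actually runs. It assumes a hypothetical `render` function standing in for the expensive, database-backed page build; `SnapshotCache` and its TTL are illustrative names and values. The point is only that a burst of simultaneous requests for one URL triggers a single render, and the response carries a cache-instructing header.

```python
import threading
import time
from typing import Callable, Dict, Tuple


class SnapshotCache:
    """Serve a stored snapshot of a page and re-render at most once per TTL."""

    def __init__(self, render: Callable[[str], str], ttl_seconds: float = 60.0):
        self._render = render  # hypothetical expensive, DB-backed renderer
        self._ttl = ttl_seconds
        self._lock = threading.Lock()
        self._pages: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> Tuple[str, Dict[str, str]]:
        now = time.monotonic()
        # The lock is held while rendering, so concurrent callers wait for
        # the first render instead of each hitting the database themselves.
        with self._lock:
            entry = self._pages.get(url)
            if entry is None or now - entry[0] >= self._ttl:
                entry = (now, self._render(url))
                self._pages[url] = entry
        headers = {"Cache-Control": f"public, max-age={int(self._ttl)}"}
        return entry[1], headers
```

A single global lock serializes renders for every URL, which keeps the sketch short; a real setup would more likely lock per URL or leave all of this to a reverse proxy cache.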
@lucent Again, I don't really need you leaning over my desk and saying "You know what you OUGHTA do", thanks.
@jwz @lucent we're only doing this because it's so shocking. Like, you're a bit of a legend, and we expected neither that these issues would knock your blog over nor that you'd be so salty about it.
nah, the stress is understandable. @jwz copypasted the same message to all people that replied because everyone was pushing the same solution in different flavours.
the issue of fediverse instances pinging as soon as the post gets forwarded is legitimate, but hard to tackle (i.e. would you trust a pre-crawled preview coming from another server?).

@lucent right, so if he has a lot of followers, he's on a lot of home feeds so these are at least mostly legitimate. I know I clicked as soon as I saw it.

My problem isn't that he's surprised by a traffic spike, but that he's trying to make it Mastodon's problem. He should own his own setup and stop pretending like 1000 hits is some sort of DDoS. It's ok to say "I got knocked over, I need to consider caching" instead of "I got knocked over, fuck you for visiting" which is how this comes off.

@lucent Also MariaDB/MySQL default to an un-tuned state, so if he changed a few defaults he'd probably get another result.

Imagine a MariaDB getting knocked over by 1K queries/sec, that's a sad as shit MariaDB.

But he blocked me out of butthurt so 🤷 good luck have fun

You surely know too that it always depends on the "weight" of those queries. Resource-heavy stuff like WordPress getting hammered by bots that exhaust your query pool is quite the bad experience.

For the last website I knew would have to cope with massive peak traffic, I just asked the other people working on it for some sort of "static exporter" instead of yet another WP instance, so I could put very aggressive caching in place. I still saturated my port at the provider; thank god it's not 2010 anymore, when bandwidth was metered.
I get your point of view, but as I said, I also get being pissed at stuff crashing because Fediverse software pings you back as soon as your post lands on an instance. Fortunately nothing bad happened, I'm not hurt, and I apologized for being "yet another guy that posted the same solution".
Needless to say though, complaining to the bubble without bringing the issue up to either the mentioned software devs or to the W3C pushing for a standard to deal with this situation, obviously makes the whole rant void. Bigger websites or more aggressive setups would easily cope with the average Fedi requests.

@lucent yeah I mean who among us is above getting pissed off and blaming that stupid hyperactive microservice for our problems.

But as a Sr SRE, I don't have time for recriminations. I assess the situation and find ways *I* can move forward rather than shouting at the clouds. It's up to *me* to answer for why there's no backpressure strategy.

Anyway if you're not blocked, maybe mention the database thing. It's not caching, so maybe it'll be helpful. That's abnormally weak for MariaDB.

I mean, "mysqltuner.pl" is a search result away from any search engine and for sure points out solutions good enough to counter performance issues *that bad*, I think and hope that it wasn't ignored as a point of failure.
@lucent hope against hope, amiright? do they still do mysqlbouncer?
@alexhammy209 probably not. Anyways mysqltuner is still actively maintained (last commit 25 days ago) and supports MariaDB and Percona too, including their specific DB engines. Always worth a shot when fine tuning on a lazy day.
@lucent oh shoot a bit of googling confirms, mysqlproxy is ⚰️
@alexhammy209 @lucent you’re reading into it. Your assessment says more about you than him. “I don’t need anyone to tell me the solution thanks” is not the same as “fuck you for visiting”
@mitka not for commenting here, but for visiting jwz.org. I can definitely sympathize and understand being mad at people pointing out simple solutions I already knew about but neglected to implement. @lucent
@alexhammy209 @lucent ig what’s the diff between mastodon and 100 RSS feed readers? ig it’s just that it’s a thundering herd, rather than spaced out by random chance of polling..
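The difference being described can be made concrete with a small simulation. This is only a sketch under assumed numbers: 1000 clients, a one-second herd window (matching the "within a second" in the original post), and a one-hour polling interval for feed readers, which is a guess. It compares the busiest single second in each case.

```python
import random
from collections import Counter
from typing import Iterable, Tuple


def peak_requests_per_second(arrival_times: Iterable[float]) -> int:
    """Largest number of arrivals that fall into any one-second bucket."""
    buckets = Counter(int(t) for t in arrival_times)
    return max(buckets.values())


def simulate(
    clients: int = 1000,
    herd_window_s: float = 1.0,      # assumed: everyone fetches within one second
    poll_interval_s: float = 3600.0,  # assumed: feed readers poll hourly
    seed: int = 0,
) -> Tuple[int, int]:
    rng = random.Random(seed)
    # Herd: every client reacts to the same event at nearly the same moment.
    herd = [rng.uniform(0.0, herd_window_s) for _ in range(clients)]
    # Feed readers: each client polls at its own random phase in the interval.
    spread = [rng.uniform(0.0, poll_interval_s) for _ in range(clients)]
    return peak_requests_per_second(herd), peak_requests_per_second(spread)


if __name__ == "__main__":
    herd_peak, spread_peak = simulate()
    print(f"herd peak: {herd_peak} req/s, spread peak: {spread_peak} req/s")
```

The total number of requests is identical in both cases; only the peak differs, which is what knocks a server over.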

@calebjasik We call it the "single throat to choke" principle. If you have to blame someone for your problems, it's best for it to be centralized.

@lucent

@calebjasik @alexhammy209 @lucent seems like a cluster of Masto instances fetches in a way that a cluster of feed readers doesn't https://better.boston/@crschmidt/109412294646370820
Christopher Schmidt (@crschmidt@better.boston)

Fun fact: sharing this link on Mastodon caused my server to serve 112,772,802 bytes of data, in 430 requests, over the 60 seconds after I posted it (>7 r/s). Not because humans wanted them, but because of the LinkFetchWorker, which kicks off 1-60 seconds after Mastodon indexes a post (and possibly before it's ever seen by a human). Every Mastodon instance fetches and stores their own local copy of my 750kb preview image. (I was inspired to look by @jwz@mastodon.social's post: https://mastodon.social/@jwz/109411593248255294.)
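The figures in that post can be checked with a few lines of arithmetic. The sketch below uses only the three numbers quoted above; the variable names are illustrative. It works out to roughly 7.2 requests per second, about 262 kB per request on average, and about 15 Mbit/s sustained over the minute.

```python
# Figures quoted in the post above.
total_bytes = 112_772_802
requests = 430
window_seconds = 60

requests_per_second = requests / window_seconds
bytes_per_request = total_bytes / requests
megabits_per_second = total_bytes * 8 / window_seconds / 1_000_000

print(f"{requests_per_second:.1f} requests/s")
print(f"{bytes_per_request / 1000:.0f} kB per request on average")
print(f"{megabits_per_second:.1f} Mbit/s sustained")
```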

@alexhammy209 @lucent no, it's saying "I only have 4k followers, rendering a preview and no other page views shouldn't even knock over a raspberry pi."
There's a clear fix though. The preview should be generated from the server that published it in about 5 sizes or so, and the ActivityPub server should serve it with the post metadata.
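One way to picture that proposal, as a rough sketch only: the publishing server attaches a ready-made preview to the post, and a receiving instance falls back to fetching the link itself only when no preview came along. `LinkPreview`, `Post`, and `preview_for` are hypothetical names for illustration and are not part of ActivityPub or Mastodon.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class LinkPreview:
    url: str
    title: str
    # Width in pixels -> image URL hosted by the publishing server (assumed layout).
    thumbnails: Dict[int, str] = field(default_factory=dict)


@dataclass
class Post:
    author: str
    text: str
    preview: Optional[LinkPreview] = None  # shipped alongside the post metadata


def preview_for(post: Post, link: str, fetch: Callable[[str], LinkPreview]) -> LinkPreview:
    """Use the preview shipped with the post; fetch the link only if none was sent."""
    if post.preview is not None and post.preview.url == link:
        return post.preview
    return fetch(link)
```

Whether a receiving instance should trust a preview supplied by another server is the open question raised elsewhere in this thread; the sketch does not address it.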
@lucent @jwz @alexhammy209 Large parts of Twitter run on MySQL, I doubt it cannot handle your load ;-)
yeah, that has been brought up already with all the "mysql-tuner" chat, some fine tuning would for sure help. doesn't change the fact that each Mastodon instance pings the link to crawl the preview and causes most of the spike, ultimately causing MariaDB to fail.
@lucent @jwz @alexhammy209 Isn’t federation built on (conditional) trust? Yes, trusting server A to properly represent the words of users on server A is slightly different from trusting server A to give you a preview of third party server B. But not THAT different.
Sorry for adding stress to the situation, just checked the various other replies reaching my instance and I've been yet-another-one adding to the pile of workarounds to an issue that should be tackled on the instances' side too