Hi @piotrsikora. Great to see that kbin is responsive again and returning to usability. If possible, could you please give an update on what is going on currently with federation? It looks like some things are getting through (e.g. I can see this thread on reddthat) but threads from most lemmy instances are not showing up in a timely way in /newest still and at a quick glance it looks like communities in my collections are maybe a half day behind -- with many threads from the past week or more missing entirely.
I'm assuming some of that may be on the lemmy side -- 0.19 has a major issue with sequential message distribution as seen with lemmy.world <-> reddthat.com federation (see this bug report and this comment if you're unfamiliar) -- but it'd be best to hear from someone who has access to the infrastructure about what's going on rather than guessing.
In particular, it'd be helpful to know:
Also, should we @ you in addition to @ernest if we encounter problems on kbin.social?
Thank you!
Requirements Is this a bug report? For questions or discussions use https://lemmy.ml/c/lemmy_support Did you check to see if this issue already exists? Is this only a single bug? Do not put multipl...
@e0qdk
I can only check federation with our mastodon server (pol.social), and now is very quickly (about seconds).
Queue on kbin.social is empty, so everything should work instant. Maybe lemmy.world add some limitation due to problem with kbin.social.
Feel free to add if you encounter some problems with infrastructure ;)
I can only check federation with our mastodon server (pol.social), and now is very quickly (about seconds).
Checked with mastodon.com.pl and 101010.pl - both received my boost in seconds.
There was a bug with KBin some days ago where it would send huge amounts of federation activities to Lemmy instances which would overload them. To mitigate this, lemmy.world and some other instances had to block kbin.social. I believe the .world admins tried to get in contact with @[email protected]. You can read some more details here.
I would also suggest that you and Ernest join the Lemmy Admin chat on Matrix where this problem was found and discussed.
I though I’d better make a post about recent lemmy.world federation issues, since not everyone sees the daily posts. TL;DR: - Lemmy.world inbound federation is about 18 hours behind, and slowly catching up. - Outbound is fine: If you post or comment on something in a lemmy.world community, it will be almost instantly federated to other users, but if they reply to you then it will take 18 hours before you see it. - It was caused by an assumed Kbin bug in combination with a lemmy bug - All other instances seem fine for inbound and outbound federation On Thursday last week Lemmy.world began to receive hundreds of thousands of “activities” (actions - comment, post, upvote, etc) from Kbin to their communities. These actions then federate from Lemmy.world out to any instance with users subscribed to those communities. Unfortunately, this also helped uncover a big issue with how Lemmy handles inbound federation. Actions are sent one at a time, and the next can’t be sent until the last is done. This means the number of activities your server can receive depends on the latency between the sending and receiving server. In our case, the fact we are hosted in Auckland and Lemmy.world is hosted in Finland means that we can receive about 4 or 5 inbound activities from them per second. When the Kbin issue happened, suddenly Lemmy.world had hundreds of thousands of activities to send to us in addition to the normal inflow of content. We got way behind in terms of Lemmy.world content coming to us. At the peak, we were around 28 hours behind, about 550k activities behind. We are now sitting at around 19 hours behind, and will likely remain that way until this afternoon, when it should improve some more (it’s now peak time on the other side of the world). Here is a graph of the progress. Aussie.zone has a similar problem to us, but is closer to Europe where lemmy.world is hosted so the latency issue is less pronounced. We are one of the worst affected instances due to how it’s hard to get further away than where we are now. graph of lemmy.nz and aussie.zone showing how many activities each are behind, over the past 3 or 4 days. Huge jump, peaking at 550k, then slowly coming down, lemmy.nz about halfway down (250k), and aussie.zone finally reaching the bottom, to be only a few hundred behind (seconds or minutes of delay instead of hours) [https://lemmy.nz/pictrs/image/d3863c10-6a70-431e-b0da-f6158539e912.png] As you can see, with Aussie.zone being a bit closer, they were hit hard but not as hard as us, and have now managed to recover. We are still probably days away from being back in near real time federation with lemmy.world, but each day does seem to improve quite a bit. Other than picking up the server and moving it to Europe, there doesn’t seem to be a lot we can do except wait and hope the lemmy bug is fixed before lemmy.world grows so large that users are creating content faster than we can receive it. There have been some suggestions for short term band-aids. One is that we could set up a proxy in Europe, and filter out certain traffic. For example, we could prevent downvotes being sent to reduce the amount of traffic coming our way (since everything is sent one at a time, as per the ActivityPub protocol). I’m not super keen on doing anything that gets us out of sync with the others, though. We could of course move hosting to Europe, but that gets a lot more expensive (since we’d have to pay for it instead of being free-loaders) and also means the site would feel more sluggish, since then that latency would be for users on every page load instead of the behind the scenes federation. For now we are slowly getting back in sync with lemmy.world, and as long as there aren’t more Kbin bugs then we should be ok for now. Unfortunately doing things like blocking Kbin.social doesn’t seem to help, as lemmy.world still tries to send us the content from Kbin.social users. i.e. it’s rejected by our server, not prevented from being sent, so it still takes a slot. Any questions, suggestions, or concerns, reply to this post!