When imperfect systems are good: Bluesky's lossy timelines

https://jazco.dev/2025/02/19/imperfection/

When Imperfect Systems are Good, Actually: Bluesky’s Lossy Timelines

By examining the limits of reasonable user behavior and embracing imperfection for users who go beyond it, we can continue to provide service that meets the expectations of users without sacrificing scalability of the system.

Jaz’s Blog
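The summary above describes the idea only in general terms. One way to read it is as a probabilistic cap on timeline writes: users within a "reasonable" follow count receive every post, and users beyond it receive a shrinking fraction. The sketch below illustrates that reading and is not Bluesky's actual code. The threshold value and the names `REASONABLE_FOLLOW_LIMIT`, `loss_factor`, and `fan_out` are assumptions made for this example.

```python
import random

# Hypothetical threshold for a "reasonable" follow count; not a value from the source.
REASONABLE_FOLLOW_LIMIT = 2_000


def loss_factor(num_follows: int, limit: int = REASONABLE_FOLLOW_LIMIT) -> float:
    """Probability of keeping a timeline write for a user following num_follows accounts."""
    if num_follows <= limit:
        return 1.0
    return limit / num_follows


def fan_out(post_id, follow_counts, timelines, rng):
    """Append post_id to each follower's timeline, dropping some writes for heavy followers."""
    for follower, num_follows in follow_counts.items():
        # random() is in [0, 1), so a loss factor of 1.0 always keeps the write.
        if rng.random() < loss_factor(num_follows):
            timelines.setdefault(follower, []).append(post_id)


rng = random.Random(0)
follow_counts = {"typical_user": 300, "heavy_follower": 200_000}
timelines = {}
for i in range(1_000):
    fan_out(f"post-{i}", follow_counts, timelines, rng)
```

Under these assumptions the typical user's timeline receives all 1,000 posts, while the account following 200,000 users keeps only about 1% of them, so the write cost per heavy follower stays bounded no matter how many accounts they follow.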
Note that all of this reflects design decisions on Bluesky's closed-source "AppView" server—any federated servers interacting with Bluesky would need to construct their own timelines, and do not get the benefit of the work described here.

What reason does Bluesky give for not opening up their AppView code?

Another notable closed-source component is the discovery feed generator, where at least there is some reason for keeping it closed.

I asked this and got:

> We did a backend rewrite from postgres to scylla and it has a bunch of deployment specific stuff, but is functionally identical to the open source postgres version. Its not really a "v2" in terms of new features, we just made it make use of our hardware really well[1]

[1]: https://bsky.app/profile/iame.li/post/3l7e3jfqit22s

Eli Mallon (@iame.li)

Is the Bluesky AppView v2 not open source? Somebody said that to me once and I’ve never been able to find it anywhere


Thanks, so are both the Postgres and Scylla versions maintained in terms of new features?

I wasn't aware that AppView v1 was open source. The most recent info I'm aware of on the topic is https://alice.bsky.sh/post/3laega7icmi2q, https://github.com/bluesky-social/atproto/discussions/2961 and https://docs.bsky.app/docs/advanced-guides/federation-archit..., and everything I've heard about Bluesky is that an open source AppView is "still coming".

How to self-host all of Bluesky except the AppView (for now) — alice.bsky.sh

by Alice

It's not coming, it never went away… As I understand it, the "business layer" with all the logic sits above the data layer and is shared by the Postgres and Scylla versions, while the data layer just makes queries to the database. I think they are using the Postgres version locally for development.

Anyone following hundreds of thousands of users is obviously a bot account scraping content. I'd ban them and call it a day.

However, I do love reading about the technical challenge. I think Twitter has a special architecture for celebrities with millions of followers. Given Bluesky is a quasi-clone, I wonder why they did not follow in these footsteps.

You don't need to follow anyone (or even have an account) to scrape content… Someone following a huge number of accounts is usually trying to gain followers quickly through follow-backs.