Mastodawn

Sean Heber Mar 19

It seems to me that most people, when wrapping up a big project that consumed their life for a while, get excited, happy, and want to throw a party.

I, on the other hand, get weirdly depressed and dismayed. I question if it was worth it. I wonder if anyone will care or notice. I fear I wasted my time or that it was pointless. It's sort of a depressing time, mentally, in a strange way. I try to fight it, but the darkness is there anyway.

So anyway, just finishing up a huge Tapestry code project.

Sean Heber Mar 19

And before you get excited, it's not very user-visible. And that's why it hurts, maybe. It's important. It'll help in the situations it was meant to solve. But will you notice it day-to-day? I dunno. Ideally not if it's working correctly, anyway.

If you write 13,000 new lines of code in a forest and no one is there to review it, does it make a feature?

Sean Heber Mar 19

It was probably the most complex thing I've ever built.

I removed CKSyncEngine and did it all myself solving the problems we needed solved. Maintaining a synced database of items with unique IDs supporting cascade deletes and automatically handling conflicts without much participation from the server which knows nothing of our needs.

Do you know how hard that is? No, most people don't and never will.

Do you know how many ways it can go wrong? So many ways. Have I found them all? I don't know.

Sean Heber Mar 19

It's not just a replacement for CKSyncEngine - it's much more. One of the key problems we had was how CloudKit replays deletion tombstones going way back. Tapestry would download feed items, make a CKRecord for each, and put it in iCloud to sync to other devices. Then later, when an item gets old, we delete it. Over time, with busy feeds, you're looking at hundreds of expiring items per day for some people.

If you reinstall the app, it'd replay those deletions as CloudKit "catches you up" on new sync.

Those replays are a total waste of time but it's just how it works. It could take *hours* to restore the relatively small number of actually current records as CloudKit replays dead records the new install never had in the first place. Thousands and thousands of them.
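To put rough numbers on the replay cost described above, here is a toy Python model of a zone that remembers every deletion and hands all of them to a fresh install. It is only a sketch: `ToyZone`, `fetch_all_changes`, and `simulate` are made-up names for illustration, not CloudKit API, and the feed volume and retention period are invented.

```python
from dataclasses import dataclass, field


@dataclass
class ToyZone:
    """Hypothetical stand-in for one sync zone (not a CloudKit type)."""
    live: set = field(default_factory=set)
    tombstones: list = field(default_factory=list)

    def save(self, record_id):
        self.live.add(record_id)

    def delete(self, record_id):
        # The record goes away, but the deletion itself is remembered.
        self.live.discard(record_id)
        self.tombstones.append(record_id)

    def fetch_all_changes(self):
        """What a fresh install (no change token) receives in this model."""
        return sorted(self.live), list(self.tombstones)


def simulate(days, items_per_day, keep_days):
    """Each day adds items and expires everything older than keep_days."""
    zone = ToyZone()
    for day in range(days):
        for n in range(items_per_day):
            zone.save(f"item-{day}-{n}")
        expired_day = day - keep_days
        if expired_day >= 0:
            for n in range(items_per_day):
                zone.delete(f"item-{expired_day}-{n}")
    return zone.fetch_all_changes()


# Invented numbers: a year of use, 200 items a day, 30 days of retention.
live, tombstones = simulate(days=365, items_per_day=200, keep_days=30)
print(len(live), "live records,", len(tombstones), "replayed deletions")
```

With these invented numbers the fresh install ends up with 6,000 live records but has to sit through 67,000 deletions for records it never had.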

I had to rearchitect *everything* to redesign it around this one unchangeable behavior.

Sean Heber Mar 19

The important insight is that these replays are per-zone. So if you never delete anything in a zone, you won't have to suffer through the replays.

Except... we have to delete stuff or else iCloud will fill up with useless old records the app doesn't even care about anymore.

A conundrum.

My solution is using two zones. One keeps a manifest of current record references in CKRecords that are never deleted - only edited. Buckets. The other zone has the data.
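A rough Python sketch of how a two-zone layout like that might behave. Everything here is assumed for illustration: the bucket count, the hashing scheme, the `TwoZoneStore` name, and especially the restore path (reading the manifest and then fetching only the referenced records by ID) are guesses, not details from the thread.

```python
import zlib
from dataclasses import dataclass, field

BUCKET_COUNT = 4  # assumed; the thread does not say how many buckets exist


def bucket_for(record_id):
    # zlib.crc32 is stable across runs, unlike Python's salted built-in hash().
    return zlib.crc32(record_id.encode("utf-8")) % BUCKET_COUNT


@dataclass
class TwoZoneStore:
    """Hypothetical model: a manifest zone of buckets plus a data zone."""
    # Manifest zone: a fixed set of bucket records that are only ever edited.
    manifest: dict = field(
        default_factory=lambda: {b: set() for b in range(BUCKET_COUNT)}
    )
    # Data zone: the actual payloads, keyed by record ID.
    data: dict = field(default_factory=dict)
    manifest_deletes: int = 0  # stays 0: buckets are never deleted

    def insert(self, record_id, payload):
        self.data[record_id] = payload
        self.manifest[bucket_for(record_id)].add(record_id)  # an edit

    def expire(self, record_id):
        self.manifest[bucket_for(record_id)].discard(record_id)  # still an edit
        self.data.pop(record_id, None)  # the only real delete is in the data zone

    def restore_fresh_install(self):
        # Assumed restore path: read the buckets, then fetch only what they
        # reference, instead of walking the data zone's change history.
        wanted = set().union(*self.manifest.values())
        return {rid: self.data[rid] for rid in wanted if rid in self.data}


store = TwoZoneStore()
store.insert("article-1", "old news")
store.insert("article-2", "fresh news")
store.insert("article-3", "more fresh news")
store.expire("article-1")
restored = store.restore_fresh_install()
print(sorted(restored))
```

The point of the shape is that the zone a new install has to sync in full never accumulates deletions, so there is nothing to replay there.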

Sean Heber Mar 19

Except CloudKit doesn't allow atomic updates across zones.

So now I'm writing a database.

In the cloud.

Each request to each zone can fail. Each can need retries. The internet could fail between the two writes. Reading might work in one zone today but not the other because iCloud is evil and mysterious.

And don't even get me started about the rate limiting and batch and record size constraints.

At any moment the user might quit the app. Or it might crash. Or they might drop it in a lake.
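One common way to live without cross-zone atomicity is to order the two writes so that a failure in between leaves something harmless behind. The Python sketch below shows that idea under stated assumptions: `FlakyZone`, `with_retries`, `insert_item`, and `expire_item` are hypothetical names, failures are scripted rather than real, and nothing here is claimed to be how Tapestry actually does it.

```python
class FlakyZone:
    """Hypothetical key-value zone whose writes fail on a scripted schedule."""

    def __init__(self, failures=()):
        self.records = {}
        self._failures = list(failures)  # each True makes the next write fail

    def _maybe_fail(self):
        if self._failures and self._failures.pop(0):
            raise ConnectionError("simulated network failure")

    def put(self, key, value):
        self._maybe_fail()
        self.records[key] = value  # idempotent: repeating it is harmless

    def remove(self, key):
        self._maybe_fail()
        self.records.pop(key, None)  # idempotent as well


def with_retries(operation, attempts=3):
    """Run operation, retrying on ConnectionError up to `attempts` times."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise


def insert_item(data_zone, manifest_zone, item_id, payload):
    # Data first, reference second: a failure in between leaves an orphan
    # payload that nothing points at, which is invisible to readers.
    with_retries(lambda: data_zone.put(item_id, payload))
    with_retries(lambda: manifest_zone.put(item_id, True))


def expire_item(data_zone, manifest_zone, item_id):
    # Reference first, data second: readers never see a dangling reference
    # caused by our own delete.
    with_retries(lambda: manifest_zone.remove(item_id))
    with_retries(lambda: data_zone.remove(item_id))


def visible_items(data_zone, manifest_zone):
    return {k: data_zone.records[k]
            for k in manifest_zone.records if k in data_zone.records}


def orphans(data_zone, manifest_zone):
    return set(data_zone.records) - set(manifest_zone.records)


data_zone = FlakyZone()
manifest_zone = FlakyZone(failures=[True, True, True])  # internet dies mid-insert

try:
    insert_item(data_zone, manifest_zone, "a", "payload")
    first_attempt_failed = False
except ConnectionError:
    first_attempt_failed = True

orphans_after_failure = orphans(data_zone, manifest_zone)
visible_after_failure = visible_items(data_zone, manifest_zone)

# Later (next launch, next sync pass) the same insert is simply repeated.
insert_item(data_zone, manifest_zone, "a", "payload")
visible_after_retry = visible_items(data_zone, manifest_zone)
print(first_attempt_failed, orphans_after_failure, visible_after_retry)
```

Because every step is idempotent, "the app crashed halfway" and "the request failed" both reduce to "run it again later", plus an occasional sweep for orphans.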

Sean Heber Mar 19

While also imposing our constraint of one unique ID per type of record so you don't get duplicates.

And at any time you might have 2.. 3... 4 devices.. all syncing and refreshing at the same time wanting to insert new records - some of which might be duplicates.

Madness. It consumed me. For weeks. I am tired.
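The thread does not say how the "one unique ID per type of record" constraint is enforced. One common approach, sketched here in Python purely as an assumption, is to derive the record name deterministically from the type and the unique ID, so that every device computes the same name and a create-if-absent write collapses duplicates. `record_name`, `ToyServer`, and `create_if_absent` are hypothetical names, not CloudKit API.

```python
import hashlib


def record_name(record_type, unique_id):
    """Same type + same unique ID gives the same name on every device."""
    key = f"{record_type}\x00{unique_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return f"{record_type}-{digest[:32]}"


class ToyServer:
    """Hypothetical server that knows nothing except record names."""

    def __init__(self):
        self.records = {}

    def create_if_absent(self, name, fields):
        """Return True if the record was created, False if the name existed."""
        if name in self.records:
            return False
        self.records[name] = fields
        return True


server = ToyServer()
item_url = "https://example.com/posts/42"  # made-up feed item ID

# Three devices refresh the same feed at about the same time.
results = []
for device in ("phone", "tablet", "laptop"):
    name = record_name("FeedItem", item_url)
    created = server.create_if_absent(name, {"url": item_url, "from": device})
    results.append(created)

print(results, len(server.records))
```

Only the first writer creates the record; the others find the name already taken and can treat that as a conflict to merge rather than a new item.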

Sean Heber Mar 19

You might wonder why bother syncing all the items? And yes, valid point and one I've frequently regretted, but even without them the problem is still there - just perhaps less.

And less of a problem is still a problem. So IMO it's moot. It needed solving either way.

Sean Heber Mar 19

And if I sound confident I've solved it with this massive project, well, I'm not. I'm still worried. This comes with a data migration I wish I could avoid. It comes with 13k new lines of code that are not yet battle-tested.

But the only way out is through.

Sean Heber Mar 19

Another thing you might wonder is, why do other apps not have this problem?

Most similar apps pull from a central location - all from one server. If they're trying to do like Tapestry pulling from multiple sources, they often put that logic on their own server. They consolidate it there. They own it.

We're not doing that. Your device fetches the articles. It owns them. It only uses iCloud to sync. If you turn off iCloud, Tapestry still works.

We don't have your data either way.

Sean Heber Mar 19

It's a design choice that made things much harder. Was it the right choice? I don't know. But it was the choice. It's more difficult this way, but it's also more satisfying. Does that make it better? Maybe for some people.

Duncan Babbage Mar 19

@bigzaphod damn, well I’ll just say I understand exactly the problem, I’d love to have a solution in hand that solved this problem, and over here I’m *suitably impressed*.

NMDoerner Mar 19

What a truly herculean task, amazing! Thanks for sharing that, it is good to read such things.

I know exactly what you're feeling, working on a more speedy redesign of the #NeoFinder database format, which has similar requirements...

Congratulations! Treat yourself to something really nice, you more than deserve it!

James Baker 🍁 Mar 19

@bigzaphod As someone frequently surrounded by people afraid to make *any* choice, at least you did it! And probably learned a lot from it.

@bigzaphod i for one appreciate all the work you put into Tapestry as the app has made my social media life a lot better…

Monty Hayter 🇨🇦 Mar 19

@bigzaphod On whether it’s the right choice, I can say this much: from my perspective, anything that minimizes or eliminates having user data in your hands is the right choice.

Sometimes practical realities get in the way, but it’s always the right direction. Even though many people seem not to care and have given up on privacy.

@bigzaphod I think folk who use your products love your quality and integrity. A lot of people just expect things to work, but there are always those who appreciate elegance in the functionality and understand that simplicity does not mean easily-done.

Jurgis Kirsakmens Mar 19

@bigzaphod We do this not because it is easy. We do it because it is hard.

@bigzaphod So basically the device is the server? And then that makes sure everywhere else is "up to date" at that time?

Umm.. That sounds hard.

Chris Silverman 🌻 Mar 19

@bigzaphod This whole thing was really interesting. Thanks for sharing.

I'm fascinated by this under-the-hood stuff. I wish more devs got into the weeds on the invisible stuff.

Ged Maheux Mar 19

@bigzaphod So says Deanna (and Reg)! 🖖

Chris Alemany🇺🇦🇨🇦🇪🇸 Mar 19

@bigzaphod I am very appreciative of all your work and toil! While users might never notice the thing you did specifically... they'll always notice that the app works and you'll know it works because you did it! That's worth celebrating!

@bigzaphod Wow 🤩 amazing! Best of luck!

@bigzaphod I think the suggested approach for this is to do a first time straight fetch out of the zone, then grab changes but ignore the replay data until it catches up.

Sean Heber Mar 19

@paul yeah, I considered just doing it that way, but you still get fed the replay data which is pretty dumb and still takes a long time. I don't know why they don't embed a date of creation or something into the change token. When you make a new one, you don't need any replay that predates it. Seems like that'd be an "easy" fix for this. But 🤷‍♂️

@bigzaphod I'm sure, or at least hoping, there are reasons it's done this way, but no idea. I think you can trim the amount of data that comes in the replay, but I don't remember the details.