I'm sure I made plenty of mistakes, but I have to admit I find it surprisingly satisfying to be able to operate on a data type that I can overlay on top of the existing #FedBOX storage engines and get native and *fast* querying for them.
The indexes are quite chunky despite being built on top of roaring bitmaps, because there are so many "indexable" elements in an #ActivityPub object. (Currently I'm indexing the type, the content, summary, name, preferredUsername, the recipients, the actor, and the object.)
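To give a flavour of the idea: one roaring bitmap per token, per property, with queries done as bitmap intersections. This is just a toy sketch of the general shape, not FedBOX code; the `tokenIndex` type and the sample objects are made up.

```go
package main

import (
	"fmt"
	"strings"

	"github.com/RoaringBitmap/roaring"
)

// tokenIndex maps a token to the set of object refs (uint32) that contain it.
type tokenIndex map[string]*roaring.Bitmap

func (t tokenIndex) add(token string, ref uint32) {
	bm, ok := t[token]
	if !ok {
		bm = roaring.New()
		t[token] = bm
	}
	bm.Add(ref)
}

func main() {
	// One bitmap index per indexable property.
	byType := tokenIndex{}
	byContent := tokenIndex{}

	// Index two made-up objects, refs 1 and 2.
	byType.add("Note", 1)
	byType.add("Article", 2)
	for _, w := range strings.Fields("hello fediverse") {
		byContent.add(w, 1)
	}
	for _, w := range strings.Fields("hello indexes") {
		byContent.add(w, 2)
	}

	// Querying is a bitmap intersection: Notes whose content contains "hello".
	hits := roaring.And(byType["Note"], byContent["hello"])
	fmt.Println(hits.ToArray()) // [1]
}
```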
As I explore some more, I hope to streamline some of these issues and make the whole thing more robust.
Frantic day today: around 10h of productive work on improving the Index and moving it into the go-ap/filters module.
+1510/-11 lines, of which 987 belong to tests.
Coverage is not entirely sufficient yet, because it's missing checks for the top-level Index.Add() and Index.Search() methods.
Another thing left to do is persistence to disk.
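Roaring bitmaps already know how to serialize themselves, so persistence could be little more than WriteTo/ReadFrom on each bitmap. A sketch under that assumption; the `save`/`load` helpers are hypothetical, not what the module actually ships:

```go
package index

import (
	"os"

	"github.com/RoaringBitmap/roaring"
)

// save writes a bitmap using roaring's portable serialization format.
func save(bm *roaring.Bitmap, path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = bm.WriteTo(f)
	return err
}

// load reads a bitmap back from disk.
func load(path string) (*roaring.Bitmap, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	bm := roaring.New()
	_, err = bm.ReadFrom(f)
	return bm, err
}
```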
The **reason** I wanted to move yesterday's work into this module is that, instead of the custom client.SearchByX() functions, I wanted to retrofit the existing functionality already present in the filters module. Ah, also moving the bitmaps themselves to a semblance of generic types...
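Roughly what I imagine by "a semblance of generic types": an index parameterized over the item and token types, with Add/Search as bitmap operations. A hypothetical sketch with signatures of my own invention, not the actual go-ap/filters API (the real example is linked below):

```go
package index

import "github.com/RoaringBitmap/roaring"

// ExtractFn pulls the indexable tokens of type T out of an item of type It.
type ExtractFn[It any, T comparable] func(It) []T

// Index keeps one roaring bitmap per extracted token.
type Index[It any, T comparable] struct {
	extract ExtractFn[It, T]
	tokens  map[T]*roaring.Bitmap
}

func New[It any, T comparable](fn ExtractFn[It, T]) *Index[It, T] {
	return &Index[It, T]{extract: fn, tokens: make(map[T]*roaring.Bitmap)}
}

// Add indexes the item under ref, a stable uint32 derived from e.g. its IRI.
func (i *Index[It, T]) Add(ref uint32, it It) {
	for _, tok := range i.extract(it) {
		bm, ok := i.tokens[tok]
		if !ok {
			bm = roaring.New()
			i.tokens[tok] = bm
		}
		bm.Add(ref)
	}
}

// Search returns the refs that match all the given tokens (bitmap AND).
func (i *Index[It, T]) Search(tokens ...T) *roaring.Bitmap {
	res := roaring.New()
	for n, tok := range tokens {
		bm, ok := i.tokens[tok]
		if !ok {
			return roaring.New() // a missing token matches nothing
		}
		if n == 0 {
			res = bm.Clone()
		} else {
			res.And(bm)
		}
	}
	return res
}
```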
The full (working) example can be found here: https://pkg.go.dev/github.com/go-ap/filters#example-SearchIndex
The experiment of using roaring bitmaps as the foundation for indexing #ActivityPub objects is half successful and half not.
The good news is that soon I'll be able to replace the #brutalinks client access to its #ActivityPub backend with something built on top of local storage that makes use of the indexes, and is therefore much, much faster.
The bad news is that adding indexing to the storage backends themselves didn't yield much of a performance gain, but I suspect I'm doing something wrong.
It's a painful realization to come to: no matter how much effort I put into making my #ActivityPub server fast, it's still going to suck if, in order to build a meaningful page for a user, the client needs to make many requests.
So the #brutalinks link aggregator now makes use of asynchronous collection fetching, with content rendering then done from local storage.
This decreased the loading times to probably less than half of what they were before.
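The shape of it is roughly: fetch all the collections a page needs concurrently, persist them, and only render once everything is local. A hedged sketch using errgroup; fetchCollection, store, and renderPage are hypothetical stand-ins, not brutalinks' actual functions:

```go
package main

import (
	"context"
	"fmt"

	"golang.org/x/sync/errgroup"
)

// Hypothetical stand-ins for the real remote-fetch / storage / render code.
func fetchCollection(ctx context.Context, iri string) ([]string, error) {
	// ... remote ActivityPub GET for the collection page(s) ...
	return []string{iri + "#item-1"}, nil
}

func store(items []string) error {
	// ... persist the fetched items into local storage ...
	return nil
}

func renderPage() {
	// ... build the page purely from local storage, no network calls ...
	fmt.Println("rendered")
}

// loadPage fetches every collection the page needs concurrently,
// waits for all of them, then renders from local storage only.
func loadPage(ctx context.Context, collections []string) error {
	g, gctx := errgroup.WithContext(ctx)
	for _, iri := range collections {
		iri := iri // per-iteration copy, needed before Go 1.22
		g.Go(func() error {
			items, err := fetchCollection(gctx, iri)
			if err != nil {
				return err
			}
			return store(items)
		})
	}
	if err := g.Wait(); err != nil {
		return err
	}
	renderPage()
	return nil
}

func main() {
	_ = loadPage(context.Background(), []string{"https://example.com/inbox"})
}
```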
However, I still need to find a good model for aggregating and balancing all the sequential loads with an eventual asynchronous, sequential sending of activities.