#ActivityPub developers only please: how many items should be in a full collection page?

#EvanPoll #poll

Around 12 or fewer: 10.7%
Around 20: 30.4%
Around 50: 32.1%
Around 100 or more: 26.8%

@[email protected] As an opinion that is likely to be very unpopular... the page size should ideally be set by the client. Only the client knows its own resource makeup and its ability to process the returned information.

Otherwise, we tend to run page sizes between 60 and 100 depending on the content. And if the collection has fewer than 100 entries (especially when returning an id-list rather than a list of activities), we'll usually just send them out without paging.

The desire is to balance resource usage to get the highest rate of information transfer. Page size and whether to page at all are the only levers we have available, and they (currently) aren't settable by clients, so it seems the best we can do is default to "large chunks".

I admit that I'm not fond of the page size of 12 that I found in Eugen's followers list of over half a million entries (some years ago). That's a lot of network requests and makes their clients work a lot harder than they need to.
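
For a rough sense of scale, here is a minimal sketch of the request count. It assumes exactly 500,000 followers (the post only says "over half a million") and one request per page; `requests_needed` is a hypothetical helper, not part of any library.

```python
import math


def requests_needed(total_items: int, page_size: int) -> int:
    """Number of page fetches needed to walk an entire collection."""
    return math.ceil(total_items / page_size)


FOLLOWERS = 500_000  # assumed round figure for "over half a million"

for size in (12, 20, 50, 100):
    print(size, requests_needed(FOLLOWERS, size))
# 12 41667
# 20 25000
# 50 10000
# 100 5000
```

Under those assumptions, a page size of 12 costs roughly eight times as many round trips as a page size of 100.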

@macgirvin the 12-item page size is a real kick in the teeth.
@macgirvin oh, but: I disagree about configurable page sizes. I think pages should have stable contents, have last-modified dates, and be easily cacheable. It makes traversal and synchronization much better.
@macgirvin like, only the most recent page should be volatile. Except for deletions, older pages should not change.

@[email protected] in reverse chron every page changes when a new item is pushed to the stack.

Even in chronological sets, if you need to account for deletions, you've basically given up already, because there's no immutability guarantee! This is why caching headers exist, no?

@julian that's the way to do reverse chron that messes with caches!

Instead, number your pages from oldest to newest. So page 1 was the first page created. After PAGE_SIZE items have been added, create page 2, make it the `first` page, and now page 1 never changes (unless one of its items gets `Remove`d). All your volatility is in the most recent page, and older pages rarely change.
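
The numbering scheme above can be sketched roughly as follows. This is an in-memory illustration only, not how any particular server does it: `PagedCollection`, `add`, `page`, and `next_page` are hypothetical names, `PAGE_SIZE` is set to 3 just to keep the output short, and a new page is opened lazily when the next item arrives.

```python
from dataclasses import dataclass, field
from typing import Optional

PAGE_SIZE = 3  # deliberately tiny so the behaviour is easy to see


@dataclass
class PagedCollection:
    # pages[0] is page 1 (the oldest); each page stores items oldest first.
    pages: list[list[str]] = field(default_factory=lambda: [[]])

    def add(self, item_id: str) -> None:
        """Append to the newest page, opening a new one once it is full."""
        if len(self.pages[-1]) >= PAGE_SIZE:
            self.pages.append([])  # this becomes the new `first` page
        self.pages[-1].append(item_id)

    @property
    def first(self) -> int:
        """Number of the newest page, i.e. what `first` should point at."""
        return len(self.pages)

    def page(self, number: int) -> list[str]:
        """Items on page `number`, newest first (reverse chronological)."""
        return list(reversed(self.pages[number - 1]))

    def next_page(self, number: int) -> Optional[int]:
        """`next` walks toward older pages; page 1 has no `next`."""
        return number - 1 if number > 1 else None


collection = PagedCollection()
for n in range(1, 8):
    collection.add(f"item{n}")

print(collection.first)    # 3
print(collection.page(3))  # ['item7']
print(collection.page(1))  # ['item3', 'item2', 'item1']
```

Adding an eighth item only touches page 3; pages 1 and 2 keep the same contents, which is what makes them cacheable.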

You can also use UUIDs or other IDs for pages.

@julian the downside is that your `first` page is rarely full. You can get around this by having the `first` page be up to 2 * PAGE_SIZE in length, and shifting PAGE_SIZE items to a new page when the `first` page hits its max.
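
A minimal sketch of that workaround, under one reading of it: the volatile `first` page is kept separate from the numbered stable pages, and once it reaches 2 * PAGE_SIZE items its oldest PAGE_SIZE items are frozen into a new stable page. `BufferedCollection`, `head`, and `archived` are hypothetical names, and `PAGE_SIZE` is again tiny for illustration.

```python
PAGE_SIZE = 3  # deliberately tiny so the behaviour is easy to see


class BufferedCollection:
    def __init__(self) -> None:
        # Full, stable pages in creation order (oldest page first).
        self.archived: list[list[str]] = []
        # The volatile `first` page, items stored oldest first.
        self.head: list[str] = []

    def add(self, item_id: str) -> None:
        self.head.append(item_id)
        if len(self.head) >= 2 * PAGE_SIZE:
            # Freeze the oldest PAGE_SIZE items into a new stable page
            # and keep the newest PAGE_SIZE items on the `first` page.
            self.archived.append(self.head[:PAGE_SIZE])
            self.head = self.head[PAGE_SIZE:]


buffered = BufferedCollection()
for n in range(1, 8):
    buffered.add(f"item{n}")

print(buffered.archived)  # [['item1', 'item2', 'item3']]
print(buffered.head)      # ['item4', 'item5', 'item6', 'item7']
```

Once the collection holds at least 2 * PAGE_SIZE items, the `first` page in this sketch never drops below PAGE_SIZE items, and every archived page is exactly PAGE_SIZE long.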