today I'm thinking about how floating point numbers can be treacherous -- what are specific examples of when they've betrayed you?

so far I have:

* money calculations where you need to get the exact correct answer (you have to use integers, not floats)
* twitter's tweet IDs are bigger than 2^53 and can't be represented exactly as a JavaScript number, so you have to use a string
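
A minimal Python sketch of both failure modes, for illustration only. The amounts and the ID are made-up values (the ID is just the smallest integer above 2^53 that a float can't hold, not a real tweet ID).

```python
from decimal import Decimal

# Money: most decimal fractions have no exact binary float representation.
float_total = 0.1 + 0.2
print(float_total == 0.3)  # False

# Two common workarounds: integer cents, or decimal arithmetic.
cents_total = 10 + 20
decimal_total = Decimal("0.10") + Decimal("0.20")
print(cents_total == 30)                  # True
print(decimal_total == Decimal("0.30"))   # True

# IDs: a 64-bit float has a 53-bit significand, so integers above 2**53
# start to collide with their neighbours.
big_id = 2**53 + 1                         # hypothetical ID
print(float(big_id) == float(2**53))       # True: two different IDs, same float
print(int(float(big_id)) == big_id)        # False: the round trip loses the ID

# Keeping the ID as a string avoids the problem entirely.
id_as_string = str(big_id)
print(int(id_as_string) == big_id)         # True
```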

I'm looking for specific examples of what code you were writing that broke, not general facts about floats

this floating point post had so many great replies that I had to write a custom Python program to download them all so that I could really read them properly

(well, you could argue that I didn't technically *have* to, but programmers gotta program)

downloading Mastodon replies to a post is surprisingly hard: I ended up just downloading all my mentions and statuses and then writing a recursive CTE in SQL to get all of the replies to a given post.
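
A rough sketch of what a recursive CTE like that could look like, using Python's built-in `sqlite3`. The table name `statuses`, its columns, and the sample rows are assumptions for illustration, not the actual schema or query from the program described above.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a tiny, made-up reply tree."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE statuses (id TEXT PRIMARY KEY, in_reply_to_id TEXT, content TEXT)"
    )
    conn.executemany(
        "INSERT INTO statuses VALUES (?, ?, ?)",
        [
            ("1", None, "original post"),
            ("2", "1", "direct reply"),
            ("3", "2", "reply to the reply"),
            ("4", "1", "another direct reply"),
            ("5", None, "unrelated post"),
        ],
    )
    return conn


def all_replies(conn, root_id):
    """Return the ids of every status that replies to root_id, at any depth."""
    query = """
        WITH RECURSIVE thread(id) AS (
            -- base case: direct replies to the root post
            SELECT id FROM statuses WHERE in_reply_to_id = ?
            UNION
            -- recursive step: replies to anything already in the thread
            SELECT s.id FROM statuses AS s
            JOIN thread AS t ON s.in_reply_to_id = t.id
        )
        SELECT id FROM thread ORDER BY id
    """
    return [row[0] for row in conn.execute(query, (root_id,))]


conn = build_demo_db()
print(all_replies(conn, "1"))  # ['2', '3', '4']
```

The ids are stored as text here, which also sidesteps the big-integer problem from the start of the thread.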

There's a `/context` endpoint that returns replies to a status but it doesn't return all of them and it doesn't seem to support pagination.

@b0rk @huitema developed a Mastodon crawler to do that (raising interesting privacy issues).

@b0rk

I'm surprised by this. Each post has a replies collection that you can fetch. Fetching collections is slightly annoying (because they're usually paginated and paginated collections in ActivityPub work by each page providing the link to the next page), but is not too bad.
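
Roughly, walking a paginated collection by following each page's `next` link could look like the sketch below. It assumes `first` and `next` are either URL strings or inlined pages (Link objects are not handled), and the `example.test` URL and items are made up so the demo runs without touching the network.

```python
import json
import urllib.request


def fetch_json(url):
    """Fetch a URL as ActivityPub JSON (used for real requests only)."""
    req = urllib.request.Request(url, headers={"Accept": "application/activity+json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_collection(collection, fetch=fetch_json, max_pages=100):
    """Yield items from a collection, following `next` links page by page."""
    # An unpaginated collection carries its items directly, so treat it as a page.
    page = collection.get("first", collection)
    for _ in range(max_pages):  # safety cap in case the links loop
        if isinstance(page, str):
            page = fetch(page)
        yield from page.get("orderedItems") or page.get("items") or []
        page = page.get("next")
        if page is None:
            break


# Demo with in-memory pages instead of real HTTP requests.
fake_pages = {"https://example.test/replies?page=2": {"items": ["reply-c"]}}
replies = {
    "type": "Collection",
    "first": {
        "type": "CollectionPage",
        "items": ["reply-a", "reply-b"],
        "next": "https://example.test/replies?page=2",
    },
}
print(list(iter_collection(replies, fetch=fake_pages.__getitem__)))
# ['reply-a', 'reply-b', 'reply-c']
```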

Is this more annoying than I realize or does that simply not work?

@robryk what's the specific api endpoint you're talking about? if it's `GET /api/v1/statuses/:id/context`, it didn't work for me (no pagination)

@b0rk

There's no fixed URL. The URL can be read from the `replies` field of the post itself: https://paste.sr.ht/~robryk/40f933d4009d3ee0ade413ae7c0163da5db74dc8

@b0rk It's not a Mastodon API, but an ActivityPub API.

@robryk thanks, that looks promising! I haven't learned about ActivityPub APIs yet

@b0rk

The appropriate part of the spec for this particular thing is https://www.w3.org/TR/activitystreams-core/#collections, in case it's helpful (it describes all the weird quirks where a field can be a string with the URL of the object, a Link object that describes the same, or the object inlined, which then might or might not have a canonical URL of its own...).
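
One way to cope with those three shapes is a small normalizing helper, sketched below. The function names are hypothetical, and it only recognizes a plain `"type": "Link"` (not Link subtypes or a list of types), so treat it as a starting point rather than a full reading of the spec.

```python
def field_url(value):
    """Return the URL a field points at, or None if there isn't one."""
    if isinstance(value, str):
        return value                      # bare URL string
    if isinstance(value, dict):
        if value.get("type") == "Link":
            return value.get("href")      # Link object: the URL is in href
        return value.get("id")            # inlined object: id may be missing
    return None


def resolve(value, fetch):
    """Return the object itself, fetching it only when it isn't inlined."""
    if isinstance(value, dict) and value.get("type") != "Link":
        return value                      # already inlined, nothing to fetch
    url = field_url(value)
    return fetch(url) if url else None


# Demo with a fake fetcher so nothing hits the network.
fake_objects = {"https://example.test/replies": {"type": "Collection", "items": []}}
as_string = "https://example.test/replies"
as_link = {"type": "Link", "href": "https://example.test/replies"}
as_inline = {"type": "Collection", "items": ["x"]}  # inlined, no id of its own

print(field_url(as_string), field_url(as_link), field_url(as_inline))
print(resolve(as_link, fake_objects.get))
print(resolve(as_inline, fake_objects.get))
```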

@b0rk do you need to be logged in to do this, or can you get it via RSS?

@b0rk I had a play and found this

curl 'https://mastodon.social/users/b0rk/statuses/109670855322107281/replies.json?only_other_accounts=true&page=true' | json_reformat

Which doesn't download all the replies at once, but looks like you can dig into the json and pull out 'next' to get more.

I don't know if it will eventually provide all the results or not.

Is that the approach you took? would love to know more.

@b0rk you can add .json at the end of the URL and it should return some API style data - with a bit of playing around I found this as a working version for this toot: https://mastodon.social/users/b0rk/statuses/109677955536109712/replies?only_other_accounts=true&page=true
@b0rk Have you checked how all the replies are pulled when you view the post on Mastodon's web interface? Because that definitely shows them all.
@b0rk Hmm, it's hitting /context once. Are you sure that's missing some?
@b0rk Okay it's not actually unlimited, it's just set to such a high number that most of the time it returns all.
@b0rk Ah, but it's also capped at lower levels if it's an anonymous API request. Maybe that's the issue you were hitting?
@effy wow yes I think that was the issue, thank you so much!!
@b0rk @effy not sure if this was mentioned but if you’re looking at replies to someone else’s toot you want to ask their server. Your server will only know the ones from people that are followed by people on your server. Their server will have them all.
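
Putting the last few replies together (ask the post's own server, and authenticate so the lower anonymous cap doesn't apply), a request could be built roughly like this. It assumes the last path segment of the status URL is the ID the server's API expects, and the token is a placeholder; the sketch only builds the request and does not send it.

```python
import urllib.request
from urllib.parse import urlsplit


def context_request(status_url, token=None):
    """Build a /context request aimed at the server that hosts the post."""
    parts = urlsplit(status_url)
    # Assumption: the status ID is the last path segment of the post's URL.
    status_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    api_url = f"{parts.scheme}://{parts.netloc}/api/v1/statuses/{status_id}/context"
    headers = {"Accept": "application/json"}
    if token:
        # Authenticated requests are not subject to the lower anonymous cap.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(api_url, headers=headers)


req = context_request(
    "https://mastodon.social/users/b0rk/statuses/109670855322107281",
    token="HYPOTHETICAL_TOKEN",  # placeholder, not a real credential
)
print(req.full_url)
print(req.get_header("Authorization"))
```

Passing the result to `urllib.request.urlopen` would perform the actual call; the token has to belong to an account on the server being asked.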
@b0rk And therefore you had to, and thus you were right. 😎