Mastodawn

There's also a blog post that says almost the exact same thing as this post, but I wanted to have a more permanent location for the announcement since fedi posts tend to disappear over time.

pho.spookygirl.boo/phame/post/view/49/okay._it_s_ready./

✩ Okay. It's ready.

Philosophically opposed to tracking so I don't keep access logs and phorge doesn't track it either .. but this means I don't know if anyone's tried it yet!!

it would be so incredible to see the #mediathing hashtag trend on spookygirl!!

@amy i have a theory about architecting effective telemetry for tooling which constructs it as a feature the user can employ and secondarily can send to you if they want. i'm not sure if/how it works for downloads or other networked systems https://circumstances.run/@hipsterelectron/113994925850938719

d@nny disc@ mc² (@hipsterelectron@circumstances.run)

this really isn't that hard and there are actually established best practices here:
- produce telemetry *locally* on every run, so that users can send it over to assist in debugging specific issues without accepting your fucking malware
- have *opt-in* support for writing the telemetry from local runs *to a specific remote server*, so that specific site installations of your tool (e.g. within a corp network) can use telemetry to debug and optimize your tool on their own site
- finally, have a global server that people can write telemetry to, so you can gather aggregate statistics and information from user bases who want to help improve your software together with you!
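The three tiers in the quoted post can be sketched roughly as follows. This is a minimal illustration in Python, not how pants or any other tool actually does it: `LocalFirstTelemetry`, `record`, `local_events`, and the `remote_sink` callable are all hypothetical names, and the network upload itself is left out, so nothing is sent anywhere unless a sink is supplied.

```python
import json
from pathlib import Path
from typing import Callable, List, Optional


class LocalFirstTelemetry:
    """Hypothetical sketch: events always go to a local file; they are
    forwarded elsewhere only if the user supplied a sink (opt-in)."""

    def __init__(self, log_path: Path,
                 remote_sink: Optional[Callable[[dict], None]] = None):
        self.log_path = log_path
        # None means the user has not opted in: nothing leaves the machine.
        self.remote_sink = remote_sink

    def record(self, name: str, **fields) -> None:
        event = {"event": name, **fields}
        # Tier 1: always append to a local JSON-lines file the user owns.
        with self.log_path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(event, sort_keys=True) + "\n")
        # Tiers 2 and 3: a site-local or global server, only when opted in.
        if self.remote_sink is not None:
            self.remote_sink(event)

    def local_events(self) -> List[dict]:
        """Read back what was recorded, e.g. to attach to a bug report."""
        if not self.log_path.exists():
            return []
        with self.log_path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]
```

Leaving `remote_sink` at its default keeps everything on the user's machine; a site install or a global collector would only be passed in as the sink after the user asks for it.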

GSV Sleeper Service

@hipsterelectron it's not suuuper applicable to tracking downloads but it does initially appear to be a pretty good model though I'd have to think about it in more detail.

I think there's more to do in terms of, e.g., providing proper anonymity for things like diagnostic traces (in the opentelemetry sense), but that's more application-specific. It certainly begins with not even assigning individual clients/installs any unique identifier, because the moment you have something like that you open yourself up to accidentally being able to identify people. You also need to ensure that your architecture doesn't accidentally capture data like IP addresses. Somewhat unintuitively, you probably even need to ensure your stack traces don't include things like variable values, or else get explicit consent from the user to transmit information that could identify them if they, say, send telemetry for a bug report a second time.

In short you need to not just make it opt-in but also ensure that you are auditing what data is collected to be sure that it can't be used for (accidental) correlation. Huge pain in the ass which is why I don't do it. The most I have is a diagnostic command in media-thing that generates extremely high level information that's mostly so that people can identify configuration issues.
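The point above about stack traces can be illustrated with a small sketch. It assumes Python and the standard `traceback` module; `sanitized_trace` is a hypothetical helper, not something from media-thing, and it deliberately keeps only the exception type and frame locations.

```python
import os
import traceback
from typing import Any, Dict


def sanitized_trace(exc: BaseException) -> Dict[str, Any]:
    """Hypothetical helper: describe where an exception happened without
    including local variables, source lines, or the exception message."""
    frames = [
        {
            # Base name only: full paths often contain a user name.
            "file": os.path.basename(frame.filename),
            "function": frame.name,
            "line": frame.lineno,
        }
        for frame in traceback.extract_tb(exc.__traceback__)
    ]
    # str(exc) is deliberately omitted: messages often embed user data.
    return {"type": type(exc).__name__, "frames": frames}
```

Even this is not a guarantee of anonymity (function names and line numbers can still hint at what a user was doing), which is the kind of auditing burden described above.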

@amy could the client provide their own choice of identifier in order to opt in? (wouldn't replace the other stuff you described.) it sounds like actively designing the stuff that's collected makes it easier to throw everything else away instead of post-hoc sanitizing. the diagnostic command sounds like an example of this

@amy we used opentelemetry in pants where we employed that telemetry approach i described and it was great because going local-first means the users have access to zipkin traces themselves which is a very powerful debugging tool that software like gradle gates behind its enterprise edition

@amy oh just saw you said post-hoc sanitization is a huge pain in the ass and not worth it. makes sense

@hipsterelectron yeah it's really, really hard to do well and even when done well frequently has false positives. (E.g., a 9 or 10 digit number could be a phone number or it could just be a number.) Typed metrics can help as well as having functions in your code base to mark data as sensitive, but all of these run afoul of mistakes ;)
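A rough sketch of the two ideas mentioned here, typed metrics and a function for marking data as sensitive, might look like this. Everything in it is an assumption for illustration: the `Sensitive` wrapper, the `SCHEMA` allowlist, and the metric names are hypothetical.

```python
from typing import Any, Dict


class Sensitive:
    """Hypothetical wrapper marking a value as never-to-be-exported."""

    def __init__(self, value: Any):
        self._value = value

    def __repr__(self) -> str:
        # Avoid leaking the value through logs or debug output.
        return "<redacted>"

    def reveal(self) -> Any:
        return self._value


# Typed metrics: only declared names with the declared type are exported.
# These metric names are made up for the example.
SCHEMA: Dict[str, type] = {
    "files_scanned": int,
    "scan_seconds": float,
    "cache_hit": bool,
}


def exportable(fields: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only declared, correctly typed, non-sensitive metrics."""
    out: Dict[str, Any] = {}
    for name, value in fields.items():
        expected = SCHEMA.get(name)
        if expected is None or isinstance(value, Sensitive):
            continue  # undeclared or explicitly sensitive: drop it
        if type(value) is expected:  # strict: a bool is not an int here
            out[name] = value
    return out
```

An allowlist like this drops a stray 10-digit number unless someone declares a metric for it, but as noted it still depends on nobody declaring or wrapping things incorrectly.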

@hipsterelectron yeah, having uuids that the client can rotate or set directly is not terrible and even enables organizations (of any sort) to share an id across installs of the software to intentionally provide correlation for their data alone.
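A client-controlled identifier along these lines could be sketched as below, assuming the id lives in a plain file the user can inspect. `TelemetryId` and its methods are hypothetical names; no id exists at all until the user creates one.

```python
import uuid
from pathlib import Path
from typing import Optional


class TelemetryId:
    """Hypothetical sketch of an opt-in id the client fully controls."""

    def __init__(self, path: Path):
        self.path = path

    def get(self) -> Optional[str]:
        # No file means the user never opted in, so there is no id.
        if not self.path.exists():
            return None
        return self.path.read_text(encoding="utf-8").strip()

    def rotate(self) -> str:
        """Replace the id with a fresh random one, breaking correlation."""
        new_id = str(uuid.uuid4())
        self.path.write_text(new_id, encoding="utf-8")
        return new_id

    def set(self, value: str) -> str:
        """Use a chosen id, e.g. one shared across an organization's installs."""
        self.path.write_text(value, encoding="utf-8")
        return value

    def clear(self) -> None:
        """Opt back out by deleting the id."""
        self.path.unlink(missing_ok=True)
```

Sharing one value via `set` across several installs gives an organization correlation over its own data only, while `rotate` or `clear` lets any client drop that link again.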

0xC0DEC0DE07E9 Jun 30

@amy oh, neat, I like Jellyfin scraping online databases and such, but I can also imagine being in a close-knit friend group or whatnot and wanting to share persistent impressions and inject personal commentary instead of or in addition to that.