LLMs can unmask anonymous internet users for $1–4 each, matching 67% of pseudonymous Hacker News accounts to real LinkedIn profiles at 90% precision

https://feddit.org/post/27214147


Crossposted from https://lemmy.world/post/43902414

Fuuuuuck, I almost forgot about digital fingerprints. For anyone unacquainted: the way you type, spell, and text, the words and phrases you use, your grammar, everything ends up as your digital fingerprint. Just because your username changes doesn’t mean your typing does. This is how you will be tracked. Clearing cookies, using different names, and covering your tracks all help, but when you type the same way, it’s still you.
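To make the fingerprint idea concrete, here is a minimal sketch of classical stylometry: compare two text samples by the cosine similarity of their character trigram counts. This is not the LLM-based method from the headline, just a rough illustration of how habits in spelling and punctuation turn into a comparable signal. The function names (`char_ngrams`, `cosine_similarity`, `style_similarity`) and the choice of trigrams are assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in lowercased, whitespace-normalized text."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def style_similarity(text_a, text_b, n=3):
    """Rough style score in [0, 1]; higher means more shared character habits."""
    return cosine_similarity(char_ngrams(text_a, n), char_ngrams(text_b, n))


if __name__ == "__main__":
    # Hypothetical samples: two casual posts and one formal sentence.
    casual_1 = "tbh i dunno, kinda weird imo"
    casual_2 = "tbh i dunno, kinda odd imo"
    formal = "Nevertheless, the committee regrettably declined."
    print(style_similarity(casual_1, casual_2))
    print(style_similarity(casual_1, formal))
```

A score like this is noisy on short samples and also picks up topic, not just style, so treat it as a toy; real attribution would need far more text per account and better features.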

I’m curious whether they’d be able to ID me. I haven’t posted under my government name in almost 20 years, and I quit every public social network other than this one almost as long ago.

I’d be impressed if it can match me with my other profiles at all, but mostly because it’d mean that they’re feeding a lot of spectacularly old data into their models and/or pulling from private sources they should never have had access to.

I’d be shocked if they weren’t feeding in old data. Anything can be training data if you’re desperate enough, including early-2000s Myspace pages scraped by the Wayback Machine.

The Wayback Machine is too lossy and would have missed most of my written corpus. I’m talking about finding me in the full Twitter firehose from before 2008, and accounting for the fact that my writing style shifts notably with new slang.