Large-scale online deanonymization with LLMs
The doxxing efforts will be funded by venture capital.
What can LLM providers do? Refusal guardrails and usage monitoring can help, but both have significant limitations. Our deanonymization framework splits an attack into seemingly benign tasks – summarizing profiles, computing embeddings, ranking candidates – that individually look like normal usage, making misuse hard to detect. Refusals can be bypassed through task decomposition.
“Guardrails” are a joke and we all know Sam Altman and Elon Musk care about ethics as much as they care about not abusing their siblings or employees.
It is absolutely possible to identify users who post a lot on a public forum with a real name (e.g. Facebook or the like) as well as Reddit. So say you have some politician who claims to have X, Y, Z values and a Reddit user who has A, B, and C values that are antonymous to X, Y, and Z. By comparing common phrases, as well as by charting when the two seemingly separate users are online, you could say with reasonable certainty that the two people are one and the same, especially if you prompt them carefully to say the kinds of things they would say about neutral topics on both accounts. It would be hard to get 100% certainty, but you’d be close enough to imply it’s them.
AIs (LLMs) just make it faster.
Don’t post about controversial politics if you also post under your real name. It’s not a matter of “mask yourself better.” There will always be tells.
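For anyone curious how cheap the comparison described above is, here is a rough sketch of the two signals mentioned: shared phrases and overlapping online hours. It is a toy under loud assumptions, not how any real tool or the linked paper does it. The function names, the choice of word bigrams with Jaccard overlap, and the hour-of-day cosine similarity are all made up for illustration.

```python
from collections import Counter
from math import sqrt


def word_ngrams(text, n=2):
    """Return the set of lowercase word n-grams in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def phrase_overlap(text_a, text_b, n=2):
    """Jaccard similarity of the two texts' word n-gram sets (0.0 to 1.0)."""
    a, b = word_ngrams(text_a, n), word_ngrams(text_b, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def hour_histogram(post_hours):
    """Count posts per hour of day (0-23) as a 24-element list."""
    counts = Counter(h % 24 for h in post_hours)
    return [counts.get(h, 0) for h in range(24)]


def activity_similarity(hours_a, hours_b):
    """Cosine similarity of two accounts' hour-of-day posting histograms."""
    va, vb = hour_histogram(hours_a), hour_histogram(hours_b)
    dot = sum(x * y for x, y in zip(va, vb))
    norm = sqrt(sum(x * x for x in va)) * sqrt(sum(y * y for y in vb))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Hypothetical data: two short posts and the hours each account posted at.
    print(phrase_overlap("to be honest i think", "i think to be honest"))
    print(activity_similarity([8, 9, 22, 23], [9, 22, 23, 23]))
```

Both scores are weak evidence on their own. Plenty of unrelated people share phrases and time zones, which is why you would only ever get "close enough to imply", not certainty.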
I think it was a Reddit scraper years ago that taught me that I should probably lie more often on the internet about my work, friends, family details, etc.
Just like, little lies that don’t really matter in the comment, but would misdirect an AI or investigator into things that aren’t true.
It’s just so much woooooork to think about this shit. And to come up with different screen names everywhere? And to like, sub to a city I don’t live in and comment there about shit I know nothing about? Exhausting.
Thankfully my brothers and three uncles are here to support me. And my alligator.
Yeah exactly, like if you’re 25, say you’re 27. Then in another post, 24. You’re still around that age, but the exact age is muddied.
You can also use Americanized spelling in some sentences, or if you’re American, use British English and become Unamericanised. Say you’re a half-Brit, half-American dual citizen even though you’re from South Africa or something.
I call it salting and I do it religiously.
Or do I?
Oh - you mean Gustav, Bernhardt, Daffid and Chompy? How are things in Ulaanbaatar anyway?
(you’re welcome)
Oh hey my dearest friend. Say, did you end up moving to Perth or was that just a thought out loud? Well if you’re ever in the area let me know and we can meet up at that restaurant we enjoyed so much!
xoxo
For the past 10 years or so I’ve pretty much lived under the assumption that at some point someone figures out a system that digs through the entire internet and everything anyone has ever posted gets linked back to them.
At the same time, it’s both great and absolutely horrifying.
What’s horrifying is that everything you’ve ever posted gets linked back to you.
What’s great is that none of it can really be used against you anymore - because we now know that absolutely everyone is a massive hypocrite and nobody is without sin.
Some really good advice that someone gave me once is that the internet doesn’t exist.
Sure, it obviously does exist, but this was about communication style. When you send an email, you change codes and don’t write in the same way as a WhatsApp - you can expand your points more… But you should never forget you’re talking to a person - just because it’s internet, you shouldn’t talk any different to them.
You shouldn’t assume that the message is anonymous just because it’s internet. You shouldn’t assume certain things are okay “just because it’s internet”.
I don’t think they were 100% right because they were disregarding that code changing between different mediums and audiences is normal (you don’t talk the same way to your boss and your partner, or in written form vs spoken), but I do stand by the point that you shouldn’t change code or make assumptions just because “internet”.
Seems like we could all just mellow out a bit. You shouldn’t need to be afraid of having said stuff that isn’t perfectly PC, now or in the past. Obviously there’s a difference between an off-color joke and shit you would find in the Epstein files, but I’m not particularly concerned about anything I’ve posted coming back to me. I’ve had bad takes (I’m sure I still do) and said things in the past that I no longer agree with, but who cares? That’s what life is like. You change over time in more ways than one. If someone wants to judge me harshly for that then we probably weren’t going to hang out anyway, so fuck em. Let them react how they want.
That being said, the implications of this kind of technology being used by corporations or the government are quite different. There may be value in what you’re saying from that perspective.
I mean, there’s even a website (don’t remember the name) that lets you upload a photo of a person and it will show all pictures of that person that are on the web.
Like a Google search but for your face. Super creepy.
That’ll never work. The internet is messy like a jungle, I might find bird crap somewhere but it will not get me the bird. I might find a turned leaf, but what turned the leaf will never be known to me. All despite me being able to reason and investigate phenomena that occur.
I view all things like particle systems: There are general trends, sometimes we can observe how single particles travel and we can derive rules from their behavior. Yet we are never able to see everything, let alone know everyone.
No use going paranoid over preliminary results from a tool we readily use but don’t fully comprehend the limitations of.
So, pretty much what Meta/Facebook (and the three-letter agencies / GovInt) have been doing with deterministic code for ages (like they’re not scraping Reddit et al., including Lemmy), but probabilistic, with more errors and new, improved hallucination.
Competition, filling in gaps or just looking to be bought out. Evil.
Don’t hate the technology. It’s great. It’s just that how people organize themselves around technology is not up to date. Markets are not meant to coexist with an extremely fast global communication network that everyone can access. Why do you think economies restrict internet access?
Let the internet as a social activity die. It’s got to in order to be reborn haha
The internet can mostly die as far as I’m concerned. Just roll it back to file servers again, or something like gemspace. But being able to talk freely with people across cultures and borders is really important. It’s a tragedy that all these people will be hurt by the dystopification of the web. The new web needs to have a way to converse socially that is safe and easy enough to use for lay people. I have so much more to say on this, but real life is calling so I’ll leave it at this.
I don’t really get your point about markets though. I’m genuinely trying to understand, so bear with me. This is what I got from your post:
Our market has coexisted with an extremely fast global communication network for decades now. Given that the market feels like a quite organic thing, on what authority is the market not meant to coexist with the internet?
I think that internet access is restricted because of technological constraints, a technological lag in rolling out higher-speed infrastructure, and a lack of demand for that access, which is driven by technological and practical constraints. Some complex function of those factors haha. Still, I don’t really know what you are trying to get across.
“Our market has coexisted with an extremely fast global communication network for decades now. Given that the market feels like a quite organic thing, on what authority is the market not meant to coexist with the internet?”
I’ll try to explain my thought.
The condition for markets to exist as self-reproducing and self-stabilizing objects is government, usually in the form of a state entity, which itself is an economic actor that exists in competition with other states and in cooperation within free trade zones. Important note: government forms from market activity, specifically from the control of estates. Taxation is a form of rent, for example. I am not putting the state before the market.
There is an interest for governments to:
Maximize economic output.
Do so by cleverly tricking other economic actors outside of their own taxation system, e.g. through trade agreements with built-in asymmetries.
Minimize damage to domestic production. Outsourcing can lead to cornerstones of the economy eroding.
Throw in the internet. We can now communicate and exchange with actors that are not in the same tax system. First and foremost this leads to issues with intellectual property. I’d cite geolocked internet radio stations and piracy. Japan doesn’t care about its citizens pirating manhwas, and vice-versa, Korea doesn’t care about anime piracy, and so on and so on. Then there is trade of physical objects. Say you need a laptop battery for your Linuxed MacBook M1 and a Chinese seller has batteries in stock that are cheaper and better than Apple’s own (happens rather frequently), with taxation at the border factored in you are still getting the most optimal deal. Some might find ways of circumventing customs which sweetens the pot further. Obviously there are issues to the domestic economy that can arise from this.
Trade speeds up and global supply chains gain importance as cross-border communication speeds up. At the level of national governments there is a distinct threat presenting itself. There is less control over market activity, leading to a speedup of the self-polluting nature of trade; in other words, the boom and bust cycle shortens. As a national government you’d want to lengthen the boom and bust cycle, as crises are the natural killer of states, along with expansionist nations.
Everything you are seeing, from Chat Control to China’s firewall, is an attempt to stabilize economies. The internet enables one to build structures that are wholly outside of state control. The state fails to direct the economy as planning starts happening between turfs. The internet, due to its nation-decentralized function, can aid in forming structures that oppose the state, should it falter.
Let’s not forget one of the biggest threats to the economy: open source. Patents and DRM are threatened by the unstoppable pace of Blender, Open Office and co… It’s as if people said YOLO, let’s stop exchanging goods and services and at the same time solve very real and pressing issues, some of the biggest problems in fact. It works with much less friction than anything before, it exists as this hobbyist thing that we cannot call economical in any sense of the current understanding of the word, and it would not exist if it wasn’t for the internet.
“I think that internet access is restricted because of technological constraints, a technological lag in rolling out higher-speed infrastructure, and a lack of demand for that access, which is driven by technological and practical constraints. Some complex function of those factors haha. Still, I don’t really know what you are trying to get across.”
India and China have smartphone ownership rates of over 85%. There are no significant technological constraints if you are not someone who needs exorbitant download upload speed and low latency. The Chinese have pretty decent internet speeds, faster than most European countries. I also do not at all believe that there is a lack of demand for practical access. The internet is most generally a sensible thing to have access to no matter who you are.
I am saying that the internet is, as an international object, antithetical to nations, since its control panel sits not in one nation but in all of them, and that nations therefore seek to nerf it, only for it to return stronger and even more difficult to regulate as more and more people adapt to internationalized organizational patterns. As a corollary, there is a real cultural unification happening across borders as a secondary effect. I’ve read people terming it a “discordization” because people are starting to talk the way people talk in Discord chatrooms.
Yes, so you do have to restrict access and notably deanonymize users. California is trying to force OSes to implement age checking, which is of course a way to unmask people online. Protectionism cannot merely be understood as a set of possible tax policies, it is exactly the regaining of nation-centralized control in any sphere of life. States do not want people to be able to choose who to hang out with if the pool is the entire world, states do not have an interest in letting subjects learn about reality beyond a certain threshold where the scope of a person’s understanding exceeds the boundaries of countries.
What I am getting at exactly is the social structure that humans find themselves in. When relations/hierarchies are on the brink of flattening, that is, when everyone is linked to the next in a symmetrical fashion, like in a family or within small communities 5000 years ago, states, companies and even small businesses will feel compelled to work in a way that preserves their asymmetrical stance in society. As it happens, the internet is extremely good at producing flat social structures: anonymity, reach, openness and near-infinite scalability make it possible. You may be able to neutralize one netizen or manipulate one online community, but by the time that has happened five hundred heads of the hydra have regrown. The costs don’t work out.
You have a lot to say on this. It’s good that someone thinks about these things. I’m sorry that I can’t really provide you with a good discussion. I don’t know enough about markets etc. and I don’t want to spend too long online.
I agree that you can’t really stamp out openness and anonymity online (which is beautiful in a way), but I think that will mostly be reserved for technically capable users in the cracks and niches of the web who can navigate the restrictions. This is a massive tragedy.
This brings us to the current state of the web, with age restrictions popping up everywhere, deanonymization, etc. I think that we are in agreement regarding where it is going. Where do you think we should be heading? I’m sure you have opinions on that.
“You have a lot to say on this. It’s good that someone thinks about these things. I’m sorry that I can’t really provide you with a good discussion. I don’t know enough about markets etc. and I don’t want to spend too long online.”
I mean I have a lot to say. I don’t expect people to engage in discussions nor do I really want to create discussion as it eats a lot of time on my end as well.
“I agree that you can’t really stamp out openness and anonymity online (which is beautiful in a way), but I think that will mostly be reserved for technically capable users in the cracks and niches of the web who can navigate the restrictions. This is a massive tragedy.”
You’re right, but we don’t know if the more technically capable users will create elegant solutions for the rest.
“I’m sure you have opinions on that”
Opinions probably. I try not to judge things though or impose expectations.
Shut up, Anthony.
(in case your name happens to actually be Anthony, I did pick it at pseudo-random just for a stupid joke!)
Too late for me, I’ve been Daychilde since 1996, didn’t keep it separate from my real name, and I’m on Wikipedia, so it’s trivial to find me. lol.
The good is that I can report that it’s pretty safe to have an open identity. So far. heh
haha, oh man. That actually reminds me. I know I mentioned the wiki thing - this is me: en.wikipedia.org/wiki/Beck_v._Eiland-Hall
Basically, back in 2009, I created glennbeckrapedandmurderedayounggirlin1990.com. It was largely in response to Glenn Beck’s stupid technique of interviewing people - like to our first sitting Muslim member of Congress: “Now, I wouldn’t say this, but some people are asking: Are you working for our enemies?” - to an elected member of Congress!
Of course, this was back in the Muslim-scare days after 9/11 still in 2009… and now we definitely have people in Congress working for our enemies.
But anyway. So the parody site.
My wife found a forum where some idiots were trying to track me down. I mean, my real name and address were out there, but they were looking for more information about me and the site. They were talking about what organizations must be funding this attack on their beloved Beck.
There was controversy at the time because an organization called ACORN was trying to get people to register to vote and supposedly signing up on behalf of people. IIRC the allegations were either bullshit or it wasn’t a big deal or maybe it was and it was dealt with. All I remember for sure is that I thought it would be hilarious to offer these chucklefucks “evidence” for their conspiracies.
So I went out and copied the raw HTML from a 404 page on the ACORN website and made that the custom 404 page for my site. And then, to help these idiots “find” it, I made a “mistake” - I announced something on the main page and linked to a page that supposedly had the full story, only I intentionally put a typo in the link so the 404 page would come up. lol.
Oh, man, they went N U T S over in the forum “HOLY SHIT ITS ACORN BEHIND THIS” lolololol…
But anyway, your gif absolutely reminded me of those morons. That’s how I envisioned their “hacking” of me. lol