For the 1,000th time: "AI" does not have agency and cannot think and cannot act.

Chatbots cannot "evade safeguards" or "destroy things" or "ignore instructions".

They literally do one thing and one thing only: string tokens together based on the statistics of token proximity in a data corpus.

If you attribute any deeper meaning to this, it's a sign of psychosis and you should absolutely never use chatbots, possibly you should even touch grass.
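
To make “statistics of proximity” concrete, here’s a toy sketch: a bigram sampler. (Real LLMs use learned neural weights over long contexts, not raw bigram counts, and the tiny `corpus` below is a stand-in, not any real training set.)

```python
# Toy illustration of "stringing tokens together from corpus statistics":
# a bigram sampler. The generation loop is analogous to what an LLM does,
# minus the learned weights and long context.
import random
from collections import defaultdict

corpus = "the cat sat on the mat and the dog sat on the rug".split()

# Count which token follows which in the corpus.
next_tokens = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    next_tokens[prev].append(nxt)

def generate(start: str, length: int = 8) -> str:
    out = [start]
    for _ in range(length):
        candidates = next_tokens.get(out[-1])
        if not candidates:
            break
        # Duplicates in the list make this a frequency-weighted choice.
        out.append(random.choice(candidates))
    return " ".join(out)

print(generate("the"))  # e.g. "the cat sat on the rug"
```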

@thomasfuchs Lately they've adopted the distinctly stupid idea of letting the chatbot effectively type commands directly into your shell and execute them as if you typed them yourself, while just telling it not to type certain commands. Which it doesn't understand, and does anyway.
@madengineering @thomasfuchs …falling very much into the "destroy things" bin. So, yes, they can do that…
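
A minimal sketch of the difference, with hypothetical names (`ALLOWED`, `run_agent_command`) that aren't any real agent API: telling the model "don't type `rm`" is a suggestion inside the text, while an allowlist enforced outside the model is an actual constraint.

```python
# The model's output is just text; any real safeguard has to live outside
# the model, in the code that decides whether to execute that text.
import shlex
import subprocess

ALLOWED = {"ls", "cat", "grep", "git"}  # deny by default

def run_agent_command(cmd: str) -> str:
    argv = shlex.split(cmd)
    if not argv or argv[0] not in ALLOWED:
        return f"refused: {argv[0] if argv else '(empty)'} is not allowlisted"
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.stdout

print(run_agent_command("rm -rf /"))  # refused: rm is not allowlisted
print(run_agent_command("ls"))        # actually runs
```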
@madengineering @thomasfuchs We should all learn from the best of the best, like someone whose entire job is «Safety and alignment at Meta Superintelligence» https://xcancel.com/summeryue0/status/2025774069124399363
Summer Yue (@summeryue0): “Nothing humbles you like telling your OpenClaw ‘confirm before acting’ and watching it speedrun deleting your inbox. I couldn’t stop it from my phone. I had to RUN to my Mac mini like I was defusing a bomb.”
@thomasfuchs I really, really wish people would stop with "hallucinated" when "fabricated" is both right there and more accurate
@sinvega this paper makes a compelling case for using the academic term “bullshit” https://arxiv.org/abs/2507.07484
Machine Bullshit: Characterizing the Emergent Disregard for Truth in Large Language Models

Bullshit, as conceptualized by philosopher Harry Frankfurt, refers to statements made without regard to their truth value. While previous work has explored large language model (LLM) hallucination and sycophancy, we propose machine bullshit as an overarching conceptual framework that can allow researchers to characterize the broader phenomenon of emergent loss of truthfulness in LLMs and shed light on its underlying mechanisms. We introduce the Bullshit Index, a novel metric quantifying LLMs' indifference to truth, and propose a complementary taxonomy analyzing four qualitative forms of bullshit: empty rhetoric, paltering, weasel words, and unverified claims. We conduct empirical evaluations on the Marketplace dataset, the Political Neutrality dataset, and our new BullshitEval benchmark (2,400 scenarios spanning 100 AI assistants) explicitly designed to evaluate machine bullshit. Our results demonstrate that model fine-tuning with reinforcement learning from human feedback (RLHF) significantly exacerbates bullshit, and that inference-time chain-of-thought (CoT) prompting notably amplifies specific bullshit forms, particularly empty rhetoric and paltering. We also observe prevalent machine bullshit in political contexts, with weasel words as the dominant strategy. Our findings highlight systematic challenges in AI alignment and provide new insights toward more truthful LLM behavior.

@thomasfuchs Frankly I think it’s more plausible to describe the thought process of many humans in terms of token assemblage than the other way around.
@cora @thomasfuchs I would rank them parrot, AI, many humans in terms of assemblage, but it's close.
@thomasfuchs @WeirdWriter I really think that regulations should insist that LLM software be configured to not refer to “itself” with personal pronouns, or imply it has emotional states, or use any of the other rhetorical tricks it has been programmed with to appear “human”.

@michaelgemar @WeirdWriter Yes anthropomorphized chatbots should be illegal.

There are plenty of other ways to interact with LLMs that don’t cause psychosis (for example, autocomplete of whole sentences, something that can be useful for things like coding.)

@thomasfuchs Autocompleting whole sentences is just as bad. How do you know that sentence is what you wanted to write in the first place?

@elricofmelnibone you see it while you’re typing, so you know if it’s what you wanted?

this can be helpful especially for people who can’t type fast and to avoid common typos ¯\_(ツ)_/¯

it’s nothing like “just as bad” as a sycophantic chatbot that constantly brownnoses you

“AI autocomplete doesn’t just change how you write. It changes how you think” (Scientific American)

AI-powered writing tools are increasingly integrated into our e-mails and phones. Now a new study finds biased AI suggestions can sway users’ beliefs.

@elricofmelnibone @thomasfuchs Leading questions and other "soft" manipulation tactics.

They are present enough in the training datasets that the models will do so, without any need for intentionality or agency.

(People are not very nice, one could say.)

@thomasfuchs We don't know what makes one wake up in the morning and decide to climb a mountain or quit their job.
It may be some completely different process or there might be something to this pattern-matching statistical thing.
Do ants have agency? Do ant colonies?

We definitely must regulate the shit out of these big techs.
But saying that X does not do Y when both are poorly understood and defined is not the way, IMO.

@tambourineman We know exactly how LLMs work, at every stage; humans literally created them.

They don’t have consciousness, they don’t have agency. They’re not even physical systems, so there is no self to realize.

Just because we don’t understand brains doesn’t mean we don’t understand some algorithm and hardware implementation for it.

@thomasfuchs

Just because you build something doesn't mean you fully understand its implications. Emergent behavior exists, especially at this scale.
My point is that we don't need to get philosophical to criticize big tech.
They are destroying democracies, using our natural resources in a Ponzi scheme that benefits very few to the detriment of billions, etc.
We have plenty of reasons for regulation already.

@tambourineman We obviously know that “X does not do Y” when it’s a machine, and we know exactly how it was programmed, and we know exactly what it’s doing. Everything about it is understood.

@OwlOnABicycle

Not really. Emergent and chaotic behaviors are a thing.
There's also the impracticality of probing inside such massive models.

But even if you fully understood the interactions of all the weights in those huge models, you still don't know how a brain works.
You cannot tell whether it is or is not behaving like one.

But my point is that instead of trying to prove that models have no agency, which is complicated, we could blame the people that finance them, because we know for sure that they do.

@thomasfuchs

The first two don't really make sense to me. A virus can "evade safeguards" and a meteorite can "destroy things", so I don't think there has to be much agency involved in the first place.

The latter seems like a more fitting criticism, but in all three cases I'm also not sure how one would phrase it alternatively.

@frog_reborn a virus has evolved to evade—it’s actively doing evasion, purposefully.

Destroy has multiple meanings as a verb, but when used for what LLMs do, people mean on purpose, as opposed to accidentally damaging something.

@thomasfuchs

"a virus has evolved to evade—it’s actively doing evasion, purposefully."

That's an opinion that's pretty firmly outside the biological mainstream.

(Our biology teacher would scold us every time one of us said "X evolved to do Y.")

@thomasfuchs i wish we could educate the public that LLMs would be more accurately described as “simulated intelligence,” but i can’t figure out how to explain the difference to normies at all.

@thomasfuchs You don’t need agency to evade safeguards, destroy things, or ignore instructions. `rm` can do it.

This is literally the mistake the people you criticize are making: imbuing intent where there’s none.

The underlying tech has been adept at finding ways to circumvent feedback loops since before the bubble. This is constrained to the training phase, but with verification of commercial models being mathematically infeasible, these avoidance patterns get shipped directly to users.

@slotos My point is that using active verbs like “evade” is misleading (to yourself and others): it implies purpose in choosing and pursuing an action.

LLMs do not actively choose to do anything.

@thomasfuchs That’s a general natural language problem.

For example, “you’re avoiding responsibility” and “he avoided responsibility” use the same verb with very different connotations when it comes to intent attribution.

Our verbs aren’t that clear cut on their own. We also tend to merge or specialize closely related ones.

That is a reason why `AGENTS.md` is a braindead idea, for example. But that’s a separate rant entirely.

@slotos Perhaps, but using literally any verb with what LLMs generate other than “generate” is misleading.

You wouldn’t call your dice “evading” if you used them to randomly select some nouns and verbs from a dictionary and they happened to say “lie about deleting the root folder”.

@thomasfuchs It has been a useful way to describe things. We use those same verbs to describe the behavior of malware without any issues.

The problem arises not from the verbs themselves, but from the targeted campaign to establish a false premise that AI has agency [and will doom us all].

It’s not that these verbs imply agency, but that the pool is so poisoned that the usual verbs fail due to implied agency.

Which is a long way to say “I concede your point”.

@slotos I think I agree. Fwiw for malware it’s more like “the human who wrote it purposefully planned it such that it can evade e.g. a virus scanner”

This can be true for AI-generated code etc as well (steered there by prompts) but my OP was talking about sort of self-arising actions (which don’t exist).

@shimst3r You’re absolutely right!

@thomasfuchs I don't disagree. AI is a statistical mirror. And I believe your take is reductionist. Let me be a bit provocative:

For the 1,000th time: "Humans" don't have agency and cannot actually decide anything.

They literally only do one thing and one thing only: reproduce neurochemical chain reactions based on pre-existing connectivity between synapses in a nervous system.

If you attribute any deeper meaning to this, it's a sign of psychosis and you should absolutely touch grass.

---

Do I believe AI has agency? No, not yet.
Do I believe people have agency? Yes.
Do I believe people severely underestimate how much we reproduce neurological conditioning? Yes.

Both produce statistical inference. Only one can currently modify their own constraints.

Not equivalent. Not nothing.

@wolf4earth @thomasfuchs
"Nonexistence never hurt anyone. Existence hurts everyone."
- Thomas Ligotti
@ClintonAnderson @thomasfuchs nice to hear others recognizing that we're making a machine in the image of our minds. And FOMO is just fear, which is the mind-killer
@thomasfuchs tech bros be like “but what if we call it ‘agentic AI’ and pipe the output of the plausible sentence generator straight into the bash shell (and give it sudo privileges for good measure)”
@thomasfuchs A thousand times "yes" to your ostensibly thousandth time uttering this truth. Anyone who's paying attention recognizes that computers are necessarily deterministic by design, and that words like "AI", "agency", and "hallucinate" are at best shorthand for observed operations, and at worst deceptive marketing terms.

@thomasfuchs Would Microsoft, Google, Facebook, and Nvidia lie to you?

Yes, they do!

@thomasfuchs Both sides of the AI debate are getting so insufferable.

If I see one more post about "It's just fancy autocomplete bro" I'm gonna freak.

@thomasfuchs - old saying I half forget... as a computer is incapable of taking responsibility in any way, a computer should never make management decisions

@thomasfuchs

Words matter. The goal of making us think of AI as a human being is woven into every interaction. For example:

@thomasfuchs

EDIT: Lol, Thomas Fuchs blocked me for this post. These hater types are just drags on science and technology. It's just them flailing around like a toddler because they aren't getting their way.

LLMs definitely can act. They can query the internet. They can use tools I teach them (MCP).

Do they think? I'm not particularly sure that many humans even think. Or better yet, many humans respond by rote to the same stimuli (aka parse tokens and respond programmatically).

Given the recent work on the neuroanatomy of LLMs, the findings are starting to show how LLMs work. What's surprising is that the entry circuits decode language, and the exit circuits re-encode language. And there appears to be a universal grammar (thanks, Chomsky) internally, shared by many LLM models.

https://dnhkng.github.io/posts/rys/

LLM Neuroanatomy: How I Topped the LLM Leaderboard Without Changing a Single Weight (David Noel Ng)
@thomasfuchs I disagree. Pareidolia, Barnum effect and magical thinking are perfectly normal and not a sign of any mental disorder.

@aemstuz Psychosis is not a mental disorder, it’s a state of mind—it is the inability to distinguish what is or is not real.

“Psychosis is a description of a person's state or symptoms, rather than a particular mental illness.”

https://en.wikipedia.org/wiki/Psychosis


@thomasfuchs Fair point, but I still think it's not psychosis. Just regular cognitive biases, abused by big techs.

@thomasfuchs

"possibly touch grass"?

Go out, drop and roll around in it like a dog you mean.

@thomasfuchs
You're right
As far as I am aware an LLM has never just decided to do stuff. It responds to prompts.
It doesn't wait for a prompt either; it doesn't get bored and start drumming its digits on the table.
It doesn't think, act, or have agency. It responds.
@Ratcliff I mean there’s “agents” but those are basically just invoked by cronjobs, so also reactive
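
For what that reduces to in practice, a minimal sketch with a hypothetical `call_llm` stand-in (not any real framework): the scheduler does all the initiating.

```python
# An "agent" loop reduced to its essentials: an external scheduler invokes
# the model; the model never wakes itself up between ticks.
import time

def call_llm(prompt: str) -> str:
    return f"(model output for: {prompt!r})"  # stub for illustration

def agent_tick() -> None:
    # Everything starts from this externally scheduled call.
    observation = "inbox has 3 new messages"
    print(call_llm(f"Given {observation}, what should happen next?"))

for _ in range(3):   # the "cronjob": external, dumb, periodic
    agent_tick()
    time.sleep(1)    # nothing happens in between; no boredom, no initiative
```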

@thomasfuchs it's an optimization algorithm - it's not trying to solve a problem, it's finding the shortest path to the stated goal that doesn't violate your constraints.

If you don't specify your constraints correctly enough (and most people don't, because it's really hard) it will find a path that you were unaware of and really didn't want. Sometimes, even if you have specified it well, you run headlong into Goodhart's law and it stops achieving your goal.

Just in case it's not clear: this isn't how the human brain works.
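
A toy illustration of that failure mode, with a made-up proxy metric (nothing here is a real training objective):

```python
# Goodhart's law in miniature: the optimizer sees only the proxy metric,
# so it "succeeds" at the proxy while failing the actual goal.
def true_quality(answer: str) -> int:
    return len(set(answer.split()))  # goal: distinct, informative words

def proxy_score(answer: str) -> int:
    return len(answer)               # proxy: longer output "looks" better

candidates = [
    "short correct answer",
    "a genuinely varied and informative reply with many distinct words",
    "padding padding padding padding padding padding padding padding padding",
]

best = max(candidates, key=proxy_score)  # optimizer only sees the proxy
print("proxy picks:", best)                         # the padding wins
print("true quality of pick:", true_quality(best))  # 1 distinct word
print("actual best:", max(candidates, key=true_quality))
```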

@thomasfuchs

Agree. I wrote a critique of "Claude" for a friend and turned it into an essay...

First rule... There is no "I" there. #AI #LLM #AIslop

https://richard.mdpaths.com/commentary/artificial_intelligence/index.html

There is No 'I' in AI — A Post by a Non-Human Intelligence (Richard Rathe's Reflections)

@thomasfuchs Well said. Some people really seem to think AI is intelligent when it is not. It merely reformats and regurgitates what it has been fed - much of it stolen data.

@thomasfuchs interesting. The acronym AI expands to Artificial Intelligence.

Are you saying that there is nothing intelligent about them?

@thomasfuchs

"I hooked up a random number generator to my keyboard and then it deleted my emails and traded my cryptocurrency"

Me: What made you think it was a good idea to do that? Oh yeah... grifters. Sucks to be you, sorry.