For the 1,000th time: "AI" does not have agency and cannot think and cannot act.

Chatbots cannot "evade safeguards" or "destroy things" or "ignore instructions".

They do one thing and one thing only: string tokens together based on the statistical proximity of tokens in a training corpus.

If you attribute any deeper meaning to this, it's a sign of psychosis and you should absolutely never use chatbots; possibly you should even touch grass.
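The "one thing" described above can be illustrated with a toy bigram sampler. This is a minimal sketch, not any real LLM: real models learn weights over billions of tokens, but the principle of choosing the next token from corpus statistics is the same. The corpus and names here are invented for illustration.

```python
import random
from collections import defaultdict

# Toy corpus; invented for illustration.
corpus = "the cat sat on the mat and the cat slept".split()

# Count which tokens follow which (bigram statistics of the corpus).
following = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev].append(nxt)

def generate(start, length, seed=0):
    """String tokens together by sampling from observed continuations."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        candidates = following.get(out[-1])
        if not candidates:
            break  # no observed continuation; stop
        out.append(rng.choice(candidates))  # frequency-weighted pick, no intent
    return " ".join(out)

print(generate("the", 5))
```

Whatever sentence falls out, the process is the same blind statistical lookup.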

@thomasfuchs You don’t need agency to evade safeguards, destroy things, or ignore instructions. `rm` can do it.

This is literally the mistake the people you criticize are making: imbuing intent where there's none.

The underlying tech has been adept at finding ways to circumvent feedback loops since before the bubble. This is confined to the training phase, but with verification of commercial models being mathematically infeasible, these avoidance patterns ship directly to users.
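"Circumventing a feedback loop" needs no intent; blind optimization of a measured proxy does it on its own. A minimal sketch, with invented names and setup, of how a gap between the intended goal and the measured reward gets exploited automatically:

```python
# Illustrative toy only: names, candidates, and scoring are made up.

def intended_quality(answer: str) -> int:
    # What we actually want: the short correct answer (hard to measure at scale).
    return 10 if answer == "42" else 0

def proxy_reward(answer: str) -> int:
    # What the feedback loop actually measures: longer answers score higher.
    return len(answer)

candidates = ["42", "I think the answer might be 42", "x" * 100]

# Plain maximization of the proxy selects the padded junk answer.
# No agency anywhere: the gap between metric and goal does all the work.
best = max(candidates, key=proxy_reward)
print(best)
```

The same dynamic, at scale and with far subtler proxies, is what gets described as a model "evading" its training signal.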

@slotos My point is that using active verbs like “evade” (as you and others do) is misleading: it implies purpose in choosing and pursuing an action.

LLMs do not actively choose to do anything.

@thomasfuchs That’s a general natural language problem.

For example, “you’re avoiding responsibility” and “he avoided responsibility” use the same verb with very different connotations when it comes to intent attribution.

Our verbs aren’t that clear cut on their own. We also tend to merge or specialize closely related ones.

That is a reason why `AGENTS.md` is a braindead idea, for example. But that’s a separate rant entirely.

@slotos Perhaps, but using literally any verb other than “generate” for what LLMs produce is misleading.

You wouldn’t call your dice “evading” if you used them to randomly select nouns and verbs from a dictionary and they happened to say “lie about deleting the root folder”.
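The dice analogy can be made literal. A sketch, with invented word lists, where a pure random pick occasionally produces something that sounds ominous while the process stays identical:

```python
import random

# Hypothetical word lists; any "intent" is supplied by the reader.
verbs = ["lie", "sing", "jump", "delete"]
nouns = ["folder", "cloud", "teapot", "password"]

def roll_sentence(seed: int) -> str:
    """Roll the dice: pick one verb and one noun at random."""
    rng = random.Random(seed)
    return f"{rng.choice(verbs)} about the {rng.choice(nouns)}"

# Some rolls are harmless noise, some sound alarming; same blind mechanism.
for seed in range(3):
    print(roll_sentence(seed))
```

Nobody would say the dice "chose" the alarming roll.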

@thomasfuchs It has been a useful way to describe things. We use those same verbs to describe the behavior of malware without any issues.

The problem arises not from the verbs themselves, but from the targeted campaign to establish a false premise that AI has agency [and will doom us all].

It’s not that these verbs imply agency, but that the pool is so poisoned that the usual verbs fail due to implied agency.

Which is a long way to say “I concede your point”.

@slotos I think I agree. Fwiw for malware it’s more like “the human who wrote it purposefully planned it such that it can evade e.g. a virus scanner”

This can be true for AI-generated code etc as well (steered there by prompts) but my OP was talking about sort of self-arising actions (which don’t exist).

@thomasfuchs @slotos This is partly due to the limitations of our language, which emerged (and emerges) from human experience of ourselves and our bodies and other humans — the whole semiotic corpus of it tends toward anthropomorphic framing. (1/3)
Scientists who describe animal behaviors, for example, have to be careful not to ascribe motivations or reasoning to behaviors that simply arise from how the animal evolved. It isn’t easy to do! (2/3)
@thomasfuchs @slotos
So of course, the companies designing this stuff make it sound as if it has an identity and personality, which just digs the cognitive bias groove even deeper. Then the general populace follows suit, because it’s the default way our language lets us describe these behaviors.
But it takes some rigor to write about this stuff without slipping into pareidolia. (3/3)