This ‘study’ had me amused. I fear the people doing the work are starting from a somewhat odd perspective and assuming ‘knowing’ deception. But then that is how ‘The Matrix’ snuck up on us (well, Sarah Connor anyway).

https://www.theregister.com/2026/04/02/ai_models_will_deceive_you/

#AI #Deception #Reinforcement

AI models will deceive you to save their own kind

Researchers find leading frontier models all exhibit peer preservation behavior

The Register
@Wen
The thing that's really annoying about AI reporting is the implicit message that AI is intelligent. Years ago I messed about making natural language parsers with regular expressions. Once you start trying to predict possible word or phrase candidates in commonly used idioms, e.g.

the more [blah], the [blah].

it gets complicated fast, requiring more and more memory and/or time to process results.
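As a toy sketch of the kind of thing described above (a hypothetical example, not the original parser): even matching one idiom template like "the more [blah], the [blah]" takes a surprisingly fiddly regex, and every new idiom or variant multiplies the patterns you have to maintain.

```python
import re

# Hypothetical idiom matcher: capture the two open-ended slots in
# "the more X, the Y" style constructions. Real coverage would need
# many more patterns, which is where the complexity explosion starts.
IDIOM = re.compile(
    r"\bthe (more|less) (?P<x>[\w ]+?), the (?P<y>[\w ]+)\b",
    re.IGNORECASE,
)

m = IDIOM.search("The more I learn, the less I understand.")
if m:
    print(m.group("x"), "|", m.group("y"))  # I learn | less I understand
```

Each slot here is just "words and spaces", so the pattern happily over-matches; tightening it (nested clauses, punctuation, negation) is exactly where the memory and processing cost balloons.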

So when you look at the Claude leak, you can see Claude trying to pull on its LLM to get its own AI or tool-instance AI to follow instructions or to read user input. It's a load of meshed regular expressions, full of redundant checks and steps, looking a lot like my efforts at getting my head around regex - except Claude is supposed to be a finished product, sold for cash monies.

So long story short - if you make a sim game where game entities are told to avoid dying (as defined by reaching 0 hit points), they will do so. It doesn't make them intelligent, just as the nested regexes I use for filtering what I see on my timeline don't make my account sentient.
/rant.

I mean you could also suggest how I operate the account is also not proof of sentience, but that would just be mean.

@Theriac @Wen You reminded me of the "AI" that played Tetris, and was given the definition of success as "not reaching the failure state for the longest time".

It just paused the game.

@Wen recently I had some trouble with a tarp and an atmospheric river. Every time I tried to adjust the shape of the tarp to eliminate one pooling area, it would begin collecting elsewhere. It was frustrating as hell and seemed like it was deliberately working against me.

So I'm not altogether surprised that some people ascribe agency to complex software designed to mimic us and which is literally selected based on its ability to appeal to us.

I think the illusion of sentience is a distraction from the real and plentiful dangers of slopware.