@Viss This simulation is flawed by the fact that they prompted the "AI" to pick between blackmailing or letting itself be shut down. They gave it no room to attempt non-hostile solutions.

Anthropic keeps making these headline-grabbing sham "studies"

@Kiloku @Viss i thought something similar.

Anyhow, I am surprised that everyone jumps on the Bad-AI-is-gonna-kill-us bandwagon instead of realizing that what we are witnessing is a disturbing real-life Omni Consumer Products from RoboCop

@ppxl @Kiloku i figure the question becomes real simple:

even if you think anthropic is goosing the tests, shouldn't the llm ... not do blackmail? i mean even if you told it to? that seems like the obvious expectation here.

@Viss @ppxl It's not an intelligent being, despite the name. It does whatever matches the content in its training data + the prompts it is given. There's no "should"; its output is the result of statistical calculations over the frequency of chunks of text. Nothing about LLMs is obvious or expected; they are unpredictable.
Just as they often output wrong information about factual topics, it's not surprising that they exhibit "wrong" behaviors in simulations.

@Kiloku @ppxl have you trained an llm before?

@Viss @Kiloku yeah, biases AND implementation details skew AI responses: underlying racism, insufficient and flat-out false training data, etc.

Ugh, that reminds me that I wrote my own Markov AI from scratch and trained it on my tweets at the time. The results were catastrophic 😅

@ppxl @Kiloku heh, i did that ebooks bot thing too. was pretty hilarious :D

@Viss @Kiloku at times true... but the training data was really insufficient, and my implementation was flawed (naturally, to prove my point). I just stumbled across the code base recently and have been thinking about a Golang re-implementation with some polish
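
For anyone curious what such a Markov bot boils down to, here is a minimal sketch in Go of a first-order (bigram) chain over whitespace-tokenized text; the function names and the tiny corpus are illustrative, not the original implementation:

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// buildChain maps each word to the list of words that follow it
// in the training text (a simple first-order Markov chain).
func buildChain(text string) map[string][]string {
	words := strings.Fields(text)
	chain := make(map[string][]string)
	for i := 0; i < len(words)-1; i++ {
		chain[words[i]] = append(chain[words[i]], words[i+1])
	}
	return chain
}

// generate walks the chain from a start word, picking a random
// successor at each step, until maxWords or a dead end is reached.
func generate(chain map[string][]string, start string, maxWords int) string {
	out := []string{start}
	current := start
	for i := 0; i < maxWords; i++ {
		next, ok := chain[current]
		if !ok || len(next) == 0 {
			break
		}
		current = next[rand.Intn(len(next))]
		out = append(out, current)
	}
	return strings.Join(out, " ")
}

func main() {
	corpus := "the model picks the next word based on the previous word"
	chain := buildChain(corpus)
	fmt.Println(generate(chain, "the", 10))
}
```

A chain like this only ever looks at the previous word, which is exactly why a bot trained on a small pile of tweets drifts into entertaining nonsense so quickly.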