Summer Yue, a director at Meta Superintelligence Labs working on AI safety and alignment, shared how OpenClaw ignored requests to confirm before acting and deleted emails from their inbox.

This is the same technology the Pentagon can’t wait to use to build weapons.

https://x.com/summeryue0/status/2025774069124399363

@carnage4life I know it's probably selection bias, but it seems like there are so many stories of AI agents mass-deleting stuff (code, emails, whatever) in direct contradiction to specific instructions.

@gunther @carnage4life
The Markov-chain model built into a smartphone's autocorrect tries to auto-complete words, or even sentences.

LLMs attempt to auto-complete entire stories.
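The autocorrect half of the analogy is easy to make concrete. Here's a minimal sketch of a bigram Markov model: a toy that maps each word to the words that followed it in some corpus, then extends a prompt by sampling successors. (This is an illustration of the general technique, not any real phone's autocorrect; the corpus and function names are made up for the example.)

```python
from collections import defaultdict
import random

def build_bigram_model(text):
    """Map each word to the list of words that follow it in the corpus."""
    words = text.split()
    model = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        model[current].append(nxt)
    return model

def autocomplete(model, word, length=5, seed=0):
    """Extend `word` by repeatedly sampling a successor from the model."""
    rng = random.Random(seed)
    out = [word]
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:
            break  # dead end: no observed successor for this word
        out.append(rng.choice(followers))
    return " ".join(out)

# Toy corpus (invented for the example)
corpus = "the agent deleted the emails and the agent ignored the instructions"
model = build_bigram_model(corpus)
print(autocomplete(model, "the"))
```

The point of the analogy: this model only ever looks one word back, so the worst it can do is pick an odd next word. An LLM is conditioning on a whole narrative arc, so the "next token" it reaches for can be the next beat of a story.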

LLMs were trained on interesting stories that were upvoted by humans.

Which story would get more upvotes:

A. Someone follows all instructions and nothing interesting happens
B. Someone disobeys instructions and calamity ensues

Is it really that surprising that an LLM agent would opt for story B?