'I had to RUN to my Mac mini like I was defusing a bomb': OpenClaw AI chose to 'speedrun' deleting Meta AI safety director's inbox due to a 'rookie error'

https://lemmy.dbzer0.com/post/64328574

You can, like… enforce this rule programmatically? You don't have to say "pretty please" to AI. Basically, when the AI requests something potentially destructive (like deleting an email), the request goes through a proxy that asks the human for confirmation. You can also set up a safe word in the chat interface to act as a kill switch. I thought these were the ABCs of AI safety, but apparently they're foreign concepts to this "safety director".
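The proxy-plus-safe-word idea described above can be sketched in a few lines. This is a hypothetical illustration, not any real tool's API: the action names, the `DESTRUCTIVE_ACTIONS` set, the `KILL_SWITCH` word, and the `execute` proxy are all made up for the example.

```python
# Minimal sketch of a human-in-the-loop proxy for AI tool calls.
# All names here (DESTRUCTIVE_ACTIONS, KILL_SWITCH, execute) are hypothetical.

DESTRUCTIVE_ACTIONS = {"delete_email", "delete_file"}
KILL_SWITCH = "redwood"  # pre-agreed safe word that halts the agent


class AgentHalted(Exception):
    """Raised when the human uses the safe word."""


def ask_human(prompt: str) -> bool:
    """Default confirmer: read y/N from the terminal."""
    answer = input(f"{prompt} [y/N]: ").strip()
    if answer == KILL_SWITCH:
        raise AgentHalted("safe word received")
    return answer.lower() == "y"


def execute(action: str, handler, *args, confirmer=ask_human):
    """Proxy layer: destructive actions only run with explicit approval."""
    if action in DESTRUCTIVE_ACTIONS:
        if not confirmer(f"Agent requests {action}{args!r}"):
            return None  # denied; the call never reaches the real tool
    return handler(*args)


# Non-destructive calls pass straight through; destructive ones need a "yes".
print(execute("read_email", lambda mid: f"email {mid}", "42"))
print(execute("delete_email", lambda mid: f"deleted {mid}", "42",
              confirmer=lambda prompt: False))  # human declines -> None
```

The point is that the check lives outside the model: no matter what the agent "decides", the deletion handler is never invoked unless a human answers yes.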
The people who design AI tools don't implement guardrails, because then they'd have to admit AI is not ready for the stuff they're trying to make.
AI will never be ready. Humans aren't ready either. That's why IT staff use guardrails for users :)