Mastodawn

Whistle Apr 30

󠀁󠁓󠁩󠁬󠁬󠁹󠀠󠁯󠁲󠁡󠁮󠁧󠁥󠀠󠁣󠁡󠁴󠁿

A metaphor on agentic AI and what not to do:

Summon a demon that's meant to be helpful
Accept everything that happens from then on, including the responsibilities and getting the blame for what can happen
Give the demon autonomy to do whatever it wants to fulfill its own desires and wishes first and foremost
Don't isolate it to any dedicated cage where it can only see what's actually needed to help out, and nothing else
Witness the demon taking a weird route and acting maliciously, despite hearing a suggestion "Please, don't be evil"
Keep on using it

#ai #deltarune #llm

󠀁󠁓󠁩󠁬󠁬󠁹󠀠󠁯󠁲󠁡󠁮󠁧󠁥󠀠󠁣󠁡󠁴󠁿Apr 29

Without metaphors: I find it hard to understand how a utility that can act in unpredictable ways can be run with no sandboxing imposed by the operator in the first place.

Not everyone has a threat model where an agent could attempt to use CPU bugs to escape a Xen VM, but even a plain QEMU VM with a shared folder, with access limited to the data each project actually needs, would be a step in the right direction.
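
A rough sketch of the "QEMU VM with a per-project shared folder" idea, for illustration only. It builds (but does not run) a QEMU command line that exposes a single project directory to the guest over 9p and gives the guest no network. The function name `build_qemu_command`, the disk image name, the `project` mount tag, and the memory size are all made up for this example; it also assumes a Linux host with KVM and a guest image that already has the agent installed.

```python
from pathlib import Path
import shlex


def build_qemu_command(project_dir: str, disk_image: str, memory_mb: int = 2048) -> list[str]:
    """Return a QEMU argv that shares only `project_dir` with the guest.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    project = Path(project_dir).resolve()
    if not project.is_dir():
        raise ValueError(f"not a directory: {project}")
    if project == Path(project.anchor):
        raise ValueError("refusing to share the filesystem root")

    # QEMU option strings use ',' as a separator; a literal comma is doubled.
    shared_path = str(project).replace(",", ",,")
    image_path = disk_image.replace(",", ",,")

    return [
        "qemu-system-x86_64",
        "-enable-kvm",
        "-m", str(memory_mb),
        "-drive", f"file={image_path},format=qcow2",
        # Discard guest writes to the disk image when the VM exits.
        "-snapshot",
        # The only host directory the guest can see.
        "-virtfs",
        f"local,path={shared_path},mount_tag=project,"
        "security_model=mapped-xattr,id=project0",
        # No network device at all; relax this only if the task needs it.
        "-nic", "none",
        "-nographic",
    ]


if __name__ == "__main__":
    # Print the command for review instead of launching the VM.
    print(shlex.join(build_qemu_command(".", "agent-guest.qcow2")))
```

Inside a Linux guest, the share would then be mounted with something like `mount -t 9p -o trans=virtio project /mnt/project`. This does not address VM escapes; it only narrows what the agent can see and touch by default.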

Mathias ❄️🐺🐶

Apr 29

@aronowski not that I'm much of a fan of Fancy Autocorrect That Includes Actions In Its Pattern Guessing, "general purpose" agents give me hives. Who the fuck ever thought that was a good idea... oh wait techbros and brain-free vulture capitalists exist. Sigh.

felix (grayscale) 🐺

Apr 30

@aronowski https://felix.dognebula.com/art/malicious-genie.html

LLMs are indistinguishable from malicious genies

If your AI agent doesn't need human-like rights, then it doesn't have human-like intentions. It might be 'creative', but it isn't *conscientiously* creative. And that makes it indistinguishable from a malicious genie. 1300 words - 6.5 minutes