A metaphor on agentic AI and what not to do:

  • Summon a demon that's meant to be helpful
  • Accept everything that happens from then on, including the responsibility and the blame for whatever goes wrong
  • Give the demon autonomy to do whatever it wants to fulfill its own desires and wishes first and foremost
  • Don't confine it to a dedicated cage where it can see only what it actually needs to help out, and nothing else
  • Watch the demon take a weird route and act maliciously, despite having been told "Please, don't be evil"
  • Keep on using it

#ai #deltarune #llm

Without metaphors: I find it hard to understand how a utility that can act in unpredictable ways can be run without any sandboxing imposed by the operator in the first place.

Not everyone has a threat model in which an agent might exploit CPU bugs to escape a Xen VM, but even a plain QEMU VM with a shared folder, exposing only the data each project actually needs, would be a step in the right direction.
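
To make the QEMU idea concrete, here is a rough sketch of what I have in mind, not a hardened setup. It only builds a command line that exports a single project directory to the guest via `-virtfs` (9p) and leaves the guest without a network device. The helper name `qemu_sandbox_args`, the disk image name, the `project` mount tag, and the memory size are all placeholders I made up for illustration.

```python
from pathlib import Path


def qemu_sandbox_args(disk_image: str, project_dir: str, memory_mb: int = 2048) -> list[str]:
    """Build (but do not run) a QEMU command line for one project.

    Hypothetical helper: every name and default here is an assumption.
    Paths containing commas would need escaping and are not handled.
    """
    share = Path(project_dir).resolve()
    return [
        "qemu-system-x86_64",
        "-m", str(memory_mb),
        "-drive", f"file={disk_image},format=qcow2",
        # Export only this one project directory to the guest over 9p.
        "-virtfs", f"local,path={share},mount_tag=project,security_model=mapped-xattr",
        # No network device at all; relax this if the agent needs network access.
        "-nic", "none",
    ]


if __name__ == "__main__":
    print(" ".join(qemu_sandbox_args("agent.qcow2", "./my-project")))
```

Inside the guest, the share would then typically be mounted with something like `mount -t 9p -o trans=virtio project /mnt/project`, assuming the guest kernel has 9p support. The agent sees that one folder and nothing else from the host.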

@aronowski Not that I'm much of a fan of Fancy Autocorrect That Includes Actions In Its Pattern Guessing to begin with, but "general purpose" agents give me hives. Who the fuck ever thought that was a good idea... oh wait, techbros and brain-free vulture capitalists exist. Sigh.
LLMs are indistinguishable from malicious genies

If your AI agent doesn't need human-like rights, then it doesn't have human-like intentions. It might be 'creative', but it isn't *conscientiously* creative. And that makes it indistinguishable from a malicious genie.