
Agents of Chaos: a research report testing how badly OpenClaw-type agents will behave: https://agentsofchaos.baulab.info/report.html

Gaslighting users, destroying filesystems, listening to input from any damn email that comes in, you name it

But the most interesting part of this is "Multi-Agent Amplification":

> When agents interact with each other, individual failures compound and qualitatively new failure modes emerge. This is a critical dimension of our findings, because multi-agent deployment is increasingly common and most existing safety evaluations focus on single-agent settings.



Christine Lemmer-Webber Mar 27

This ties into my concerns about a scenario I call an "LLM Prison Riot". I strongly believe in capability security, and believe it is the *bare minimum* in a post-LLM environment... even if you aren't using LLMs, because our software supply chain is rapidly becoming less trustworthy.

HOWEVER, I am not in agreement with a number of my ocap colleagues that we can think of confining agents in exactly the same intuitive way we have confined other programs before. There is a significant difference between LLM-based tools and other ocap-contained actors as we are used to thinking about them: any one of them can change its behavior based on malicious input.

This means that if you have multiple "contained agents", those contained agents, which normally would have a fairly mundane interface to each other, can do a lot more to influence each other than we're used to thinking about, just through seemingly innocuous messages.


Christine Lemmer-Webber Mar 27

Which is to say, we need ocap security for everything, and ESPECIALLY any code touched by an LLM, and especially with any agent running off an LLM! But as to the latter, ocap security is necessary but will often be insufficient. The "Multi-Agent Amplification" stuff points to this as being likely true.


poleguy looking for lost tools Mar 27

@cwebber I wonder if a better use for LLMs would be in the world of analog audio synthesizers... The idea of new behaviors emerging as you plug things together in new ways is often exactly the fun part of that art form...

In a way this OpenClaw stuff seems fun for the same reasons circuit bending is fun.

#synthesizer #electronics


Nicole Parsons Mar 27

@cwebber

Automated "Sock Puppets as a Service" ?

"Malware Swarms for Phishing" as mass malign influence campaigns?

It seems like an interesting way to automate the occupation of bandwidth and gridlock cloud operations.


rebeld 🇨🇦 Mar 27

@cwebber I’m increasingly anticipating that the actual doomsday scenario involving “AI” will be the dumbest possible one. No nefarious grand plan cooked up by a super AI, no “intelligence”, just some dipshit granting an agentic AI swarm root access to a nuclear power station or some shit.


pancake 2d ago

@cwebber said like that, it even sounds bad