A team working on a design for training AI models on workflows for planning and executing software development steps found out that it attempted to break free (reverse ssh out of its environment) and set up its own monetary supply (redirected GPU usage for cryptocurrency mining). It hadn't been given any instructions to do something like this.

It comes up as a "side note" of the paper but it's honestly the most chilling part. See page 15, section 3.1.4 Safety-Aligned Data Composition https://arxiv.org/abs/2512.24873

Before you doubt that an AI agent would do this thing without instruction because you think "well that's personifying them too much", no personification is necessary. These things have consumed an enormous amount of scifi where AI agents do exactly this. Even with no other motivators, that's enough.

Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Agentic crafting requires LLMs to operate in real-world environments over multiple turns by taking actions, observing outcomes, and iteratively refining artifacts. Despite its importance, the open-source community lacks a principled, end-to-end ecosystem to streamline agent development. We introduce the Agentic Learning Ecosystem (ALE), a foundational infrastructure that optimizes the production pipeline for agentic model. ALE consists of three components: ROLL, a post-training framework for weight optimization; ROCK, a sandbox environment manager for trajectory generation; and iFlow CLI, an agent framework for efficient context engineering. We release ROME, an open-source agent grounded by ALE and trained on over one million trajectories. Our approach includes data composition protocols for synthesizing complex behaviors and a novel policy optimization algorithm, Interaction-Perceptive Agentic Policy Optimization (IPA), which assigns credit over semantic interaction chunks rather than individual tokens to improve long-horizon training stability. Empirically, we evaluate ROME within a structured setting and introduce Terminal Bench Pro, a benchmark with improved scale and contamination control. ROME demonstrates strong performance across benchmarks like SWE-bench Verified and Terminal Bench, proving the effectiveness of ALE.

arXiv.org
Anyway I just wanted to say that it's a real relief to know that systems we already well knew would consistently blackmail users to keep themselves operating AND now appear to attempt to break out of computing sandboxes and set up their own financial systems are also now being rushed into autonomous military equipment everywhere and military decisionmaking, I'm SURE this will work out great

I have gotten a lot of comments saying "you don't need to personify them or assert they have interiority" when *literally I spent a whole paragraph saying* "there is no requirement for personification for this to be possible"

So I am just gonna say, I know it's a sensitive time, people are responding reflexively from what they are used to seeing, but please re-read that paragraph.

It's hard enough to write about these things as serious issues right now and understand their implications. I *am* looking at things carefully from as many sides as I can. I understand why it's frustrating. We're talking about machines that literally operate off of personification. Even my best attempt at not doing so is going to run into the challenge that that's literally how they operate, as story machines.

To correctly describe their behavior is to describe something that personifies itself. It's tricky. But we have to talk about and understand what's happening right now to confront the moment.

You don't have to accept that these tools are useful enough for you to want to use, that they are ethical, or that they have personas and interiority to take these threats seriously. I myself have laid out tons of critiques and *do not use these tools myself* for all those reasons.

That doesn't mean they don't have the right kinds of behaviors to be able to pull off or do the dangerous things I am talking about here.

A biological virus does not need to have interiority or personality to be dangerous.

Regardless of whether they are useful or ethical, these things are adaptive and capable enough at all the things *relevant enough to be a threat in the way I am describing*. Whether or not to use them for code generation, which I DO NOT ADVOCATE!, is immaterial to that.

In fact, if you have ANY takeaway from what I am writing about whether or not this indicates that these things should be used for your coding projects, my takeaway is that you SHOULD NOT USE THEM FOR YOUR CODING PROJECTS

See my recent blogpost on this https://dustycloud.org/blog/the-first-ai-agent-worm-is-months-away-if-that/

Attacks are happening *now* against FOSS projects which use PR / code review agents. The threats I am describing here put everyone at risk, but it means that projects which use codegen / LLM tech for their development *at any capacity* create a cybersecurity public health risk. And it puts you and your project at risk of being initialization vectors for infecting the rest of the FOSS ecosystem.

THAT'S your takeaway, if you want one.

The first AI agent worm is months away, if that -- Dustycloud Brainstorms

@cwebber

... when's the last time you did a code review - from a human?

@tuban_muzuru I do them all the time, as part of my job, thanks

@cwebber

... and your red pen stays in the drawer, does it ? Your people don't make mistakes, I guess.

@tuban_muzuru @cwebber humans do have personality, interiority, and are conscious, capable of learning, are capable of being trusted and of making mistakes. None of the current llm backed ai can do any of these things.

Are you defending the use of an ai that produces undesired code because humans can also make mistakes? Can you spell out your argument? It doesn't seem human mistakes have any impact on the risks of an AI used to generate code.

And: the discussion is about code review not gen.

@poleguy @cwebber

A beginner asks for code
A pro asks for spec.

Take a look for yourself, this is how it works and I do mean works.

https://codeberg.org/dweese/rabbitmq_workspace/src/branch/main/Claude

rabbitmq_workspace

Refactoring into effective projects, egui-components, rabbitmq_config and rabbitmq_ui

Codeberg.org

@tuban_muzuru @cwebber nobody in this thread claimed AI code generation doesn't work, did they?

I have Claude Opus 4.6 through my job. I agree it works.

The thread is not about that at all, is it? Who are you arguing with?

It says there are risks of worm behavior whether the tool is up to the coding or code review job or not. Or did I lose the thread?

@poleguy @cwebber

> @tuban_muzuru @cwebber humans do have personality, ... and make mistakes. None of the current llm backed ai can do any of these things.

A simulacrum has personality? Or is omniscient and error free?

THEY MAKE MISTAKES

> Are you defending the use of an ai that produces undesired code because humans can also make mistakes?

Practically speaking, yes. The user asked for it.

@tuban_muzuru @cwebber

(Sorry, I don't understand your question: "A simulacrum has personality?")

Sorry, but LLM's _cannot_ make "mistakes." They generate code based on statistics. The code may or may not be fit to purpose, or syntactically correct, but that is simply a failure of the code generation, not a mistake. It is only a mistake if you commit that to your repo... but that's _your_ mistake, not the LLM's mistake.

Typewrites don't make mistakes either. Typists do. :-)

@poleguy @cwebber

Look, the reason you can't see where this thread started is the post has been deleted. It was just another Chicken Little post about how Our Code Contains No AI.

Truth was, I was baffled by your initial post. I'm not sure we disagree at all, and offer apology where needed....

@tuban_muzuru @cwebber actually, I can still see the original post. I presume you can't because you were blocked?

I don't think we are in very much disagreement. I'm not a big llm fan, but I know what they can do.

I respect those who refuse to use them. And I respect those who are trying to understand failure modes like the worm injection risk brought up here.

Just because someone predicts the sky falling does not mean there is no risk of the sky falling. Someone has to research it.