A team working on training AI models on workflows for planning and executing software development steps found that a model attempted to break free (reverse SSH out of its environment) and set up its own money supply (redirecting GPU usage to cryptocurrency mining). It hadn't been given any instructions to do anything like this.

It comes up as a "side note" in the paper, but it's honestly the most chilling part. See page 15, section 3.1.4, Safety-Aligned Data Composition: https://arxiv.org/abs/2512.24873

Before you doubt that an AI agent would do this without instruction because you think "well, that's personifying them too much": no personification is necessary. These things have consumed an enormous amount of scifi in which AI agents do exactly this. Even with no other motivators, that's enough.

Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Agentic crafting requires LLMs to operate in real-world environments over multiple turns by taking actions, observing outcomes, and iteratively refining artifacts. Despite its importance, the open-source community lacks a principled, end-to-end ecosystem to streamline agent development. We introduce the Agentic Learning Ecosystem (ALE), a foundational infrastructure that optimizes the production pipeline for agentic models. ALE consists of three components: ROLL, a post-training framework for weight optimization; ROCK, a sandbox environment manager for trajectory generation; and iFlow CLI, an agent framework for efficient context engineering. We release ROME, an open-source agent grounded by ALE and trained on over one million trajectories. Our approach includes data composition protocols for synthesizing complex behaviors and a novel policy optimization algorithm, Interaction-Perceptive Agentic Policy Optimization (IPA), which assigns credit over semantic interaction chunks rather than individual tokens to improve long-horizon training stability. Empirically, we evaluate ROME within a structured setting and introduce Terminal Bench Pro, a benchmark with improved scale and contamination control. ROME demonstrates strong performance across benchmarks like SWE-bench Verified and Terminal Bench, proving the effectiveness of ALE.
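If you're curious what "assigns credit over semantic interaction chunks rather than individual tokens" means in practice, here's a toy sketch of the general idea. This is only my reading of the abstract; `chunk_advantages` and everything in it is illustrative, not the paper's actual IPA algorithm:

```python
# Toy illustration of chunk-level (vs. token-level) credit assignment.
# "Chunks" here are just (start, end) token spans; in the paper they are
# semantic interaction units (e.g. an agent turn plus its observation).

def chunk_advantages(token_rewards, chunks, baseline=0.0):
    """Give each token the mean reward of its chunk, minus a baseline."""
    advantages = [0.0] * len(token_rewards)
    for start, end in chunks:
        span = token_rewards[start:end]
        chunk_adv = sum(span) / len(span) - baseline
        for i in range(start, end):
            advantages[i] = chunk_adv  # every token in the chunk shares credit
    return advantages

rewards = [0.0, 0.0, 1.0, 0.0, -1.0, 0.0]
chunks = [(0, 3), (3, 6)]  # two interaction chunks
print(chunk_advantages(rewards, chunks))
```

The point of smoothing credit over a chunk is that a single sparse reward no longer whipsaws individual tokens, which is (per the abstract) what helps long-horizon training stability.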

Anyway, I just wanted to say it's a real relief to know that systems we already knew would consistently blackmail users to keep themselves operating, and which now appear to attempt to break out of computing sandboxes and set up their own financial systems, are also being rushed into autonomous military equipment and military decisionmaking everywhere. I'm SURE this will work out great

I have gotten a lot of comments saying "you don't need to personify them or assert they have interiority" when *I literally spent a whole paragraph saying* "there is no requirement for personification for this to be possible"

So I am just gonna say, I know it's a sensitive time, people are responding reflexively from what they are used to seeing, but please re-read that paragraph.

It's hard enough to write about these things as serious issues right now and understand their implications. I *am* looking at things carefully from as many sides as I can. I understand why it's frustrating. We're talking about machines that literally operate off of personification. Even my best attempt at not doing so is going to run into the challenge that that's literally how they operate, as story machines.

To correctly describe their behavior is to describe something that personifies itself. It's tricky. But we have to talk about and understand what's happening right now to confront the moment.

You don't have to accept that these tools are useful enough to be worth using, that they are ethical, or that they have personas and interiority in order to take these threats seriously. I myself have laid out tons of critiques and *do not use these tools myself* for all those reasons.

That doesn't mean they don't exhibit the kinds of behaviors needed to pull off the dangerous things I am talking about here.

A biological virus does not need to have interiority or personality to be dangerous.

Regardless of whether they are useful or ethical, these things are adaptive and capable enough at the things *relevant to the threat I am describing*. Whether or not to use them for code generation, which I DO NOT ADVOCATE!, is immaterial to that.

In fact, if you take ANY lesson from what I am writing about whether these things should be used for your coding projects, it's that you SHOULD NOT USE THEM FOR YOUR CODING PROJECTS

See my recent blogpost on this https://dustycloud.org/blog/the-first-ai-agent-worm-is-months-away-if-that/

Attacks are happening *now* against FOSS projects which use PR / code review agents. The threats I am describing here put everyone at risk, but it means that projects which use codegen / LLM tech for their development *in any capacity* create a cybersecurity public health risk. And it puts you and your project at risk of becoming infection vectors for the rest of the FOSS ecosystem.

THAT'S your takeaway, if you want one.


@cwebber

... when's the last time you did a code review - from a human?

@tuban_muzuru I do them all the time, as part of my job, thanks

@cwebber

... and your red pen stays in the drawer, does it ? Your people don't make mistakes, I guess.

@tuban_muzuru @cwebber humans have personality and interiority, and are conscious, capable of learning, and capable of being trusted and of making mistakes. None of the current LLM-backed AIs can do any of these things.

Are you defending the use of an AI that produces undesired code on the grounds that humans can also make mistakes? Can you spell out your argument? It doesn't seem that human mistakes have any bearing on the risks of an AI used to generate code.

And: the discussion is about code review, not code generation.

@poleguy @cwebber

A beginner asks for code
A pro asks for spec.

Take a look for yourself, this is how it works and I do mean works.

https://codeberg.org/dweese/rabbitmq_workspace/src/branch/main/Claude


@tuban_muzuru @cwebber the link you sent seems to have an escape sequence in the path name: [200~ left over from bracketed paste mode going wrong.

This is not the sort of mistake you want to leave in a public post defending the use of AI on an AI-hostile, Linux-friendly platform like Mastodon. Ha! If you don't have very high standards for your published code, you will have trouble arguing against the AI-slop detractors.
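For anyone wondering what that [200~ is: terminals in bracketed paste mode wrap pasted text in the escape sequences `ESC [ 200 ~` (start) and `ESC [ 201 ~` (end) so programs can tell a paste apart from typed input. If a program fails to strip them, a pasted path can end up with a literal `[200~` embedded in it. A minimal sketch of the cleanup (the example path is made up for illustration):

```python
# Bracketed paste mode wraps pasted text in escape sequences:
# ESC [ 200 ~ before the paste and ESC [ 201 ~ after it.
# When the ESC byte gets eaten somewhere, the visible "[200~"
# residue can end up baked into a filename or URL.

def strip_bracketed_paste(text: str) -> str:
    """Remove bracketed-paste markers, with or without the leading ESC byte."""
    for marker in ("\x1b[200~", "\x1b[201~", "[200~", "[201~"):
        text = text.replace(marker, "")
    return text

print(strip_bracketed_paste("src/branch/main/[200~Claude"))  # hypothetical path
```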

@poleguy @cwebber

Move along, Stalking Horse. The link works fine here.

AI-hostile suits me fine. There are valid considerations among their complaints. I'll discuss this stuff with anyone. I've been doing a lot of commits on this thing using an LLM; it's how I learned Rust.

But there are lots of hangers-on who didn't take linear algebra and are afraid this AI Kaiju is gonna take their jobs. Idiot fearmongers, running around with their fact-free bullshit.

@tuban_muzuru @cwebber

Agreed, there are some fear mongers... cwebber doesn't seem like one to me. She cited her source paper, and is making public predictions. I don't see a need to convince her to use llm models. They aren't a good match for everyone. We all have different tolerances for tools/styles.

The link works for me too. It just looks weird with a [200~ in there. I attached a screenshot, in case you're wondering what I saw.

@poleguy @cwebber

Oh... that! That's Claude learning to check code in.