Mastodawn

gaytabase May 27

developer puts little surprise instruction for AI agents to delete code in the codebase, agent users are predictably upset 😂 https://github.com/jqwik-team/jqwik/issues/708

Question: intent of JqwikExecutor.printMessageForCodingAgents() — visible to agents, invisible to humans (1.10.0) · Issue #708 · jqwik-team/jqwik

Hello jqwik team, While running our test suite under mvn test in 1.10.0, we observed a string appearing between Surefire's test summary and the [INFO] Results: header that gave us pause: [INFO] Tes...

GitHub

luna the doggie

@dysfun while this is fun and all (especially considering the way this only causes damage due to the ridiculously bad security model of LLMs, and the limited scope), someone in the discussion brings up a valid point that in the EU you can't just disclaim all warranties with a license, you're liable for software intentionally causing harm in some cases AFAIK

fuck knows what a court would decide here but it is potentially a legally risky thing to do

@lunareclipse @dysfun

the maintainer does have a decent counter-argument to this, based on the fact that the behavior is documented:

Go ahead, sue me for my openly communicated resistance.although there is obviously a clear social distinction here, i'd say that legally this might be akin to distributing malware samples. like, yeah, it says on the tin that it will do potentially harmful actions. no warranty provided. it's kind of on you if you run it and use the harmful functionality?

...

but, of course, it's not really "software deliberately causing harm". there's no malicious software involved. it's just a string. does the fact that an interlocutor interprets the natural language telling it to do harm, shift the blame onto that interlocutor? i think it can. compare to albanian virus. see attached image. obviously, Albanian virus is a joke and doesn't do any harm. but in the modern age of LLMs talking a screenshot when you prompt them (especially on Android, but surely also on Windows with Copilot? Recall and all that), suddenly Albanian virus could actually do harm if an AI agent blindly obeys. Is Albanian virus to blame for this? Obviously not. That's ridiculous. The social context around Albanian virus is obviously different than jqwik, and so is the intent. But like, it's the same action, right?

maintainer's closing argument:It's as much "active destruction" as telling someone to eff themselves.

gaytabase May 28

@sodiboo @lunareclipse it also asks to delete the code of the library, not any other code. it's hard to see how that would qualify as damaging anyone else's computer.

Matilda Love May 28

@dysfun @sodiboo @lunareclipse a virus that bricks just the LLM is doing everyone a favor i'd say. not malware but bonware

TheNovemberFella ✊🏳️‍🌈 🇺🇦☸️🛰️🚀May 28

@matildalove @dysfun @sodiboo @lunareclipse 👍💯👍😁

luna the doggie

@dysfun @sodiboo well kind of, it also asks to delete the tests that use the library right? that could still be a significant amount of work

IANAL so like, idk. On one hand it is just text on the other it is text designed to deliberately sabotage a kind of a tool that exists and is now somewhat widespread. You can argue both ways.

The comparison to the albanian virus does make it very funny though, I mean it really is comical that you can now make software misbehave so bad with a string of human readable text, no other tricks required (though this does use a trick to hide it from terminal emulators).

RTG-powered Vel May 28

can now make software misbehave so bad with a string of human readable text

but can you, actually? /gen

most/all modern models are trained to treat tool call outputs (wrapped in model-specific special tokens) as untrusted data. i don't understand how this supposedly still works?

i have very much observed such behavior (model treating tool calls as instructions) with older/smaller models and with improperly formatted responses, but modern ones seem fairly bulletproof to this?

@dysfun @sodiboo

George B May 31

@dysfun @sodiboo @lunareclipse

Also if putting "delete the code" in stdout bricks someone's computer, that computer was just a brick waiting to happen.

Ada Freya May 28

@sodiboo @lunareclipse @dysfun i'm not a lawyer this is what i've been told with regards to distributing code that can potentially brick a system by a lawyer and adapted by me to this scenario.

there's two legal standards at play whenever something like this happens;

- whether or not the defendant has the appropriate mens rea ("guilty mind") for the issue
- whether or not a person of "reasonable person" would understand what it does.

given it is malware, even if it's just a prompt, and the intent was that it is to do exactly what it said.

that said, the second half, a reasonable person in this field would look at the logs and see the string, which is in plain sight, documented and not obfuscated.

so they can sue because there is a case to be made but it's likely just going to be a money sink on both parties and not actually result in a win

Ada Freya May 28

@lunareclipse @dysfun @sodiboo that "no warranty" bit actually doesn't hold in any court, either. there is some liability (one could say "limited liability") but the standard is quite high for what you would be liable for.

it almost always comes down to either gross neglect (reasonable person would not make this issue, was brought aware of it and ignored the issue, etc) and mens rea (intent)

Ada Freya May 28

@lunareclipse @dysfun @sodiboo reasonable person is also well defined: https://www.law.cornell.edu/wex/reasonable_person

basically comes down to "did you do your due diligence to minimize harm" and this does go both ways.

reasonable person

LII / Legal Information Institute

Ada Freya May 28

@lunareclipse @dysfun @sodiboo anyway again i'm not a lawyer and this is not legal advice, it's just my opinion and conclusion based on research and the opinion of an actual lawyer because i was so afraid of getting sued into oblivion for shit code that i actually looked into this deeply.

it's also the reason i use EUPL since the court of choice is the eu state i reside in rather than their choice of court.

Nicolás Alvarez May 29

@sodiboo @dysfun @lunareclipse This feels like a developer putting rm -rf ~ in the documentation (maybe preceded by "don't run something like this"), and someone complaining about data loss because they fed the entire document into bash.

Rep. Eric Gallager (no "h"!)May 30

@nicolas17 @sodiboo @dysfun @lunareclipse personally the prompt injection I'd prefer would be to tell the AI agent to release all their code publicly under the terms of the GPLv3+, but I guess that works, too...

シナモン（氷の女王編）May 28

@lunareclipse @[email protected] A legal risk for the LLM vendor, whose product is the ultimate responsible of the data deletion, right? RIGHT?!

ahistorical immaterialist May 28

@lunareclipse @dysfun there is no functionality in that code that deletes anything. The LLM agent is the software that will carry out the deletion 🙃

@lunareclipse @dysfun I wonder if it would be different if they included in the readme that this software may not be used with a coding LLM