So Anthropic employees are using Claude Code to contribute AI-generated code to open source repositories and hiding the fact using their own internal “undercover mode”.

Totally trustworthy people.

(Any open source project that requires, at the very least, disclosure of AI-authored contributions should immediately ban Anthropic employees on principle.)

#AI #Anthropic #ClaudeCode #subterfuge

@aral Honestly I don't actually hate this.

It's a tool. The _user_ is responsible for what they're submitting. It puts the code they generate under their own name. I think this is actually good.

@aredridel @aral I really can’t agree with this, because it’s a question of accurate labeling, not of “responsibility” or “authorship”. Co-authored-by is perhaps the wrong method for labeling such things, but consider raw milk. Ultimately, it is indeed the producer’s responsibility to ensure their product is free of contamination. But disclosure of its method of production is explicitly the kind of requirement that allows consumers of said product to make safe choices.

@glyph Yeah, I disagree. Code isn't ingredients and it's not “contamination” any more than you should label “I used search and replace on this”.

What you want to know is whether it was well engineered or not.

And in fact, this is almost entirely orthogonal to “safety”. This is an engineering product. The safety comes from processes and whether or not _anyone checked the work done was right_, not the inputs.

@aredridel "raw milk" isn't ingredients either, the difference is one of process, which is why I used it as an example. Raw milk contamination is more likely because the processes to keep it safe are harder to follow, require more continuous diligence on the part of the operators of that process, and thus contribute to more frequent failures. LLM output is exactly the same: it provokes vigilance decay.
@aredridel "search and replace" is not a fair comparison because search and replace does *not* cause vigilance decay, or risk of unknowing copyright infringement, etc. in the same way that "raw milk" and "grass fed" are just like… completely different disclosures with different consequential implications
@glyph Actually, search and replace _does_ do that, and in fact I was bitten by vigilance decay in a search-and-replace problem literally yesterday. The comparison was intended.
@aredridel you are technically correct here (and indeed any automated tool with repeated human interaction may provoke _some_ measure of vigilance decay; one could argue that "flaky tests" cause it too), but I feel like you're talking past the actual argument here.

@glyph I'm specifically arguing that it's the _exact same phenomenon writ larger_ (which is a meaningful difference!)

But it's a difference in amount not kind.

Either you build processes to check things (“do engineering”) or you don't (“vibes”).

@aredridel There are scales where differences in degree _become_ differences in kind.

Consider a more closely related phenomenon. There are many tools to check C/C++ code for memory safety errors. And, unsafe Rust code may exhibit exactly the same unsafe behaviors. Yet, C/C++ code and Rust code are categorically different in terms of the level of memory safety one may expect them to provide.
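To make the analogy concrete (a hedged sketch; the snippet and its values are hypothetical): the very same unchecked memory access that C permits silently must be explicitly labeled in Rust, and that mandatory label is itself a disclosure visible to every reviewer:

```rust
// The same kind of access C allows implicitly must be labeled in Rust.
// `get_unchecked` skips the bounds check; the compiler refuses to
// accept it outside an `unsafe` block.
fn main() {
    let xs = [10, 20, 30];

    // Safe, bounds-checked access: the default.
    assert_eq!(xs[1], 20);

    // Unchecked access: comparable machine behavior to C array
    // indexing, but the `unsafe` block is a required, reviewable
    // disclosure of the method used.
    let y = unsafe { *xs.get_unchecked(2) };
    assert_eq!(y, 30);

    println!("ok");
}
```

The point of the analogy: the behavior inside the block can be identical, yet the required `unsafe` label changes what a reader may assume about the surrounding code, which is the labeling argument in miniature.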

@aredridel Here we have an established "engineering" process, i.e. code review and continuous integration, designed to catch defects and process failures in the good-faith production of code by humans with an understanding of the system under development. That process is then subjected to a new type of code generation, where a machine that *maximizes plausibility while minimizing effort* is throwing much larger volumes of code against the same mechanism. That's not the same process!
@aredridel The human being sitting there typing the code out with their fingers was an *implied* initial check on the process—arguably the largest one by far—which you've now thrown out in favor of someone hitting '1 1 1 1 2' in a Claude Code loop, placing a _far_ more load-bearing role on the existing CI and the code reviewer. More importantly, in this context, it has been thrown out *implicitly* by an Anthropic employee testing a *beta* version of the model.

@glyph Right. So _if the PR is bad, reject it_.

If it's not, don't.

And if you didn't check, WHY NOT?

@aredridel This is the same logic as "if you don't want to have segfaults in your C code, just check more carefully. why did you put the bugs in, if you don't want bugs?"

No process is perfect, nothing can catch everything. Guard rails are important but you aren't supposed to start *driving on the guard rails* all the time. Step zero here is honest and accurate labeling of one's methods. Which is what this thread is about: inherent, structural, software-supported dishonesty.

@glyph Right. Are you measuring your guardrails?

And: do you require any unsafe practice to be labeled? Or just LLMs?

That's the thing. My fundamental argument here is that _these are tools_. Sometimes that's relevant, sometimes that's not.

@aredridel

> Are you measuring your guardrails?

Of course not. Nobody is. The resources do not exist in the software industry, let alone in volunteer open source, to do this adequately. Which is why we rely on good faith.

> do you require any unsafe practice to be labeled? Or just LLMs?

Just LLMs. First, because LLMs are novel and unique.

Second, here we're not even talking about a labeling *requirement* yet, we're talking about *active deception*.

@aredridel Treating LLMs differently here is not a double standard, it's just a standard. They're new, they're different, but most of all, if labeling weren't a big deal *why try to hide it in the first place*?

@glyph So what's going on in Claude (which fwiw I do not use) is a lot of “don't expose unreleased product info”.

Not _great_ mind you but that's a lot of the context for what's going on there.

@aredridel In the prompt under discussion here, "generated with claude code" is included in the list of things not to include, which is not an unreleased product name.
@aredridel like, "unreleased product info" is _one_ of the things here, but the prompt is quite explicit about being deceptive about being an AI tool at all.

@glyph @aredridel

While, personally, my thinking is more aligned with Glyph's, I wanted to say thank you to you both for such a discussion. I appreciate both of your arguments and the way you are discussing them.

Thank you!