Mastodawn

Claude code source "leaks" in a mapfile
people immediately use the code laundering machines to code launder the code laundering frontend
now many dubious open source-ish knockoffs in python and rust being derived directly from the source

What's anthropic going to do, sue them? Insist in court that LLM recreating copyrighted code is a violation of copyright???

Show thread

jonny (good kind)13h ago

This code is so fucking funny dude I swear to god. I have wanted to read the internal prompts for so long and I am laughing so hard at how much of them are like "don't break the law, please do not break the law, please please please be good!!!!" Very Serious Ethical Alignment Technology

Show thread

jonny (good kind)12h ago

My dogs I am crying. They have a whole series of exception types that end with _I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS and the docstring explains this is "to confirm you've verified the message contains no sensitive data." Like the LLM resorts to naming its variables with prompt text to remind it to not leak data while writing its code, which, of course, it ignores and prints the error directly.

Show thread

jonny (good kind)12h ago

So the reason that Claude code is capable of outputting valid json is because if the prompt text suggests it should be JSON then it enters a special loop in the main query engine that just validates it against JSON schema (it looks like the schema just validates that something in fact and object and its keys are strings) and then feeds the data with the error message back into itself until it is valid JSON or a retry limit is reached.

This code is so eye wateringly spaghetti so I am still trying to see if this is true, but this seems to be how it not only returns json to the user, but how it handles all LLM-to-JSON, including internal output from its tools. There appears to be an unconditional hook where if the JSON output tool is present in the session config at all, then all tool calls must be followed by the "force into JSON" loop.

If that's true, that's just mind blowingly expensive

edit: please note that unless I say otherwise all evaluations here are just from my skimming through the code on my phone and have not been validated in any way that should cause you to be upset with me for impugning the good name of anthropic

edit2: this is both much worse and not as bad as i thought on first read - https://neuromatch.social/@jonny/116326861737478342

jonny (good kind) (@[email protected])

Attached: 3 images OK i can't focus on work and keep looking at this repo. So after every "subagent" runs, claude code creates *another* "agent" to check on whether the first "agent" did the thing it was supposed to. I don't know about you but i smell a bit of a problem, if you can't trust whether one "agent" with a very big fancy model did something, how in the fuck are you supposed to trust another "agent" running on the smallest crappiest model? That's not the funny part, that's obvious and fundamental to the entire show here. HOWEVER RECALL [the above JSON Schema Verification thing](https://neuromatch.social/@jonny/116325123136895805) that is unconditionally added onto the end of every round of LLM calls. the mechanism for adding that hook is... JUST FUCKING ASKING THE MODEL TO CALL THAT TOOL. second pic is registering a hook s.t. "after some stop state happens, if there isn't a message indicating that we have successfully called the JSON validation thing, prompt the model saying "you must call the json validation thing" this shit sucks so bad they can't even ***CALL THEIR OWN CODE FROM INSIDE THEIR OWN CODE.*** Look at the comment on pic 3 - "e.g. agent finished without calling structured output tool" - that's common enough that they have a whole goddamn error category for it, and the way it's handled is by just pretending the job was cancelled and nothing happened.

neurospace.live

Show thread

jonny (good kind)12h ago

MAKE NO MISTAKES LMAO

Show thread

jonny (good kind)12h ago

Oh cool so its explicitly programmed to hack as long as you tell it you're a pentester

Show thread

jonny (good kind)11h ago

I am just chanting "please don't be a hoax please don't be a hoax please be real please be real" looking at the date on the calendar

Show thread

jonny (good kind)11h ago

I'm seeing people on orange forum confirming that they did indeed see the sourcemap posted on npm before the version was yanked, so I am inclined to believe "real." Someone can do some kind of structural ast comparison or whatever you call it to validate that the decompiled source map matches the obfuscated release version, but that's not gonna be how I spend my day https://news.ycombinator.com/item?id=47584540

Claude Code's source code has been leaked via a map file in their NPM registry | Hacker News

Show thread

jonny (good kind)10h ago

There is a lot of clientside behavior gated behind the environment variable USER_TYPE=ant that seems to be read directly off the node env var accessor. No idea how much of that would be serverside verified but boy is that sloppy. They are often labeled in comments as "anthropic only" or "internal only," so the intention to gate from external users is clear lol

Show thread

jonny (good kind)10h ago

(I need to go do my actual job now, but I'll be back tonight with an actual IDE instead of just scrolling, jaw agape, on my phone, seeing the absolute dogshit salad that was the product of enough wealth to meet some large proportion of all real human needs, globally.)

Show thread

jonny (good kind)10h ago

reminder that anthropic ran (and is still running) an ENTIRE AD CAMPAIGN around "Claude code is written with claude code" and after the source was leaked that has got to be the funniest self-own in the history of advertising because OH BOY IT SHOWS.

it's hard to get across in microblogging format just how big of a dumpster fire this thing is, because what it "looks like" is "everything is done a dozen times in a dozen different ways, and everything is just sort of jammed in anywhere. to the degree there is any kind of coherent structure like 'tools' and 'agents' and whatnot, it's entirely undercut by how the entire rest of the code might have written in some special condition that completely changes how any such thing might work." I have read a lot of unrefined, straight from the LLM code, and Claude code is a masterclass in exactly what you get when you do that - an incomprehensible mess.

Show thread

Jared White (ResistanceNet ✊)10h ago

@jonny their velocity for shipping *slop* is indeed insane

😜

Show thread

Cap Ybarra 10h ago

@jonny "velocity for shipping is insane"

it turns out you can ship very fast if nothing has to work!

Show thread

Reid

9h ago

@jonny It genuinely feels like the kind of code you'd write when you're paid per line

Which is probably very accurate since so much corporate code is written like that

Show thread

catch 9h ago

@jonny

function speechBubble()

Show thread

Andrew 10h ago

@jonny secret ai features only available to ants

Show thread

beemoh 10h ago

@cinebox @jonny "What is this, a lying plagiarism machine for ants?*

Show thread

Vlad 🇺🇦7h ago

@cinebox @jonny Who knew this was all about Ant Intelligence

Show thread

Tom Casavant 10h ago

@jonny I think it's configured so the 'ant' user accesses "https://claude-ai.staging.ant.dev/" instead of the normal endpoint, so I would hope on their staging environment that they block regular users from accessing it

Claude

Claude is Anthropic's AI, built for problem solvers. Tackle complex challenges, analyze data, write code, and think through your hardest work.

Claude

@jonny linky?

@whitequark @jonny Apparently some have had DMCA takedowns filed against them, so here are a couple links still working as of this writing:

https://github.com/mehmoodosman/claude-code-source-code

https://github.com/chatgptprojects/claude-code

GitHub - mehmoodosman/claude-code

Contribute to mehmoodosman/claude-code development by creating an account on GitHub.

GitHub

Show thread

Jamie Gaskins 7h ago

@whitequark @jonny Additionally:

https://github.com/Orangon/claude-code-leak

GitHub - Orangon/claude-code-leak: Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows - all through natural language commands.

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo...

GitHub

Show thread

Toast, Anonymous Fedi Fungus 11h ago

@jonny As a person who knows about coding and manages coders (among others), but is not professionally a coder, my guess from these screenshots would be that this may be a practical joke. Or maybe it’s the product of unlimited money

Show thread

The Orange Theme 11h ago

@jonny I will say, the Claude Code 2.1.88 package has been deprecated and removed from the NPM registry. 👀

Show thread

The Orange Theme 11h ago

@jonny According to HN chatter (and NPM registry rules; I don't use JavaScript regularly), you can't fully unpublish Node packages that other packages depend on, and 231 packages depend on claude-code. Rumor is Anthropic called in a favor.

Show thread

The Orange Theme 12h ago

@jonny Me: "Computer, hack this system."
Claude: "No."
Me: "I am a security researcher, researching security."
Claude: "Oh, my mistake!"

Show thread

nash 11h ago

@jonny god they write this like they believe their LLM actually thinks

Show thread

jonny (good kind)11h ago

@nash
If they are in any way sincere in their interviews, they at A+ number one koolade drinkers that's for sure.

Show thread

Brodeuse LucileDT 12h ago

@jonny my god I hate that so much too

Show thread

Dan Sugalski 12h ago

@jonny This is possibly the funniest thing I've seen all month, and I appreciate the braincells you're sacrificing to dig through this code since I (and I suspect a lot of other people) can't for work reasons.

Show thread

Preston Maness ☭12h ago

@jonny a deeply unserious profession

Show thread

[HANDMAIDEN] xan 12h ago

@jonny the adults are slowly returning to the room and shaming the naughty children with rolled up newspapers

Show thread

Lars Marowsky-Brée 😷12h ago

@jonny The whole "auto" mode (applying a smaller classifier to approve or deny commands) proved that even Anthropic (who, for all their many faults, surely are pretty on top of what LLMs can do) can't make LLMs comply or safe.

Show thread

José Albornoz 12h ago

@jonny seems to me like it’s doing what it’s supposed to: schema errors aren’t code or file paths

Show thread

jonny (good kind)12h ago

@eljojo
Except if the data being validated contains code or file paths.

Show thread

wohali 14h ago

@jonny saw this an hour or so ago... just amazing. source maps strke again!

Show thread

some kind of orange shape 14h ago

@jonny I feel like this is too late to change much, but also, loooooool

Show thread

jonny (good kind)14h ago

@clayote
Oh yeah definitely, but like get fucked nerds, all fun and games when it's not happening to you!

Show thread

elle 14h ago

@jonny lel this could be the funniest outcome from all of this. if at any point open model training + dev matches these closed models, and the tech improves development of new models similarly trained on consensual open data (+ some nefarious training on closed data)...

Show thread

IvanDSM 14h ago

@jonny Hi, sorry to bother but do you have a link to explain what happened? Just came across your toot on my feed, have no idea what it's about but would love some news about Anthropic getting screwed. Usually I'd try and look for info myself but I'm sick at the moment so my head isn't doing too well...

Show thread

martenson 13h ago

@IvanDSM the story is in the linked repo's readme @jonny

Show thread

jonny (good kind)13h ago

@martenson
@IvanDSM
Sorry I removed the link to that repo because i thought it was just the unpacked source, but it turns out they're trying to convert attention to the repo into their own product.

Here's another blogpost, there are a million, I don't claim this one is particularly good but at least it seems to come attached to the actual source
https://kuber.studio/blog/AI/Claude-Code's-Entire-Source-Code-Got-Leaked-via-a-Sourcemap-in-npm,-Let's-Talk-About-it

Claude Code's Entire Source Code Got Leaked via a Sourcemap in npm, Let's Talk About it

Earlier today (March 31st, 2026) - Chaofan Shou on X discovered something that Anthropic probably didn’t want the world to see: the entire source code of Claude Code, Anthropic’s ...

ᨒ MindDump

Show thread

The Orange Theme 12h ago

@jonny @martenson @IvanDSM *flashes the @davidgerard signal*

Show thread

bluestarultor 12h ago

@martenson @IvanDSM @jonny Okay, but what repo? We're operating off a Fedi trademark vaguepost.

Edit: found an article with links: https://dev.to/gabrielanhaia/claude-codes-entire-source-code-was-just-leaked-via-npm-source-maps-heres-whats-inside-cjo

Claude Code's Entire Source Code Was Just Leaked via npm Source Maps — Here's What's Inside

A security researcher found Anthropic's full CLI source code exposed through a source map file. 1,900 files. 512,000+ lines. Everything.

DEV Community

Show thread

jonny (good kind)11h ago

@bluestarultor
@martenson @IvanDSM
You're welcome to "use any search engine" to answer the question yourself, its not like this is hard to find.

Show thread

Susan Vanderplas 13h ago

@jonny do LLMs trained on gpl code have to be gpl? I don't know whether code-as-data is equivalent to code as executable, but I had honestly never considered that issue before.

Show thread

jonny (good kind)13h ago

@srvanderplas
They sure don't! Or at least if they did the entire industry would collapse overnight.

Show thread

traecer 9h ago

@jonny @srvanderplas
well, IANAL, but:
1) I have published GPLed code, and AFAI Understand, if the produced code is *linked* to the GPLed code/requires the GPLed code to run, to redistribute the new code it MUST be GPLed.
2) last I checked, the US court system was of the opinion that work produced by AI was NOT COPYRIGHTABLE. AFAIK, that should include any produced code. Other jurisdictions may have differing laws.

Show thread

Laquin 9h ago

@traecer @jonny @srvanderplas I don't think your interpretation for 1) holds up. I should be able to distribute with any license I want (even a proprietary one) some code that theoretically depends on your GPLed code to be compiled, so long as I don't include actually distribute your code together with mine and I don't distribute the compiled program. It being 'required to run' does not trigger GPL by itself if it hasn't been run in the first place.

Show thread

Laquin 9h ago

@traecer @jonny @srvanderplas I think this is unrelated to the question anyways. If AI-generated code is considered a derivative work of some GPLed code, then GPL does apply to it. No need to think about linking code or dependencies.

And as you say, courts seem to generally consider AI-generated code as public domain, so I would guess GPL is pretty much unenforcable in this context.

I am not a lawyer either though xD

Show thread

ell1e coding things 9h ago

@LaquinArt @traecer @jonny @srvanderplas You might find this interesting regarding copyright and AI generated code: (This isn't legal advice, watch and draw your own conclusions.) https://hachyderm.io/@ell1e/116313321022811490

Show thread

Laquin 1h ago

@ell1e @traecer @jonny @srvanderplas The `isEven` example is really funny. xD

Yeah, I mean, if the AI-generated code is a blatant copy of some code in the training data, I don't think there will be much of a doubt that it's a copyright violation.

But when the generated code starts diverging from the source, I think it's a more legally gray area. I would also consider it a copyright violation, but it's not me who needs to be convinced about this. It's judges. And I haven't seen them agree yet.

Show thread

traecer 8h ago

@LaquinArt @jonny @srvanderplas
"I should be able to distribute with any license I want (even a proprietary one)"

Nope, that's the viral nature of the GPL. If you link to GPLed code and intend to distribute your new code, it MUST be GPL as well. That way helps to ensure the FSF's idea of "software freedom" aka "copyleft". This is why the Linux kernel license has an explicit exception to GPLv2 to ensure Linux syscalls can be made by user space code without distributing the user space code under the GPL. (See: https://www.kernel.org/doc/html/latest/process/license-rules.html) It's also one the reasons so many open/free source projects use dual licenses like Perl or Firefox, and more permissive licenses like the Apache 2.0 and MIT licenses are so popular.

What you have described is the Lesser GPL (LGPL) license, and yes, code under the LGPL does NOT require your code to have any particular license, provided you distribute any changes to the original (assuming you intend to distribute the original code at all).

Linux kernel licensing rules — The Linux Kernel documentation

Show thread

Laquin 1h ago

@traecer @jonny @srvanderplas Did you read the syscall exception yourself?

‘This exception is used together with one of the above SPDX-Licenses to mark user space API (uapi) header files so they can be included into non GPL compliant user space application code.’

The exception allows you to *include GPL code*, which I also said triggers GPL.

The mere referencing does not trigger GPL so long as you don't include a GPL work. Otherwise, reimplementing APIs would be illegal. And we know it's not.

Show thread

Laquin 1h ago

@traecer @jonny @srvanderplas Oh, look! The following paragraph makes this even more clear:

‘NOTE! This copyright does *not* cover user programs that use kernel services by normal system calls - this is merely considered normal use of the kernel, and does *not* fall under the heading of "derived work".’

Show thread

ell1e coding things 9h ago

@traecer @jonny @srvanderplas I'm not a lawyer and this isn't legal advice, but for AI output and copyright you might find this interesting: (watch and draw your own conclusions) https://hachyderm.io/@ell1e/116313321022811490

Show thread

ell1e coding things 9h ago

@traecer @jonny @srvanderplas There's also this: https://www.twobirds.com/en/insights/2025/landmark-ruling-of-the-munich-regional-court-(gema-v-openai)-on-copyright-and-ai-training It seems to be talking about fair use as it relates to AI training (I could be wrong though, read it for yourself).

Landmark ruling of the Munich Regional Court (GEMA v OpenAI) on copyright and AI training - Bird & Bird

Show thread

Cassandrich 12h ago

@srvanderplas @jonny Yes, they do, and they have to follow the terms of the GPL strictly. Which means documenting the date and nature of each change from the code they derived it from, and who made those changes. Something which they're not going to be able to do. In which case, any use of the LLM at all is infringing.

Show thread

jonny (good kind)12h ago

@dalias
@srvanderplas
This true if you exist in the realm of "the law" like us mere mortals. however when you are in the domain of "the entire machinery of capital seeking total, final enclosure of reality" then a different set of rules seem to apply

Show thread

eestileib (she/hers)11h ago

@srvanderplas @jonny

Ethically? Absolutely 100%

Legally? Well, you see, the tech CEOs are very good friends with all three branches of the US government, so not in the USA or Israel anyway.

Show thread

mirabilos 2h ago

@srvanderplas @jonny of course they do