• Claude code source "leaks" in a mapfile
  • people immediately use the code laundering machines to code launder the code laundering frontend
  • now many dubious open source-ish knockoffs in python and rust being derived directly from the source

What's Anthropic going to do, sue them? Insist in court that an LLM recreating copyrighted code is a violation of copyright???

This code is so fucking funny dude I swear to god. I have wanted to read the internal prompts for so long and I am laughing so hard at how much of them are like "don't break the law, please do not break the law, please please please be good!!!!" Very Serious Ethical Alignment Technology
My dogs I am crying. They have a whole series of exception types that end with _I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS and the docstring explains this is "to confirm you've verified the message contains no sensitive data." Like the LLM resorts to naming its variables with prompt text to remind it to not leak data while writing its code, which, of course, it ignores and prints the error directly.
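For flavor, the pattern is roughly this. A sketch: only the `_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS` suffix and the docstring quote are from the leak; the rest of the class name and the body are invented.

```javascript
// Hypothetical reconstruction of the naming pattern; only the suffix is
// attested in the leak, the rest of this class is my own invention.
class SchemaValidationError_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS extends Error {
  // The suffix exists "to confirm you've verified the message contains no
  // sensitive data" -- a reminder embedded in the type name itself, which
  // the model can then ignore by printing the error directly anyway.
  constructor(message) {
    super(message);
    this.name = "SchemaValidationError_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS";
  }
}
```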

So the reason that Claude Code is capable of outputting valid JSON is that if the prompt text suggests it should be JSON, it enters a special loop in the main query engine that just validates the output against a JSON schema (it looks like the schema just validates that the thing is in fact an object and its keys are strings) and then feeds the data with the error message back into itself until it is valid JSON or a retry limit is reached.
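As far as I can tell, the loop is shaped something like the following. This is a minimal sketch from my skimming: every name in it (`forceIntoJson`, `queryModel`, `MAX_RETRIES`) is mine, not Anthropic's, and the real thing is buried in minified spaghetti.

```javascript
// Minimal sketch of the described "force into JSON" retry loop.
// All identifiers here are assumptions, not from the leaked source.
const MAX_RETRIES = 3;

// The reported "schema" barely validates anything: top-level value is an
// object, keys are strings (JS object keys are always strings anyway).
function validateJson(text) {
  try {
    const parsed = JSON.parse(text);
    if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
      return { ok: true };
    }
    return { ok: false, error: "top-level value is not an object" };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

// Feed the invalid output plus the parse error back into the model until it
// validates or the retry limit is reached.
async function forceIntoJson(queryModel, prompt) {
  let output = await queryModel(prompt);
  for (let attempt = 0; ; attempt++) {
    const { ok, error } = validateJson(output);
    if (ok) return output;
    if (attempt >= MAX_RETRIES) {
      throw new Error("retry limit reached without valid JSON");
    }
    output = await queryModel(
      `${prompt}\n\nYour previous reply was not valid JSON (${error}). ` +
      `Reply with valid JSON only. Previous reply:\n${output}`
    );
  }
}
```

Every failed attempt is a whole extra model call, which is where the expense comes from.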

This code is so eye-wateringly spaghetti that I am still trying to confirm this, but it seems to be how it not only returns JSON to the user, but how it handles all LLM-to-JSON conversion, including internal output from its tools. There appears to be an unconditional hook where, if the JSON output tool is present in the session config at all, every tool call must be followed by the "force into JSON" loop.

If that's true, that's just mind blowingly expensive

edit: please note that unless I say otherwise all evaluations here are just from my skimming through the code on my phone and have not been validated in any way that should cause you to be upset with me for impugning the good name of anthropic

edit2: this is both much worse and not as bad as i thought on first read - https://neuromatch.social/@jonny/116326861737478342

jonny (good kind) (@[email protected])

Attached: 3 images OK i can't focus on work and keep looking at this repo. So after every "subagent" runs, claude code creates *another* "agent" to check on whether the first "agent" did the thing it was supposed to. I don't know about you but i smell a bit of a problem, if you can't trust whether one "agent" with a very big fancy model did something, how in the fuck are you supposed to trust another "agent" running on the smallest crappiest model? That's not the funny part, that's obvious and fundamental to the entire show here. HOWEVER RECALL [the above JSON Schema Verification thing](https://neuromatch.social/@jonny/116325123136895805) that is unconditionally added onto the end of every round of LLM calls. the mechanism for adding that hook is... JUST FUCKING ASKING THE MODEL TO CALL THAT TOOL. second pic is registering a hook s.t. "after some stop state happens, if there isn't a message indicating that we have successfully called the JSON validation thing, prompt the model saying "you must call the json validation thing" this shit sucks so bad they can't even ***CALL THEIR OWN CODE FROM INSIDE THEIR OWN CODE.*** Look at the comment on pic 3 - "e.g. agent finished without calling structured output tool" - that's common enough that they have a whole goddamn error category for it, and the way it's handled is by just pretending the job was cancelled and nothing happened.
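If I'm reading the quoted mechanism right, the stop-hook amounts to something like this. Every identifier here is invented; it's a sketch of the described behavior, not the actual code.

```javascript
// Sketch of the described stop-hook: instead of calling the JSON validation
// tool directly from code, the harness inspects the transcript after the
// agent stops and, if the tool wasn't called, prompts the model to call it.
// All names are illustrative, not from the leak.
const MAX_NUDGES = 2;

function onAgentStop(messages, nudges) {
  const calledTool = messages.some(
    (m) => m.type === "tool_call" && m.tool === "structured_output"
  );
  if (calledTool) return { status: "done" };
  if (nudges >= MAX_NUDGES) {
    // "agent finished without calling structured output tool": reportedly
    // handled by pretending the job was cancelled.
    return { status: "cancelled" };
  }
  // Can't call their own code from inside their own code, so: just ask.
  return { status: "nudge", prompt: "You must call the structured output tool." };
}
```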

MAKE NO MISTAKES LMAO
Oh cool so its explicitly programmed to hack as long as you tell it you're a pentester
I am just chanting "please don't be a hoax please don't be a hoax please be real please be real" looking at the date on the calendar
I'm seeing people on orange forum confirming that they did indeed see the sourcemap posted on npm before the version was yanked, so I am inclined to believe "real." Someone can do some kind of structural ast comparison or whatever you call it to validate that the decompiled source map matches the obfuscated release version, but that's not gonna be how I spend my day https://news.ycombinator.com/item?id=47584540
Claude Code's source code has been leaked via a map file in their NPM registry | Hacker News

There is a lot of clientside behavior gated behind the environment variable USER_TYPE=ant that seems to be read directly off the node env var accessor. No idea how much of that would be serverside verified but boy is that sloppy. They are often labeled in comments as "anthropic only" or "internal only," so the intention to gate from external users is clear lol
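The gate itself is about as simple as it sounds; something like the following, where the function name is my own and only `USER_TYPE=ant` comes from the leak:

```javascript
// Sketch of the client-side gate described above: an "anthropic only"
// feature flag read straight off the Node environment, with no visible
// server-side check on the client side.
function isInternalUser(env = process.env) {
  return env.USER_TYPE === "ant";
}
```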
@jonny secret ai features only available to ants
@cinebox @jonny "What is this, a lying plagiarism machine for ants?"
@jonny linky?

@whitequark @jonny Apparently some have had DMCA takedowns filed against them, so here are a couple links still working as of this writing:

https://github.com/mehmoodosman/claude-code-source-code

https://github.com/chatgptprojects/claude-code

GitHub - mehmoodosman/claude-code

GitHub - Orangon/claude-code-leak: Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows - all through natural language commands.

@jonny As a person who knows about coding and manages coders (among others), but is not professionally a coder, my guess from these screenshots would be that this may be a practical joke. Or maybe it’s the product of unlimited money
@jonny I will say, the Claude Code 2.1.88 package has been deprecated and removed from the NPM registry. 👀
@jonny According to HN chatter (and NPM registry rules; I don't use JavaScript regularly), you can't fully unpublish Node packages that other packages depend on, and 231 packages depend on claude-code. Rumor is Anthropic called in a favor.
@jonny Me: "Computer, hack this system."
Claude: "No."
Me: "I am a security researcher, researching security."
Claude: "Oh, my mistake!"
@jonny god they write this like they believe their LLM actually thinks
@nash
If they are in any way sincere in their interviews, they are A+ number one kool-aid drinkers, that's for sure.
@jonny my god I hate that so much too
@jonny This is possibly the funniest thing I've seen all month, and I appreciate the braincells you're sacrificing to dig through this code since I (and I suspect a lot of other people) can't for work reasons.
@jonny a deeply unserious profession
@jonny the adults are slowly returning to the room and shaming the naughty children with rolled up newspapers
@jonny The whole "auto" mode (applying a smaller classifier to approve or deny commands) proved that even Anthropic (who, for all their many faults, surely are pretty on top of what LLMs can do) can't make LLMs comply or safe.
@jonny seems to me like it’s doing what it’s supposed to: schema errors aren’t code or file paths
@eljojo
Except if the data being validated contains code or file paths.
@jonny saw this an hour or so ago... just amazing. source maps strike again!
@jonny I feel like this is too late to change much, but also, loooooool
@clayote
Oh yeah definitely, but like get fucked nerds, all fun and games when it's not happening to you!
@jonny lel this could be the funniest outcome from all of this. if at any point open model training + dev matches these closed models, and the tech improves development of new models similarly trained on consensual open data (+ some nefarious training on closed data)...
@jonny Hi, sorry to bother but do you have a link to explain what happened? Just came across your toot on my feed, have no idea what it's about but would love some news about Anthropic getting screwed. Usually I'd try and look for info myself but I'm sick at the moment so my head isn't doing too well...
@IvanDSM the story is in the linked repo's readme @jonny

@martenson
@IvanDSM
Sorry I removed the link to that repo because i thought it was just the unpacked source, but it turns out they're trying to convert attention to the repo into their own product.

Here's another blogpost, there are a million, I don't claim this one is particularly good but at least it seems to come attached to the actual source
https://kuber.studio/blog/AI/Claude-Code's-Entire-Source-Code-Got-Leaked-via-a-Sourcemap-in-npm,-Let's-Talk-About-it

Claude Code's Entire Source Code Got Leaked via a Sourcemap in npm, Let's Talk About it

Earlier today (March 31st, 2026) - Chaofan Shou on X discovered something that Anthropic probably didn’t want the world to see: the entire source code of Claude Code, Anthropic’s ...

@jonny @martenson Thanks a lot for the link! I really appreciate it.

@martenson @IvanDSM @jonny Okay, but what repo? We're operating off a Fedi trademark vaguepost.

Edit: found an article with links: https://dev.to/gabrielanhaia/claude-codes-entire-source-code-was-just-leaked-via-npm-source-maps-heres-whats-inside-cjo

Claude Code's Entire Source Code Was Just Leaked via npm Source Maps — Here's What's Inside

A security researcher found Anthropic's full CLI source code exposed through a source map file. 1,900 files. 512,000+ lines. Everything.

@bluestarultor
@martenson @IvanDSM
You're welcome to "use any search engine" to answer the question yourself, it's not like this is hard to find.
@jonny do LLMs trained on gpl code have to be gpl? I don't know whether code-as-data is equivalent to code as executable, but I had honestly never considered that issue before.
@srvanderplas
They sure don't! Or at least if they did the entire industry would collapse overnight.
@jonny @srvanderplas
well, IANAL, but:
1) I have published GPLed code, and as far as I understand it, if the produced code is *linked* to the GPLed code / requires the GPLed code to run, then to redistribute the new code it MUST be GPLed.
2) last I checked, the US court system was of the opinion that work produced by AI was NOT COPYRIGHTABLE. AFAIK, that should include any produced code. Other jurisdictions may have differing laws.
@traecer @jonny @srvanderplas I don't think your interpretation of 1) holds up. I should be able to distribute with any license I want (even a proprietary one) some code that theoretically depends on your GPLed code to be compiled, so long as I don't actually distribute your code together with mine and I don't distribute the compiled program. Its being 'required to run' does not trigger the GPL by itself if it hasn't been run in the first place.

@traecer @jonny @srvanderplas I think this is unrelated to the question anyways. If AI-generated code is considered a derivative work of some GPLed code, then GPL does apply to it. No need to think about linking code or dependencies.

And as you say, courts seem to generally consider AI-generated code as public domain, so I would guess the GPL is pretty much unenforceable in this context.

I am not a lawyer either though xD

@LaquinArt @traecer @jonny @srvanderplas You might find this interesting regarding copyright and AI generated code: (This isn't legal advice, watch and draw your own conclusions.) https://hachyderm.io/@ell1e/116313321022811490

@ell1e @traecer @jonny @srvanderplas The `isEven` example is really funny. xD

Yeah, I mean, if the AI-generated code is a blatant copy of some code in the training data, I don't think there will be much of a doubt that it's a copyright violation.

But when the generated code starts diverging from the source, I think it's a more legally gray area. I would also consider it a copyright violation, but it's not me who needs to be convinced about this. It's judges. And I haven't seen them agree yet.

@LaquinArt @traecer @jonny @srvanderplas The person in the video is a lawyer, just to let you know. Also there's this: https://www.twobirds.com/en/insights/2025/landmark-ruling-of-the-munich-regional-court-(gema-v-openai)-on-copyright-and-ai-training

My main intention was to give you resources that may inform you about how settled (or not settled) what you previously said really is. Not that I know though, since I'm not a lawyer. This isn't legal advice.

But there are plenty of sources saying that LLMs directly copying seems to be a regular event, not a rarity: https://dl.acm.org/doi/10.1145/3543507.3583199

Landmark ruling of the Munich Regional Court (GEMA v OpenAI) on copyright and AI training - Bird & Bird

@LaquinArt @jonny @srvanderplas
"I should be able to distribute with any license I want (even a proprietary one)"

Nope, that's the viral nature of the GPL. If you link to GPLed code and intend to distribute your new code, it MUST be GPL as well. That helps ensure the FSF's idea of "software freedom," aka "copyleft." This is why the Linux kernel license has an explicit exception to GPLv2 to ensure Linux syscalls can be made by user space code without distributing the user space code under the GPL. (See: https://www.kernel.org/doc/html/latest/process/license-rules.html) It's also one of the reasons so many open/free source projects use dual licenses like Perl or Firefox, and why more permissive licenses like the Apache 2.0 and MIT licenses are so popular.

What you have described is the Lesser GPL (LGPL) license, and yes, code under the LGPL does NOT require your code to have any particular license, provided you distribute any changes to the original (assuming you intend to distribute the original code at all).

Linux kernel licensing rules — The Linux Kernel documentation

@traecer @jonny @srvanderplas Did you read the syscall exception yourself?

‘This exception is used together with one of the above SPDX-Licenses to mark user space API (uapi) header files so they can be included into non GPL compliant user space application code.’

The exception allows you to *include GPL code*, which I also said triggers GPL.

The mere referencing does not trigger GPL so long as you don't include a GPL work. Otherwise, reimplementing APIs would be illegal. And we know it's not.

@traecer @jonny @srvanderplas Oh, look! The following paragraph makes this even more clear:

‘NOTE! This copyright does *not* cover user programs that use kernel services by normal system calls - this is merely considered normal use of the kernel, and does *not* fall under the heading of "derived work".’

@traecer @jonny @srvanderplas Upon further investigation, I see that the FSF considers that a work dynamically linking a GPLed work is covered by GPL. From what I know, this has never been proven in court, and I don't think it would hold up.

Sure, once you run the program, there exists a combined work that should be subject to GPL. But this combined work is generated by the user and never distributed, so GPL is never triggered.

@traecer @jonny @srvanderplas It's not really a question about GPL as much as about copyright law. I don't see how just linking a library dynamically constitutes a derivative work if not a part of the library is distributed. And if it's not a derivative work, the license doesn't even come into play. As the licensor, you don't get to decide what constitutes a derivative work. That's for a court to decide.
@traecer @jonny @srvanderplas I'm not a lawyer and this isn't legal advice, but for AI output and copyright you might find this interesting: (watch and draw your own conclusions) https://hachyderm.io/@ell1e/116313321022811490
@traecer @jonny @srvanderplas There's also this: https://www.twobirds.com/en/insights/2025/landmark-ruling-of-the-munich-regional-court-(gema-v-openai)-on-copyright-and-ai-training It seems to be talking about fair use as it relates to AI training (I could be wrong though, read it for yourself).

@ell1e @traecer Strictly speaking, it’s not talking about that: “Fair use” is not a legal concept within that court’s jurisdiction.

EU law allows ignoring any and all copyright for “data mining”, and OpenAI tried to argue that since their business is data mining, they never have to care about copyright to begin with. This particular ruling simply says that if your product reproduces the lyrics of an entire song, that isn’t just data mining, it is in fact copying.

So what the court says is: Under current EU law, you’re allowed to copy as much data as you want for *training* your LLM, but that doesn’t mean you’re also allowed to actually provide LLM as a service to the public. (IANAL)

Note that this particular ruling is not legal precedent and it’s already being appealed.

@ajnn @traecer You say IANAL, but the lawyer in the clip seems to be a lawyer. Beyond that, I don't have much to say.
@ell1e @traecer I mean, the article you cite doesn’t even mention the words “fair use”? Just because it’s what *you* are familiar with, doesn’t mean it’s a thing anywhere else.
@srvanderplas @jonny Yes, they do, and they have to follow the terms of the GPL strictly. Which means documenting the date and nature of each change from the code they derived it from, and who made those changes. Something which they're not going to be able to do. In which case, any use of the LLM at all is infringing.
@dalias
@srvanderplas
This is true if you exist in the realm of "the law" like us mere mortals. However, when you are in the domain of "the entire machinery of capital seeking total, final enclosure of reality," a different set of rules seems to apply.

@srvanderplas @jonny

Ethically? Absolutely 100%

Legally? Well, you see, the tech CEOs are very good friends with all three branches of the US government, so not in the USA or Israel anyway.