I have what may be a very ignorant question: if model-generated code may not be copyrighted due to a requirement of human authorship (current US Copyright Office policy), does it therefore follow that model-generated code may not be licensed under any terms whatsoever? Meaning anything from MIT to GPLv3?

I recognize no answers here would constitute legal advice, but I would love to hear from legal experts on this.

@mttaggart (IANAL) As I understand it, in the USA no rights may be asserted. It's effectively in the public domain immediately. And since the USA is a signatory to the Berne Convention, USA-produced AI code (or any AI 'art') can be used by anyone without restriction.

Sidebar: The same is true for US-derived satellite images, as they're mechanically generated without a human to produce them. Can't recall the workaround to that immediately, but there is one.

@drs1969 @mttaggart US government documents are not subject to copyright. They are by definition in the public domain.

@fhekland

Can you license something without owning the copyright?

No, you need to own the copyright or have permission from the copyright holder to license the work.

So if no copyright is possible, no license is possible?

@fhekland Hey @cwebber ☝️ this was really bothering me. If the current precedent stands, it's absolutely the case that no open source license is enforceable on generative code, as copyright is a prerequisite for any license.

I imagine there's a test of amount still, like if most of the code is human-authored, you could still claim copyright. But for example, the tool I just made with Claude Code as an experiment? Full public domain, no terms available to me.

@mttaggart @cwebber This is a really interesting question for all open-source projects. How does "plagiarised" code produced through an LLM compare to a forked repo? Is the LLM output OK as long as one respects the original licence and gives proper credit? How on earth is one supposed to deal with this? Run your code through some plagiarism checker first?
Interesting times we live in..

@mttaggart @fhekland @cwebber This is accurate, yes. Illicitly acquired code works the same way: you don’t hold the copyright, so you don’t have the ability to license it to others.

There is an open question of what happens when the LLM emits a verbatim chunk of code against that code’s license terms. For example, if you told an LLM to implement ZFS’ spa_activate, it’s extremely likely to emit verbatim chunks of CDDL code without the attribution required by the license. A tool can’t be liable for the infringement, but does the liability rest with the company which included CDDL code in the training corpus, or does it rest with the user who didn’t verify that the output doesn’t infringe preexisting copyright?

@bob_zim @mttaggart @fhekland @cwebber Just like with written text on a very obscure subject, the LLMs are liable to spit out the ONLY source for a very specific, narrow technical problem. I have played with this on ChatGPT and the number of times you end up with a mishmash of the two public examples of "how to code X" (which doesn't run) is extremely high, with the same variable names and the same commenting and all. The risk of 100% regurgitation (IMHO) is very high for things that have only been coded and exposed to the world once or twice in the corpus.

@ai6yr Yeah I've had Copilot give me my own Rust code for Windows exploits.

@mttaggart @ai6yr @bob_zim @fhekland @cwebber

wow. we've automated mansplaining... shall we call it slopsplaining?

@mttaggart @ai6yr @bob_zim @fhekland @cwebber

An expert's guide to Copilot:
1) Do you recognize this as your code?
Yes: go ahead and use it.
No: Don't use it, it has errors.
2) No more steps, we are done here.

@bob_zim @mttaggart @fhekland @cwebber
don't worry. If you work for a big corp that has well-heeled corporate lawyers, your employer will be fine.

@mttaggart @fhekland Well, that's right, you can't *license* it, but the public domain is compatible with nearly every FOSS license

The problem is, *not every place has the public domain* and *we don't know that AI generated output will be considered in the public domain everywhere*

This was the motivation that led to CC0, a public domain declaration with a permissive fallback license

We simply *don't know* yet what the legal status of AI-generated output is, sufficiently. If it were "public domain worldwide", you'd effectively be mixing its output with yours and contributing that, and it likely wouldn't be that big of a deal. For instance, it might just weaken some of the eligibility for coverage under copyleft, but not copyleft compatibility... same with proprietary licenses.

But we *don't know yet*!

@mttaggart @fhekland I read your article btw and thought it was great. I've been meaning to write a response!

@cwebber @fhekland Ah hey, thanks!

And yeah, this question was really to get at the "We don't know," related to your point a while ago about the danger of attempting to license generated code. Basically I wanted more citations on that claim, and it sure seems like the best case scenario is "We don't know," and the worst case scenario is "Almost certainly not licensable." Either way, definitely not safe for us in open source.

@mttaggart @fhekland Ah yeah. I also have been meaning to write a blogpost about the uncertain legal status of LLM based output. I really am worried it's much more uncertain than people are acting...

I think one thing that *is* positive is that I'm glad that the "hey look you can just clean room vibecode a replacement to any open source software" is being applied to leaked software from Anthropic. Now I hope someone does it with a leaked copy of Windows!

@mttaggart @fhekland To put the point there more directly, people feel like they can rewrite whatever from the commons because it's the commons, even though there are license terms attached to that. Well, does that work for proprietary software too?
@cwebber @mttaggart @fhekland Yes, it does. In fact, it may be easier for proprietary software: since the original software was not part of the AI training corpus, it's easier to prove it wasn't plagiarized. Since Anthropic has been bragging that Claude wrote Claude's code, Claude's code is not copyrightable. Now that it has been made public, it isn't a trade secret either. It is firmly in the public domain, at least in the U.S. Disclaimer: IANAL.

@mttaggart @cwebber @fhekland

Nothing is safe for anything or anybody!

And exactly WHO the heck is deciding to enforce WHAT these days, anyhow?

War crimes are fine, human trafficking is cool, but walking around in public not being white can get a person sent to some gulag-- and in the middle of this mess, megacorps making a complete mockery of fair use and whatever copyright shit we may or may not get hit with starts to feel like a circus act with spinning wheels and tossed knives...

@cwebber @mttaggart @fhekland

If you can keep the generated source code secret, and keeping it secret provides you economic value, you might be able to protect it as a trade secret.

https://my.eng.utah.edu/~cs5060/notes/tradesecrets.pdf

@mttaggart @fhekland then again, it was public knowledge for years that lots of Disney works were essentially uncopyrighted because they made mistakes in the copyright line according to the law of the time, but no lawyer was willing to argue that for fear of being stomped on by Disney. If there's a power imbalance not in your favour, don't expect copyright law to ever work for you
@mttaggart (Un)fortunately I'm no lawyer, but it seems odd to me if it was possible to licence something you don't hold the copyrights to.
Moreover, if it originates from an LLM whose training data is a diverse mix of different licenses, how close to the training data can the output be before it's considered derivative work? Seeing how some have gotten LLMs to reproduce books almost verbatim, it would be reasonable to think that this could happen to source code as well.

@mttaggart @fhekland more like: a license is a grant of rights by a copyright holder, but if you don't HAVE rights, there's nothing to be given and such a license declaration is legal nonsense.

It's somewhat akin to declaring an incompatible license that takes you out of the bounds of a dependency. You didn't relicense that dependency because they aren't your rights to grant. The main difference is that *no* such holder exists (leaving aside the "human authorship" matter which is going to be a mess).

(Oblig: IANAL)

@mttaggart That does not sound like an ignorant question. In fact, by asking a question with genuine interest, I think you are by definition not ignorant.
@mttaggart if you modify the code, you own copyright to your modifications and can license the combined package of generated code and your code.

"[H]uman authors should be able to claim copyright if they select, coordinate, and arrange AI-generated material in a creative way. This would provide protection for the output as a whole (although not the AI-generated material alone)."

From pdf page 32 of part 2 of this US report on copyright and AI:
https://www.copyright.gov/policy/artificial-intelligence/

@mikix That's super interesting. So a codebase that was generative, then heavily refactored by human hands is eligible. That makes sense! But anything that is mostly generative would still seem to lack copyright protections.
@mikix @mttaggart thanks so much for the citation! this has been my supposition but it's good to have something official-ish to point at.
@glyph @mikix Important to note that this document comprises mostly comments on policy, not policy itself. But the established norm of human rearrangement would stand.
@mttaggart @mikix indeed. on policy *itself* I think we are still kind of in the dark. but copyright law in general is far more of a mess than most engineers believe anyway

@glyph @mttaggart @mikix There was a recent case where a human wasn't able to get copyright on AI-generated artwork. We won't know until something actually goes before the Supreme Court.

https://www.reuters.com/legal/government/us-supreme-court-declines-hear-dispute-over-copyrights-ai-generated-material-2026-03-02/

@jameshubbard @mttaggart @mikix refusing to grant cert *is* a supreme court decision, though
@glyph @mttaggart @mikix yes, but in this case it was a piece of art, not code. If SCOTUS refuses to hear a case about code that goes similarly, we'll have an answer at least temporarily.
@mttaggart @glyph @mikix It may be useful to consider music licensing. Rearrangements, remixes, and records which sample other records are considered derivative works of the original record. Covers and parodies are considered unique records derived to varying extents from the original songwriting.
@bob_zim @mttaggart @glyph @mikix
I didn't pirate Windows, it's a parody!
@mttaggart there likely are still rights held on it by the humans whose work was used as "training material"

@mttaggart

> if model-generated code may not be copyrighted due to a requirement of human authorship ... does it therefore follow that model-generated code may not be licensed

Someone may *assert* a licence, but may lack a basis on which to enforce it. They would struggle to uphold a claim for infringement of copyright if there is no copyright to infringe.

If it is not a licence, but a contract, then it becomes more interesting.

@mttaggart This might be a case of "License what you want, but you won't be able to enforce it"
@mttaggart LegalBot will respond to you shortly