Dear @creativecommons,

I read your article about your initiative for new licenses for dataset holders in the AI industry.

Let’s be clear: I do not want to re-license my hundreds of CC-By comic pages to please AI giants.

I wish you would support CC artists suffering from massive plagiarism. You should enforce your own existing licenses against AI mass crawling. It seems you’ve joined the battle only after the casualties and still managed to side with the wrong people.

https://creativecommons.org/2025/06/25/introducing-cc-signals-a-new-social-contract-for-the-age-of-ai/

Introducing CC Signals: A New Social Contract for the Age of AI - Creative Commons
@davidrevoy @creativecommons What a shame, creative commons. What would Aaron think about this?
@davidrevoy @creativecommons whenever a company works with AI instead of rejecting it, an angel loses its wings
@lamb @davidrevoy @creativecommons There's an executive trying to find new ways to monetize amputated angel wings.
@davidrevoy @creativecommons I don't get what this is even supposed to accomplish. AI companies don't respect existing licenses or copyrights, so why would they respect usage preferences via these signals? There's no point to this approach when scrapers can just continue to hide behind the "fair use" argument and steal everything they want.

@drikanis @davidrevoy @creativecommons

Yeah, what we need is a way to prove that a model was trained on an author's work

@alxlg @drikanis @creativecommons https://haveibeentrained.com does a pretty good job of listing the images in the huge popular datasets. Got 11 pages for the "pepper carrot" keywords... 😔
@davidrevoy @alxlg @drikanis @creativecommons This exploitation of artists by AI corporations makes me so angry 🤬
@davidrevoy @creativecommons I might misunderstand this: isn't this more about creating a way to express an artist's terms for how their work may be used for AI? Meaning, just like I can select a CC license with -BY-, I should be able to select something like -NOAI- or only non-commercial AI... give more control and make it very clear what's allowed and what isn't? I would like that... making sure my work is open for humans and not for AI. I would also imagine such an AI-specific license would make legal battles a lot easier... just some thoughts.

@ms_zwiebel @creativecommons Unfortunately, there is no mention of a 'no AI' rule... quite the opposite, in fact. You can read the four drafts by following the links here: https://github.com/creativecommons/cc-signals?tab=readme-ov-file#cc-signals-1

It's just an attempt to reinvent CC-By, but written 'for AI' so it feels modern, instead of enforcing their own existing CC-By. There's also the promise of breadcrumbs for artists in the form of financial contributions, but it's the AI giants who decide, in good faith (lol), who gets how much. It cannot work.

GitHub - creativecommons/cc-signals: CC signals is a framework for a simple pact between those stewarding data, and those reusing it for AI development. CC signals provide a set of shared ground rules for an AI ecosystem that is mutually beneficial.
@davidrevoy @creativecommons Ahhh, I only read the post from them, not the drafts, sorry. If there are no special options to disallow or set rules for AI use, I agree... I will check out the drafts to get an idea...
@ms_zwiebel @creativecommons Yes, I'm curious to see how this draft will evolve. Hopefully they'll realise that a NOAI tag would be a welcome addition to the BY/SA/NC/ND family. But we can let them know this, can't we? 🙂

@davidrevoy @ms_zwiebel @creativecommons My understanding is `ai=n` would mean to disallow any use by AI, no?

That is, the IETF AI preference standard already gives you a way to make a blanket yes/no-AI statement. The CC signals thing then just adds fine-grained exceptions for cases where the use is contributed back.

That seems fine to me. I would still just set `ai=n`, but I can't see the harm in allowing people to opt-in to use under specific conditions.

@Merovius @davidrevoy @ms_zwiebel @creativecommons so I read all this in light of a previous commentary they wrote

They suggest that copyright cannot prevent "AI" training, because in the US it does (and, morally, should) fall under fair use. Suing over outputs & setting *norms* is the best approach:

https://creativecommons.org/2023/02/17/fair-use-training-generative-ai/

And I agree: a "copy right" you can reserve, called "others may run a computer algorithm on this work", is something I consider the end of our digital age and the copyright maximalist's wet dream

Fair Use: Training Generative AI - Creative Commons

@davidrevoy @ms_zwiebel @creativecommons

"Unfortunately, there is no mention of a 'no AI' rule..."

Well… there are such mentions, if I understood correctly, though the AI restrictions live not in the license/signal itself but in the headers (robots.txt and HTTP headers): ai=n and genai=n in Content-Usage:. But I agree that the headers have no legal effect on their own, so they can be ignored (as, in fact, they are right now).
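
For illustration, that kind of signal might look roughly like this in an HTTP response. This is only a sketch: the ai=n / genai=n tokens and the Content-Usage field are taken from the post above and from the in-progress IETF AI-preference drafts, so the exact names and syntax could still change; the drafts also describe a robots.txt variant not shown here.

```
HTTP/1.1 200 OK
Content-Type: image/png
Content-Usage: ai=n, genai=n
```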

@ms_zwiebel In addition to what @davidrevoy said, NOAI is a problem because it's an opt-out: Anyone not using it is presumed opted-in.

Accepting a NOAI tag means conceding that literally everything that *doesn't* have the tag is fair game for the scrapers. And it's not fair to artists, especially ones with huge bodies of work, to make them go and tag each and every work with a new NO[whatever] every time a new exploitative technology comes out.

@ms_zwiebel @davidrevoy To put it a bit more pithily: It's still assault even if they weren't wearing a "don't punch me" tag.

@Linebyline @ms_zwiebel @davidrevoy There is a flaw in your reasoning. Not opting out does not mean you opted in. That would be very weird.

Normally opt-out is really shitty, since companies love to assume you consented to something. It's bad practice in software in general, and it's also why, for example, the EU stepped in with the GDPR: to make clear that assuming consent is not good enough or allowed. (Within reason; there is a lot to the GDPR, so please excuse the gross simplification here.)

Copyright works on a different basis. It is opt-out by default, but here it works in favor of creatives: it is generally assumed in copyright that, unless stated otherwise, you keep all the rights to your creation to yourself, and no one is allowed to use it in any capacity without your explicit consent (the famous "all rights reserved").

Now, copyright is a highly legislated field, but I am pretty sure this general assumption is still the norm to this day.

There is obvs. the big elephant in the room here called "fair use". It's a whole can of worms and very complicated, tbh. The general terms would be this (in the US):

“the fair use of a copyrighted work, […] for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright.”

But whether AI use meets any of these criteria is a legal question. AI use will involve IP law as well.

So no, I don't think the addition of a "No AI" tag would make it a free-for-all to use for AI any images that don't have the tag. Copyright itself is so much opt-out that it's the reason why Creative Commons even exists.

Without CC, I would need to assume that every image I see on the internet is "all rights reserved", and if I can't claim fair use I would need to message the artist and ask for permission first. That gets tiring fast.

Creative Commons is a legal framework through which artists can license a broad set of conditions to everyone via a common license.

So generally I think it is a good idea that CC adds more signals to its system, just so it's easier for artists to express their will in a legally binding and legally vetted way. There should be discussion about what that entails.

Also, there is the technical aspect: how do I tell AI this? It's not reading the webpage like a human would, looking for license info. So you would need to find a way to tag the image with, say, CC-NOAI in a machine-readable form, and embed it so tightly that it cannot be stripped from the image without making the image useless. A scraper would then still scrape all images but toss out those that carry such a tag. But that assumes no ill intent on the side of the AI's creators, which we have been shown we cannot trust. So how do you enforce this?
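
A rough sketch of what that "scrape everything, then toss out tagged items" filter could look like, assuming a cooperative scraper written in Python with the requests library and the Content-Usage: ai=n header discussed earlier in the thread (all names here are illustrative, not taken from the CC drafts):

```python
import requests

def may_ingest(url: str) -> bool:
    """Return False when the server signals 'no AI use' for this resource.
    Hypothetical sketch: the Content-Usage header and its ai=n / genai=n
    tokens come from the thread above, not from a finished standard."""
    try:
        resp = requests.head(url, allow_redirects=True, timeout=10)
    except requests.RequestException:
        return False  # be conservative if the preference cannot be read
    usage = resp.headers.get("Content-Usage", "")
    tokens = {t.strip() for t in usage.split(",") if t.strip()}
    return not ({"ai=n", "genai=n"} & tokens)

# Example: filter a list of candidate image URLs before ingestion.
candidates = ["https://example.org/comic-page-01.png"]  # placeholder URL
allowed = [u for u in candidates if may_ingest(u)]
print(allowed)
```

Of course this only works if the scraper chooses to run it, which is exactly the enforcement problem described above.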

CC cannot really help you out here, since it's the individual's job to protect their copyright in court. CC could maybe only collect such cases and build some sort of combined legal case…

@Linebyline @davidrevoy @ms_zwiebel Now, what I cannot tell you is whether such an explicit ban on the use of AI is stronger than fair use.

I would think not, since not even the most restrictive default state of "all rights reserved" can do that.

I might be wrong here; I only do tax law for a living, not IP law xD. So you are welcome to educate me.

@stefan @davidrevoy @Linebyline @ms_zwiebel "How do I tell AI this. Its not reading the webpage like a human would looking for license info."

Okay, now can we take a step back and look at what happened here? This is a completely backwards view of responsibility and a massive power shift. Why should the artist be responsible for making AI behave according to the law? If the AI is not reading the license, it has to be illegal, period. It cannot legally use any material whose license it can't determine.

This is the fight that artists and copyright holders need to fight. If AI training is deemed fair use, then all licensing ceases to matter. If it exists, it can and will be scraped. If you don't make it your hill to die on that AI is responsible for respecting licenses, then the battle is lost.

@stefan @davidrevoy @Linebyline @ms_zwiebel This is a matter of legal principle too. The same thing cannot be illegal if I do it once but legal if I do it a billion times with a computer.
@skaphle (I will just untag some people here if that's okay. Not really wanting to spam their mentions if not needed.)

I am with you that just taking images from somewhere and putting them into a dataset is (in most cases) illegal, as I outlined with the general opt-out nature of copyright. That is, until a court decides otherwise.

Enforcing it is obviously the problem. Disney and Universal are fighting OpenAI in court over this issue right now. It will be an interesting case to follow.

Scraping in and of itself can still be fair use, btw. It's more a matter of how you use the data. Doing it for the research of AI in a scientific setting might be deemed legal (i.e. because scientific research is broadly considered fair use).

If OpenAI can clear that bar, I dunno. But likely not, since they are using the material commercially, not just to research the behaviour of AI and stuff. But that's ultimately the job of the courts to decide.

As I said as well, I dunno whether an explicit prohibition in a license can beat fair use or not. If it can, the NoAI tag might prove useful, as it makes for a clear rule instead of fighting over fair use case by case. Making that tag machine-readable / embedded also makes sure that, as an AI company, you cannot claim you did not know.

But as I said, I dunno if that's correct or not. If fair use beats that and OpenAI somehow gets fair use through, we are fucked anyway.

@skaphle Right!? Of course, we've been over this already with patent law: doing something that's obviously unpatentable with a computer doesn't suddenly make it patentable. And yet, there are so many patents for obviously unpatentable things just because they use a computer.

Sadly, the way the whole system works, in both cases, depends on someone being able to sue. So in practice whoever has the most money to spend on lawyers is usually "right." :P

@stefan Not sure what you mean. Maybe the opt-in/opt-out language is tripping us up, so let's call it default-allow/default-deny.

Under a theoretical US copyright law that wasn't weighted in favor of whoever had the most expensive lawyers, copyright is a default-deny system. Copyright reserves certain rights to (usually, initially) creators, and those rights are *denied* to everyone else unless the copyright holder explicitly *allows* it. (1/2)

@stefan NOAI flips that on its head. If we accept that AI scrapers can use anything that doesn't have NOAI, that means copyright is now default-allow for any use that has a NO[whatever] tag available.

And the very existence of NOAI implies that. If I use NOAI only for new work, scrapers can argue that's implicit permission for the rest, because why would I say no to some if I mean no to all? And if it becomes a standard, they can argue that anyone not using the standard denial implicitly wants to allow. (2/2)

@Linebyline That's not how copyright works (part two of the post). As you say, it is default-deny at its core. This does not change with a tag that explicitly states NO[whatever]. You ALWAYS retain the rights to your work UNLESS you explicitly allow usage. That's ultimately the spirit of the law.

The addition of a line that says "No one can use this for anything, but fuck AI in general" does not change that, nor does it make anything default-allow. You cannot assume that in copyright. Never. When you are in doubt over whether you can use something or not, the answer will lean towards "no". Everything else will end very badly for you.

Technically even Public Domain is only a license and you retain some form of copyright; you just grant a very, very permissive license.

You actually got it right with the first part.

I know I am generalizing here, but it's the spirit of the law, pretty certainly. I did not get into exceptions like private copies or fair use here.

@stefan Public Domain is not a license. It is the absence of copyright. Were you thinking of CC0?

NOAI is not "fuck AI in general," it's "I am explicitly not giving AI permission to use this work."

You're ignoring the concept of implied licenses. The specifics vary by jurisdiction even within the US, but if AI bros can convince a court that my conduct (e.g. marking some other works NOAI but not *this* one) implies a license, I might not be protected. (3/2)

@stefan Moreover, if NOAI becomes a de facto industry standard and I never use it, the AI companies could easily persuade a judge that by participating in a space where AI scraping permission is normally expressly denied, and not denying it myself, I am implicitly granting that permission.

This isn't about what the statute says. It's about precedent. (As you know, statutes don't mean what they say. They mean what the courts *say* they mean.) (4/2)

@Linebyline Tbh it still seems hella backwards to me. But US copyright is broken to hell and back anyway. Implying that the copyright of a work is anything other than default-deny would be a big no-no; companies like Disney would fight tooth and nail against such a ruling, pretty sure.

Good thing I am not in the US but in the EU. It's at least a bit less shitty here.

But anyway, I wish you a nice day. I will step back from this discussion as I have nothing more to say.

@stefan Agreed, I think if we go further we're just going to be shouting past each other.

Just, do me a favor: Pester your politicians to keep an eye on us and not repeat our mistakes.

Have a good one!

@Linebyline @ms_zwiebel @davidrevoy especially since they will ignore it and simply scrape the work anyway.
@Linebyline @ms_zwiebel I agree completely. The ideal would be to not have to opt out at all. If CC had a spine, it would be built into every CC license by default.
@davidrevoy @Linebyline @ms_zwiebel Copyright law as it currently exists already forbids AI scraping and training by default, until/unless a big enough lawsuit changes that. It shouldn't have to be built into each license, because thanks to copyright it should already be the default regardless of the CC license choice. If CC were to go forward with any proposal that makes AI scraping an opt-out decision rather than opt-in, then Creative Commons licenses would all suddenly completely undermine a useful part of copyright.

I never thought that I'd be speaking favorably about any part of current copyright law, but here we are.
@Linebyline @davidrevoy Uhhhhhh, I did not think about this... so we need an opt-in AI tag instead.
@ms_zwiebel @davidrevoy Exactly! The point of a license is to show what you *are* willing to share, with all other rights being reserved.
@ms_zwiebel @davidrevoy @creativecommons unless the concept of NOAI is the default, and the crawlers respect it (a very big *if*) this is just appeasement

@Offbeatmammal @ms_zwiebel @davidrevoy @creativecommons

It also requires a broad audit law for data sets that publishes the identity of all sources. If there's no explicit licence, then the owners must be paid for the existing use, and the owner can require the removal of that source.

@Offbeatmammal @ms_zwiebel @davidrevoy @creativecommons Crawlers respect nothing. The only thing we can do is honeypot-bomb them.

@davidrevoy

I'd love to pick your brains about this particular news. From what I can read, it looks so vague that it's not so much introducing something as inducing anxiety.

I have no reaction because it doesn't appear to have anything in it. Perhaps out of sheer cowardice. It's hard to gauge anything other than a will to "do something", or at least to be seen to be doing something.

Interested?

@doctormo Sure! But I think the four links on their GitHub https://github.com/creativecommons/cc-signals?tab=readme-ov-file#cc-signals-1 with a mini draft 'key idea' for each of the four new licenses already say a lot about the intent, the philosophy behind it:
- Tailoring an attribution licence for AI giants, in case they change their minds and decide to respect something.
- Believing in a royalties breadcrumb system, whereby the AI lord decides in good faith (lol, their words) who receives the financial micro-percentage income from reuse. Yay.

@davidrevoy

Heh, I notice one of the signals isn't "fuck off", but I guess that's just something we'll have to win in the courts.

Looks like they've come at this from the mindset of the maximally restrictive copyright regime, where property rights had made sharing problematic, rather than the current one, where the law is allowing unconsented abuse of works. Maybe they're hedging that copyright will be tightened.

Though considering the gullibility of politicians with AI, I just don't see that yet.

@davidrevoy you can use this NOAI license until someone makes a better one 😇

https://eobet.com/fun/noai/

BY-NOAI-NC-NFC-NM-SA

@eobet @davidrevoy

"NonFuckingCorporate means that Licensed Material is made for people and not greedy corporations. If You work for a large, international corporation, make Your own shit instead as the work You do only serves to reduce the commons.
NonMilitary means if You are in any way associated with state organized violence, make your own shit instead as You are most likely only protecting capital and furthering fascist or colonialist agendas, which only serves to reduce the commons."

🔥

@davidrevoy I would be happy if they just said: there were always clear rules. cc by can be used by machines, but then every single creator whose creations got used must be attributed properly.

Most of my creations are under cc by-sa. If a machine builds something on top of my creations, the rule *should* be clear: keep the license. If it reads cc by-sa, it must be cc by-sa (or a compatible license like GPLv3).
1/2
@creativecommons

@davidrevoy
Alternatively, let's apply the same to humans: allow me to build on Star Wars for my own art, and then I'll gladly allow machines to do something similar with my works.

But doing that requires new international "Intellectual Property" treaties.

2/2
@creativecommons

@ArneBab @creativecommons Very true. I wish they could help creators and huge CC content providers enforce CC licensing in the era of AI. That's all. CC-BY-SA and Wikipedia are good examples of this: CC should fight to make all prompt output CC-BY-SA if the model used Wikipedia.

Also, what I wish for is a clear NOAI tag that artists could use (e.g. CC-BY-NOAI).

This would complement the BY/SA/NC/ND toolset and make it easier to enforce the licence if the artwork were to be found in any dataset.

@davidrevoy @ArneBab @creativecommons I guess NC and ND already forbid most of the AI use. They just don’t care about it…

@breizh you don’t need NC for that.

AIs don't attribute properly.

The only CC license that is lax enough that creations it applies to can be used without attribution is CC0.

But the trick around that is to say that model training does not actually re-use works, so copyright does not apply.

@davidrevoy @creativecommons

@davidrevoy @ArneBab @creativecommons
At first I thought that NOAI felt wrong as an opt-out. But of course the CC licenses are based on free use, and any additional tag is a restriction of these freedoms.
But I'll have to look into exactly how BY is defined - whether it might be enough for the tech bros to link to a gigantic list of all CC artists.

Of course, the main problem for really effective protection is always proving, in any individual case, that something has actually been used, anyway...

@gorobar The NOAI tag can still be useful -- because EU rules specify that artists can opt out from model training done for commercial purposes. This would be that opt-out.

That’s via article 3 and 4 of the European Copyright Directive, see for example https://copyrightblog.kluweriplaw.com/2019/07/24/the-new-copyright-directive-text-and-data-mining-articles-3-and-4/

By going commercial, OpenAI bound itself to that limitation.

@davidrevoy @creativecommons

The New Copyright Directive: Text and Data Mining (Articles 3 and 4) - Kluwer Copyright Blog

@ArneBab @gorobar @davidrevoy @creativecommons "In other words, Art. 4 right holders may effectively prohibit text and data mining for commercial uses by adding robot.txt type metadata to their content online."

We know that robots.txt is currently ignored all the time because AI companies canceled that social contract

@skaphle It is ignored -- but that makes them liable for copyright infringement in the EU because they cannot claim the copyright exception from the directive.

So this isn't a problem of laws, but a problem of legal enforcement.

Someone has to sue the biggest players in a 400 billion venture capital dollar / year industry. And win …
@gorobar @davidrevoy @creativecommons

@davidrevoy @ArneBab @creativecommons While you're at it, make the attribution limited to only those sources used to produce a given output, not all the sources used to train it. That is pretty much impossible to do, of course, so should practically kill the whole thing off.

@StarkRG they *could* retain which input files affected which neurons and then reconstruct the main inputs by matching outputs against the activation patterns of those neurons.

There’s already research in that direction.
@davidrevoy @creativecommons
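
A toy sketch of the matching step that idea implies, assuming per-work activation patterns were retained during training (everything here is made up for illustration: random vectors stand in for real activations, and serious training-data attribution research goes far beyond a cosine-similarity lookup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend activation patterns for five training works at one 8-unit layer,
# recorded (hypothetically) while the model was trained.
training_ids = ["work_a", "work_b", "work_c", "work_d", "work_e"]
training_activations = rng.normal(size=(5, 8))

def top_influences(output_activation, k=2):
    """Rank stored training works by cosine similarity between their
    recorded activation patterns and the pattern of a generated output."""
    a = training_activations / np.linalg.norm(training_activations, axis=1, keepdims=True)
    b = output_activation / np.linalg.norm(output_activation)
    scores = a @ b
    best = np.argsort(scores)[::-1][:k]
    return [(training_ids[i], float(scores[i])) for i in best]

# Activation pattern of some generated output (random here).
print(top_influences(rng.normal(size=8)))
```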

@ArneBab @davidrevoy Well, that's certainly interesting. Seems like a whole lot more data to store and retrieve, though. Unfortunately, I suspect that extra cost probably wouldn't be enough to kill the industry.
@davidrevoy @ArneBab While that sounds reasonable at first glance, it is questionable whether the licenses as they stand are even able to exclude such uses of CC-licensed works under current copyright legislation. CC has said as much a number of times in posts spanning the last four years.
Any "NOAI" license would furthermore apply not only to TDM for genAI training, but probably also to mechanisms that are meant to build, e.g., knowledge graphs based on the source material.
@davidrevoy @ArneBab @creativecommons Isn't output from AI actually public domain? AFAIK there's no copyright on machine output, so no CC license can be enforced in this sense (SA).