After my repeated posts / boosts arguing that in OSS we’ve overemphasized licenses and underemphasized community, governance, and sustainability…I actually have a license question:

What’s the current thinking on licenses that lay the legal groundwork for action against people using OSS source code for LLM training without seeking permission or offering compensation?

1/2

The obvious answer is copyleft-type licenses.

(1) Has anybody done legal analysis on that beyond the obvious? I don’t think LLM training on copyleft code has been tested in court yet…? (Even LLM training on more restrictively licensed works seems to be surviving court challenge….)

(2) Are there copyleft licenses (i.e. “derived works must be similarly licensed”) out there that don’t have the Stink of Stallman on them? Or is GPL v3 still just the way to go despite the smell?

2/2

@inthehands the CAL is OSI approved. Like the AGPL, it requires providing source to network users, but it also requires providing the data necessary for a user to recreate the environment locally. In other words, a host can't hold users' data hostage to prevent them from migrating to another instance.

It is, of course, not GPL compatible, but whether that matters depends on your use case.

https://opensource.org/license/CAL-1.0

@inthehands I stand corrected: it may be GPL compatible if you use the "combined work exception", I think. IANAL.