Further, there are several large open questions about copyright in relation to agents. Is their training data truly unencumbered by copyright and licensing restrictions? Is their output copyrightable? Does even a "trivial" contribution by an agent invalidate the copyright of a larger code contribution that contains it? Is the user of a coding agent, or the user of the software generated by that agent, indemnified if the resulting code is later found to infringe a copyright or violate a license? Can an agent truly be said to produce a "clean room implementation" of something when there is a non-zero chance that its training data contained the very thing being reimplemented, and no way to verify whether it did?
So in general I'm against coding agents on moral grounds, and on legal grounds as well, because I think any risk at all is too much risk. On the other hand, I'm intrigued by the question of "trivial" contributions, and I suspect that even projects that forbid assistance from AI coding agents may have unwittingly accepted code containing such "trivial" contributions. My questions are:
1. Is it possible for an AI-assisted code contribution to be "trivial" enough that it presents no legal risk, either now or in the future?
2. If so, how would you go about determining what's "trivial" and what's "significant"?
3. How could a contributor not merely self-certify, but present verifiable evidence that a code contribution was legal and non-infringing, and that any contribution from an agent met the "trivial" standard?
4. How could a company or open source project protect itself against a dishonest or bad-faith actor who contributes code that is later found to infringe a copyright or violate a license?
5. Who's going to pay for the damage if the worst-case scenario comes to pass?
I don't have answers, but I suspect that the question of what constitutes a "trivial" contribution is going to matter a lot in the future.
