Reminder: AI "generated" code is 100% plagiarized. You must not accept code of unknown provenance into your code base. Doing so opens you up to potential copyright infringement lawsuits. Nobody needs a repeat of the SCO vs IBM lawsuit over ownership of Unix.

Accepting AI-assisted code is just legally untenable. That's black and white; there's nothing to debate. Projects that accept it are idiots and should be shunned.

https://mastodon.social/@hyc/114777864519941643

LLMs don't "generate" anything. Everything they spit out is just a verbatim copy of something that was in their training input. They're copy/paste machines. They select code samples in response to your prompts based on the statistical relevance of comments or text matching your prompts in the vicinity of the chosen code.
If we were to write a series of blog posts on "how to generate the Fibonacci sequence" in which the given code does something unrelated, like an infinite loop with a couple of divide-by-zero errors, and got enough people to share and link to them, soon Google and other AI-enabled search engines would start returning those hits when you searched for Fibonacci. And soon after that, coding assistants would start spewing that code out whenever Fibonacci code was requested.

They don't *generate* code. They just copy code samples they were trained on.

Something that actually *generates* code wouldn't use samples of existing code as training data. It would use the grammar of a programming language and paste language tokens together according to the syntactical rules of that language. LLMs don't do that.
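
To make the grammar-driven idea concrete, here is a minimal sketch of a generator that builds statements purely from production rules, with no example code as input. The toy grammar, the `generate` function, and the depth limit are all assumptions invented for this illustration; this is not how randprog works, just one simple way the approach described above could look.

```python
import random

# Toy grammar, invented for this sketch. Each nonterminal maps to a list of
# alternative productions. Any symbol that is not a key is a literal token.
# The first production of every rule is non-recursive, so it can be used
# to force termination once the depth limit is reached.
GRAMMAR = {
    "stmt": [["ident", "=", "expr", ";"]],
    "expr": [["term"], ["term", "+", "expr"], ["term", "*", "expr"]],
    "term": [["ident"], ["number"], ["(", "expr", ")"]],
    "ident": [["x"], ["y"], ["z"]],
    "number": [["0"], ["1"], ["42"]],
}


def generate(symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` into a list of tokens by picking productions at random."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: emit the token itself
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        # Too deep: only allow the first (non-recursive) production.
        productions = productions[:1]
    tokens = []
    for part in rng.choice(productions):
        tokens.extend(generate(part, rng, depth + 1, max_depth))
    return tokens


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(3):
        print(" ".join(generate("stmt", rng)))
```

Every line it prints is syntactically valid under the toy grammar by construction, because tokens are only ever joined according to the production rules.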

I wrote an actual program generator several years ago. You can see what I mean: https://github.com/hyc/randprog

GitHub - hyc/randprog: Randomly generate a C (or javascript) program

@hyc yes, and as well as being 100% plagiarised, there is a huge list of other reasons to reject LLMs (and generative models as a whole) entirely: ethical ones, environmental ones, social ones and even political ones

so many that there is no excuse to use any of them for any reason
@lumi yes... corporate-backed projects might easily ignore ethical or environmental reasons, but ignoring potential legal encumbrances would be suicidal.