today's (unreleased, likely ~spring 2024) legal opinion article: AI models are likely uncopyrightable (no human hand) and too easy to reverse-engineer for trade secrets to work. So the VC billions are going into stuff they can't own. Which is the funniest possible result.
@davidgerard Sorry, I think you're confusing the downloadable models, like those on HuggingFace, where your argument has some merit, with the proprietary, non-downloadable models like GPT-4. Good luck trying to reverse-engineer GPT-4 by typing in prompts and recording responses. The models with real intellectual-property value don't depend on copyright law to retain their value; they depend on secrecy and security, like the recipe for Coke.
@JonC the dude writing this is pretty sure it's laughably doable, and the closest they have to a moat is access to lots of Azure GPUs
@davidgerard @JonC I was wondering this as well, since one of the characteristics of these bots is that their path from A to Z is so inscrutable that even the creator usually doesn't understand how it got there (without painstaking research). But I suppose that's not the coded part (which is why it's inscrutable), and the code holding it together is maybe not so complicated.
@JonC @davidgerard Look for papers on “model leeching”. Copying a model by prompting it millions of times is a thing. It’s a fun question: if two models return the same answer for every prompt (or 90-plus percent of them), are they copies of each other? Even if their internal weights and vocabs are totally different?
@gclef @davidgerard There is a big difference between using a set of prompts and responses (à la https://arxiv.org/pdf/2309.10544.pdf) to train a student model and having a reverse-engineered copy of a model like GPT-4.