I suppose we're not supposed to ask where the information in these proprietary machine learning models came from.

https://cloud.google.com/blog/topics/threat-intelligence/distillation-experimentation-integration-ai-adversarial-use

@mhoye > IP theft

"We're all trying to find the guy who did this"

It's increasingly hard to come up with satire for this fucking industry

@mhoye crazy to read that such a thing classifies as “intellectual property theft.”
@mhoye oh no, IP theft! 😂
@mhoye I’m thinking about the proverbial parricide who pleads for mercy as an orphan.

@mhoye

Let me explain.

There’s nothing to explain. You’re trying to kidnap what I’ve rightfully stolen.

@mhoye "IP theft" 🤣 🤣 🤣 🤣 🤣 🤣 🤣
@mhoye here's our chatbot, ask it anything!
no, don't ask that kind of question! you're committing IP theft!

@mhoye they’re giving the game away early. their desired state of things is that all of the open data sources used to train their LLMs have been scraped to death, polluted with LLM slop to the point of uselessness, or bought and made proprietary. only their degraded derivative will remain.

this is also one of the mechanisms that will be used to ensure that supposedly open source LLMs will not be a threat. all of them are derived from proprietary models, even the ones that claim not to be.

@mhoye (this isn’t to imply that open source models would otherwise be good; they’re even more useless than the proprietary ones, which is a bar so low it’s on the floor. but in addition to all the other problems, LLMs are fundamentally not something you can train without a big corporation’s ability to steal data en masse and get away with it. it’s a system designed to privilege gigantic corporations at all levels.)
@mhoye man fuck these ghouls for reclassifying a training technique as an attack, don't let them turn this into a cybersecurity "issue" while they scrape and steal the entire web in the background

I don't want anyone fighting to protect them from dumb shit like this, collectively the infosec professionals have more important stuff to do than help stop AI models training from each other

Ask them about all that research on model collapse and why it suddenly doesn't apply anymore in this case instead, they'll have to explain that they're lying about one or the other lol

@mhoye You know, I have mixed opinions on the ethics of training LLMs and image synthesis models.

I have zero mixed opinions about doing anything you can at the IO of the model to extract the trained data. Absolutely none. No compunctions whatsoever.

@mhoye
honestly not sure i buy "intellectual property theft" in this context.
@mhoye I'm unsure if this is an "It's mine, I've stolen it fair and square" situation or rather a "I thought there was honor among thieves" kind of one.
@gabrielesvelto I'm inclined towards "there are no referees in a knife fight."
@mhoye Or, rather, MEA represents a way to uncover what was stolen to create a model -- this activity effectively represents a way to catch thieves.
@Retreival9096 @mhoye the interesting thing is that the collection of data can be IP even when none of the contents are their own copyright.
@Retreival9096 @mhoye not that IP shouldn't be abolished, but I'm in the property-itself-should-be-abolished camp. If anything should be decided between, if rationality may be relieved of pretense, property is circular logic. Childish, even. But I do understand it. Divisions for safety. Your side of the bedroom and mine.

We can't always be expected to get along, and if we have to fight over something, it might as well be about where we have the right not to be fought. The word "mine" is simple enough to be quickly apprehended, if backed by laws enforced by the most powerful among us, their absurd hoard and inherent disregard playing a role in the everyday life when we need a fence to be respected. Reference their guns, and I don't need guns to have peace.

Hope for more would be idealistic. Let's not be silly.

Let's also hope they don't abuse their guns, since we don't have any. Oops that happened, but let's really hope they don't do worse. Well if they do worse, at least they do it way over there. Let's not think about how much of our 8 billion lives is even news, vs how many guns are pointed at how many heads. Let's at least try not to make the news, ourselves. Oh maybe we don't even need the news. Look at this guy on social media ranting about property.
@mhoye after a few recursions of theft of robbed things it is just "common knowledge" according to Anthropic