Honestly, the thing that will probably kill LLMs the hardest is someone writing a small language model that runs in a browser, in plain JavaScript, and hits comparable benchmarks.

Why bother with all those GPUs and energy usage if your Raspberry Pi could get comparable results?

Is this possible? I dunno. I'm not specialized in this.

But if I wanted to fuck the GenAI bubble over and had the relevant background experience? This is what I'd explore.

@soatok If you want it just to be able to use language, sure. But they want a vastly overfitted model that lossily compresses the volume of human writing and can spit back out obfuscated plagiarism of arbitrary parts.

@dalias One model per language.

Want it to generate C? Download the C model.

Want it to write bad poetry? Download the Vogon I mean English model.

@soatok Right but that's not all they want. They want it to generate obfuscated plagiarism of poetry. They want it to generate "copyright-free" copies of arbitrary FOSS programs, songs, etc. This inherently requires the largeness of the model because the plagiarism is buried in the overfitting.
@lritter @soatok "They" being the AI enthusiasts or people who feel like they're getting something of value from "AI". They almost surely don't frame what they want to themselves or others in terms of the plagiarism, but it'd be useless to them without that outcome.

@dalias @soatok i don't mean to be the bearer of bad news, but if these systems only plagiarized, it would certainly be easier to dismiss em. they also inter- and extrapolate, which, if a human had done it, would be counted as original work. of course that's not all that creative work is, and so it's quite limited.

overfitting is undesirable, because it turns a fuzzy database into a regular database: then it's not plagiarism, it's a straight-up copyright violation.
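A toy way to see the "fuzzy database vs. regular database" distinction: a character-level n-gram model (my illustration, nothing to do with how transformer LLMs actually work, and the corpus string is made up). When the context window exceeds the longest repeated substring in the training text, every context is unique, so "generation" collapses into replaying the corpus verbatim; with a short context, ambiguous continuations force the model to recombine fragments instead.

```python
import random
from collections import defaultdict

def train_ngram(text, n):
    # Character-level n-gram table: each n-char context maps to the
    # list of characters observed right after it in the training text.
    table = defaultdict(list)
    for i in range(len(text) - n):
        table[text[i:i + n]].append(text[i + n])
    return table

def generate(table, n, seed, max_len, rng=None):
    # Extend the seed one character at a time by looking up the
    # last n characters in the table; stop on an unseen context.
    rng = rng or random.Random(0)
    out = seed
    while len(out) < max_len:
        choices = table.get(out[-n:])
        if not choices:
            break
        out += rng.choice(choices)
    return out

corpus = "the quick brown fox jumps over the lazy dog and the quick brown cat naps"

# "Overfit": n = 17 exceeds the longest repeated substring in the
# corpus ("the quick brown ", 16 chars), so every context occurs
# exactly once and generation is deterministic verbatim recall.
overfit = generate(train_ngram(corpus, 17), 17, corpus[:17], len(corpus))
print(overfit == corpus)  # True: the "model" is just a copy of the data

# Short context: n = 3 leaves ambiguous contexts ("he " is followed
# by both 'q' and 'l'), so output recombines fragments instead.
blend = generate(train_ngram(corpus, 3), 3, corpus[:3], len(corpus))
print(blend)
```

The overfit case is the "regular database": the table plus a seed is just a lossless copy of the training data, which is where the copyright-violation framing comes from.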

@lritter @soatok Um, no, you're not being the "bearer of bad news". AI propagandism isn't news. An interpolation of existing works that covers up that it's essentially the same as someone else's work is plagiarism if a human does it too. This is why the early architects of free software always insisted on clean-room reimplementations based on a specification someone else had worked out, not reading reverse-engineered or leaked proprietary code and then pretending they could forget it and write something equivalent.
@dalias @soatok i see you have your mind made up. but what i said isn't AI propagandism - just facts. i find the implication insulting, tbh. but you're in war mode, i get it. happy hunting.