Honestly, the thing most likely to kill LLMs is someone writing a small language model that runs in JavaScript in a browser and hits comparable benchmarks.

Why bother with all those GPUs and all that energy if your Raspberry Pi could get comparable results?

Is this possible? I dunno. I'm not specialized in this.

But if I wanted to fuck the GenAI bubble over and had the relevant background experience? This is what I'd explore.

@soatok If you want it just to be able to use language, sure. But they want a vastly overfitted model that lossily compresses the volume of human writing and can spit back out obfuscated plagiarism of arbitrary parts.

@dalias One model per language.

Want it to generate C? Download the C model.

Want it to write bad poetry? Download the Vogon, I mean English, model.

@soatok Right but that's not all they want. They want it to generate obfuscated plagiarism of poetry. They want it to generate "copyright-free" copies of arbitrary FOSS programs, songs, etc. This inherently requires the largeness of the model because the plagiarism is buried in the overfitting.
@soatok If you had to give it the things you wanted copied as explicit input, the plagiarism and copyright infringement would be obvious to users and courts. Making it gigantic ambient state obfuscated in the model is how they get away with it.

@dalias @soatok we agree that this is a thing these companies want, in the present day, now that they've seen the potential for theft-at-scale

we don't think it's the line of reasoning that brought us here

@dalias @soatok we think the original motivation was the usual large-company thing of starting from the conclusion they want to be true, then pretending it is.

it would have gone like this: for large companies to dominate this market, there has to be something they can do that small companies can't. what is that? spend more money on training it.

@dalias @soatok our main reason for thinking about this is that our friends at DAIR who were part of Google's ML Fairness team have spoken publicly about the company's (lack of) reasoning for increasing model sizes