@simon
Pretty much all of that does not hold true in my opinion:
DeepSeek v3, still one of the strongest models - trained for just $6m, way less than the biggest USA models
That's money, not power or resources. Any monetary cost claimed by a Chinese company can't be compared to costs in actual free countries, as a much larger part of that cost is offset by reduced environmental and worker protection (and partial slavery), so by totalitarianism in general. Also, they allegedly based their work on OpenAI's, so you might have to add those costs on top too...
Does not require expensive rare materials to run: can't help with that if it rules out laptops
It's about the specialized hardware they require for training: while the models might run on a "normal" CPU nowadays, they cannot be trained on a cheap phone or laptop. A normal program can be created there with no problem.
Can actually run locally in a private environment: yes! We have that now. The models I can run on my laptop got really good starting from about six months ago
They really aren't. They are slow and can't handle actual reasoning or even remembering things like the "big" models can. (And of course they are anything but intelligent.)
But it's all just a word prediction system anyway. I guess it now just predicts more words, at a cost basically linear in how many words you input and want predicted, so with the current approach a local machine will always be behind what one can do in a huge data center. That's why I want a different approach that isn't just LLM- and inference-based.
Is free (as in libre) to be used by everyone: we are there too. The Qwen models are under an Apache 2 open source license. Plenty of other good models are "open weights" which is almost good enough to allow "free to be used by everyone"
All they release is the finished model (and in the case of Qwen, their weights), which of course is nice, but it does not allow reproducing or even forking their work. As long as they do not release the code that made the model and the training data under a free license, it cannot be considered free, IMO.
Them licensing it under any kind of free software license might actually not be valid, as it's not based on work that was available under a free license. I would even go as far as to say that most models are released illegally, as they are derivatives of copyrighted works. Sticking a free software license on them does not magically save them from the copyright the material they used is under.
A good comparison in the classical software development world is the CraftBukkit project, which used Mojang's copyrighted Minecraft code and got taken down not by Mojang/Microsoft but by a contributor, because their approach violated the GPLv3 license. Most "open" models run into the same issue.