Honestly, the thing that will probably kill LLMs the hardest is someone writing a small language model that fits in JavaScript in a browser and hits comparable benchmarks.

Why bother with all those GPUs and energy usage if your Raspberry Pi could get comparable results?

@soatok apparently you can even run an LLM in a font: fuglede.github.io/llama.ttf/

(requires HarfBuzz+WASM; haven't tried it myself)
llama.ttf

llama.ttf is a font file which is also a large language model and an inference engine for that model.