Gelezen: Seveneves #blog
| technology | https://www.colada.be |
| personal | https://www.netsensei.be |
| technology | https://www.colada.be |
| personal | https://www.netsensei.be |
Gelezen: Seveneves #blog
That still leaves me with a problem, though. These models just extract meaningful keywords within the context of a single, isolated post. Whereas tagging really becomes a thing if it can extrapolate meaning across posts.
E.g. I have multiple posts regarding Apple products. Not all of them mention the same exact keywords, but the reader can deduce that they belong to the same group. LLM's miss that completely. So, that would be a logical next step to explore.
Next up, I'm trying LM Studio with a bigger model. The good is that I can do away with most of the tuning code. The downside: processing is slower, and I needed some tuning to make it even slower lest I'm planning to fry an egg on the keyboard.
I tried mistral-7b-instruct-v0.3 first. It yielded great results from the get-go but this is general model. Subjectively, I felt I could do better.
Right now, I'm running aya-23-8b by Cohere Labs which hits the right spot so far. https://huggingface.co/CohereLabs/aya-23-8B
First up, I tried NLTK + KeyBERT. This is a Dutch blog with many flemish idioms. Processing was really fast on my Macbook Pro M3, but I had to do a lot of tuning with stopwords, geonames and spaCy. I've even tried BERTje which is a Dutch pretrained model from the University of Groningen. The end result per iteration kept being below par though.
Lots of cruft, adverbs, interjections, adjectives. Or verbs wrongly classified as nouns and vice versa. The tweaking quickly devolved into spaghetti.
So, I've established that local LLM's as coding agents aren't ready for prime time... But I do have some nice local problems I can still solve.
My personal blog has some 2.464 posts spanning 21 years. I used WordPress for many years before migrating to Hugo. In that transition I ditched the mess of tags.
So, I've spent my Saturday toying and experimenting with local LLM's to generate tags back to my posts.
I took local LLM's for a test drive on my MacBook Pro M3 (18GB RAM). I tried Deepseek and Qwen. I experimented with them in chat and agent mode for a few hours through LM Studio with the Continue plugin for VS Code.
Conclusion: promising, but far from ready for prime time. Some hallucinating, difficult to tweak, limited capabilities. A far cry from Sonnet or Opus. It's poignantly clear why there's a race for chips and hardware.
Note that last paragraph:
> Value will accrue to platforms that can orchestrate workloads across a diverse portfolio of models. Routine, high-frequency tasks must be routed to more efficient small and domain-specific language models, which perform better than generic solutions at a fraction of the cost when aligned to specialized workflows. Expensive inference of frontier-level models must be heavily gated and reserved exclusively for high-margin, complex reasoning tasks.
That's the crux!
Okay. 30 days in and LLM's have burrowed themselves throughout my workflows. I'm sure many are swinging between the ethical considerations and the practical boon to one's productivity.
That said, these are expensive stochastic parrots that give Moore's Law a new lease on life right now. Apparently, inference eats about 80% of the costs to run an AI operation. No wonder chips are a prized commodity.
Then there's Gartner predicting a 90% cost decrease by 2030...
https://www.gartner.com/en/newsroom/press-releases/2026-03-25-gartner-predicts-that-by-2030-performing-inference-on-an-llm-with-1-trillion-parameters-will-cost-genai-providers-over-90-percent-less-than-in-2025
How I do DevOps & IaC.
