Going down the rabbit hole of testing local LLMs right now. I don't have a dedicated GPU, but 32 GiB of RAM should be enough for anyone.

#ai #huggingface #selfhost #localai #ollama #heretic #qwen #mistral

Heretic quantized versions of Qwen 3.5 have just been released, but even the base Qwen 3.5 model seems to have issues with ollama currently, and I don't have the bandwidth to do a manual patch right now. Trying Mistral Small 3.2 instead.
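
Napkin math on why 32 GiB is workable (a rough sketch; the parameter counts and ~4.5 bits per weight for a Q4-style quantization are my assumptions, not measured numbers):

```python
# Back-of-envelope RAM estimate for running quantized models on CPU.
# Assumptions: Q4_K_M-style quantization at ~4.5 bits/param, plus ~6 GiB
# of headroom for KV cache, runtime, and the OS.

def weights_gib(params_billion: float, bits_per_param: float = 4.5) -> float:
    """Approximate size of the quantized weights in GiB."""
    return params_billion * 1e9 * bits_per_param / 8 / 2**30

for name, b in [("Mistral Small (~24B)", 24), ("Qwen 32B", 32), ("70B class", 70)]:
    w = weights_gib(b)
    verdict = "fits" if w + 6 <= 32 else "too big"
    print(f"{name}: ~{w:.1f} GiB weights -> {verdict} in 32 GiB RAM")
```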

First impressions of Mistral Small 3.2: seems pretty solid, and it answers "uncomfortable" political questions quite neutrally.
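
If anyone wants to reproduce, here's a minimal sketch via the official `ollama` Python client (pip install ollama). The model tag is an assumption on my part; check `ollama list` for the exact name on your install:

```python
import ollama

MODEL = "mistral-small3.2"  # assumed tag; use whatever `ollama list` shows

ollama.pull(MODEL)  # downloads the model if it isn't there yet
resp = ollama.chat(
    model=MODEL,
    messages=[{"role": "user", "content": "Summarize both sides of <a contested political topic>."}],
)
print(resp["message"]["content"])
```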

I don't understand why #confer and #euria by #infomaniak are not based on this.

@tomgag how fast does it feel? I tried using Foundry Local and ollama, but at the time I felt slowed down. I'd be keen to swap back to a local model given how the large providers are slowly clamping down on subscription token limits.
@sealjay well, I'm running on local CPU with 32 GiB of RAM, so I wouldn't call it "fast". 3-5 tokens per second maybe? I guess it's OK if you give it a task and then go to grab a coffee 😅
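@sealjay if you want a real number instead of my gut feel: Ollama reports its own timing stats with every response, so tokens/sec falls out directly (a sketch; model tag assumed as above):

```python
import ollama

# eval_count = generated tokens, eval_duration = generation time in nanoseconds
resp = ollama.generate(model="mistral-small3.2", prompt="Explain RAID levels briefly.")
tps = resp["eval_count"] / (resp["eval_duration"] / 1e9)
print(f"{resp['eval_count']} tokens in {resp['eval_duration'] / 1e9:.1f}s -> {tps:.1f} tok/s")
```
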
@tomgag maybe I’ll check I’m running on renewable energy before I leave a machine running over the weekend then 🤣
@tomgag Good question! Why is #infomaniak not part of the fediverse?!