Probably going to swap out Ollama for llama-swap in my local #LLM stuff. Getting a tad cheesed at Ollama, and supposedly performance on the stock llama.cpp server (which the llama-swap Docker image contains by default) might be better. We'll see.

#LLM

The things I principally value(d) Ollama for -- multiple models and a model timeout -- are apparently easily done in llama-swap. The thing that remains for me is whether or not my AMD GPU (it sucks, but I would like it to work) will work with it. Ollama just throws in the towel and bundles an entire AMD ROCm stack in its own distribution, which works, but I've had Problems with system-wide stock AMD GPU support. We'll see.
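
For what it's worth, both of those features seem to live in llama-swap's YAML config. A minimal sketch, assuming llama-swap's documented cmd/ttl keys and its ${PORT} macro -- model names and paths here are made-up placeholders:

```yaml
# llama-swap config.yaml sketch -- model names and .gguf paths are placeholders
models:
  "qwen-small":
    # llama-swap fills in ${PORT} and proxies requests to the spawned server
    cmd: llama-server --port ${PORT} -m /models/qwen-small.gguf
    ttl: 300   # unload after 300s idle, roughly Ollama's keep_alive behavior
  "qwen-big":
    cmd: llama-server --port ${PORT} -m /models/qwen-big.gguf
    ttl: 300
```

Requesting "qwen-big" while "qwen-small" is loaded should make llama-swap stop one server and start the other, which is the multi-model part.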

#LLM

Somewhat hilarious, as AMD support is *supposed* to be in the Linux kernel! Which it sort of is, I guess. Also got my eye on that apparently reasonably priced Intel GPU (32GB VRAM for ~$1000) that's coming out at some point; llama.cpp amply supports Intel GPUs, but afaik Ollama does not. Anyway.

#LLM

One thing I mentioned in my workshop yesterday:

#LLM

@adr how far? There are some serious tradeoffs from interacting in any way with large US tech companies. They'll steal your shit for themselves, leak your data and not blink an eye.
@johnefrancis how...far... what? This workshop was all about doing stuff on your own equipment, so none of the pieces *I* supplied fit that description (of course, everyone was running US-based operating systems; nothing I can do about that.)
@johnefrancis oh I see. The baseline. Yeah, I was referring to laptops running small models as baseline for local stuff. You could of course build a massive ol' machine and do larger things yourself as well, but that was beyond the scope of the workshop.
@johnefrancis so "how far" depends very much on what you have on hand or can get.

@adr I remember running an early neural net that ran in excel, circa 2006. It only had 3-4 layers and maybe 25 neurons. IIRC you could train it to recognize an integer...not an image of an integer...an actual integer...and at inference it could recognize that this was the one it was trained on. That took about 40h on a nice 2006 developer workstation-replacement laptop. So...not far.

But things have changed a lot.

What useful things can one do on moderate home equipment?

@adr several photo library packages have quite good image recognition "AI". I think a home LLM could help a lot with effective search within a home network of devices.
@johnefrancis Sure? I don't know much about the image stuff -- I think Immich does it pretty well though, right?

@johnefrancis Best way to answer that is to mess around yourself. Depending on your definition of "moderate", I would try something like a 9b to 35b quantized model (the Qwen 3.5 models are good for this) and then try it out with a reasonably simple harness like OpenCode. Ollama does make this easy: install ollama, ollama pull qwen3.5:35b, ollama launch opencode. Then ask it to make you some program or other.

#LLM