As I experiment with running local #llm on my #framework desktop, having 128GB of ram certainly gives you lots of options. I can run some large models, but they're generally quite slow.
My current setup is to run two 'smaller' models simultaneously: a planning model and a coding model.
Qwen3-32B is my 'planner' model, which has good reasoning and instruction-following capabilities.
Qwen3-Coder-30B-A3B is my 'coder' model, handling code generation, tool calling, and debugging.
I'm running #opencode in the terminal, which by default has two primary agents, plan and build. This setup pairs nicely with that.
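For anyone curious, a rough sketch of how that pairing could be wired up in an opencode config. The field names and model identifiers below are illustrative assumptions, not copied from my actual config — check the opencode docs for the exact schema and your local provider's model names:

```json
{
  "agent": {
    "plan": { "model": "ollama/qwen3:32b" },
    "build": { "model": "ollama/qwen3-coder:30b-a3b" }
  }
}
```

The idea being that plan-mode conversations hit the reasoning-heavy model while build-mode edits go to the faster coder.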
@ehippy can't say that I am specifically. ohmyopencode supports configuring models by task types, so a local model for tasks categorized as 'quick' would be fine, while you'd have an Opus model orchestrating things.
Are you aware of anything like you're suggesting?