Mastodawn

In the trenches, I've been investing in the ability to host and run our own open-weight models as I wait for the inevitable rug pull from the big AI companies. Surprisingly, if you use the right combination of models and some specific used equipment, you can achieve good results.

Mike Fraser 1d ago

The AI harness, I'm finding out, is almost as important as the model in getting real-world work done. There are some really great open source projects around this.

Mike Fraser 1d ago

I'm finding that my use of AI agents is analogous to how droids are used in the Star Wars universe. They build some stuff and do grunt work I'd rather not do.

Matt Hadden 23h ago

@mike This is something I'm accepting as well.

I code for a living and I _loathe_ AI, but you know what I loathe more? Writing unit tests for my code, and I am not stupid. I know what wins there.

John Francis 🇨🇦🦫🍁🫎 1d ago

@mike tell me more... this is something I'm thinking of returning to after avoiding the last decade of LLM chatbot slop

Mike Fraser 1d ago

@johnefrancis That chatbot era was such a mistake. People getting their first impression of AI from shitty chatbots trying to replace customer service people is where a lot of the angst boils up from. That and all the image slop.
Get an old workstation with a bunch of RAM (64 GB), like an HP Z series. Buy a GPU with as much VRAM as you can afford, then run Ollama. You should be able to find lightweight models that will run reasonably.
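
Once Ollama is running on a box like that, a first smoke test could look like the sketch below. It assumes Ollama's default local port (11434) and its `/api/generate` endpoint; the helper names and the model name in the usage note are placeholders, not anything from the thread.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Calling `generate("your-model-name", "Write a haiku about used workstations")` only works for a model you have already pulled, so swap in whatever model you actually installed.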

Paul Turnbull 🇨🇦 1d ago

@mike This has pretty much been my boss’s experience running various models in the office to test things out.

Mike Fraser 1d ago

@Chigaze Personally, I think this is where the real AI revolution will come from. People building cool stuff with their own AIs.

Kerry Stevenson 1d ago

@mike What gear are u using?

Mike Fraser 1d ago

@krst We are using a 4U Supermicro chassis with eight GPU bays. I'm adding GPUs as cash flow allows. We also have a new Strix Halo box that we use for development, etc. So far under 15k in gear. As always, more compute would be lovely.

Drew Scott Daniels 19h ago

@mike what models and what harnesses?

Mike Fraser 16h ago

@drewdaniels I like building tools and automating some workflows. Claude Code was a good start, but I'm using Hermes more and more as my harness. Qwen Coder, Qwen 35B, GLX-Flash, Kimi-k, and Gemma are some good models. Look for mixture-of-experts models and a quant that fits your VRAM.
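
For the "quant that fits your VRAM" part, a back-of-the-envelope check is parameter count times bits per weight. The sketch below is only an estimate: the function names are made up for illustration, and the 2 GiB allowance for context cache and runtime overhead is an assumed placeholder that varies a lot with context length and backend.

```python
def estimated_weight_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(
    params_billions: float,
    bits_per_weight: float,
    vram_gib: float,
    overhead_gib: float = 2.0,  # assumed allowance for KV cache and runtime buffers
) -> bool:
    """Rough check: do the weights plus the assumed overhead fit in VRAM?"""
    needed = estimated_weight_gib(params_billions, bits_per_weight) + overhead_gib
    return needed <= vram_gib


if __name__ == "__main__":
    # A 35B-parameter model at roughly 4.5 bits per weight vs. full 16-bit weights.
    for bits in (4.5, 16):
        size = estimated_weight_gib(35, bits)
        print(f"{bits} bits: ~{size:.1f} GiB of weights, fits in 24 GiB: {fits_in_vram(35, bits, 24)}")
```

Note that a mixture-of-experts model still needs all of its experts resident unless you offload some of them; the win is that only a fraction of the weights are active per token, which helps speed more than footprint.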

Drew Scott Daniels 16h ago

@mike How do you keep Hermes safe? Do you limit tools? Sandbox? How well does tool use work with 32GB or smaller models?

Mike Fraser 14h ago

@drewdaniels You have to be careful about how much agency you give your agents. Make any access you give them read-only, and if you do elevate permissions, make sure you set specific guardrails. I'm actually working on an agent proxy where you can dynamically grant or revoke specific elements of the API access you give them.
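
The read-only-by-default idea can be sketched as a tiny policy object that a proxy might consult before forwarding an agent's request. This is a hypothetical illustration, not the proxy described above: `AgentPolicy`, the method/path-prefix grant format, and the example paths are all assumptions.

```python
from dataclasses import dataclass, field

# Assumption: these HTTP methods are treated as read-only and always allowed.
READ_ONLY_METHODS = frozenset({"GET", "HEAD"})


@dataclass
class AgentPolicy:
    """Read-only by default; write access exists only while explicitly granted."""

    grants: set = field(default_factory=set)  # holds (METHOD, path_prefix) pairs

    def grant(self, method: str, path_prefix: str) -> None:
        self.grants.add((method.upper(), path_prefix))

    def revoke(self, method: str, path_prefix: str) -> None:
        self.grants.discard((method.upper(), path_prefix))

    def allows(self, method: str, path: str) -> bool:
        method = method.upper()
        if method in READ_ONLY_METHODS:
            return True
        # Non-read methods need a matching grant for this path prefix.
        return any(method == m and path.startswith(p) for m, p in self.grants)
```

For example, after `policy.grant("POST", "/tickets")` the agent can post to anything under `/tickets` but still cannot `DELETE` there, and `policy.revoke("POST", "/tickets")` puts it back to read-only. A real proxy would also want authentication, logging, and stricter path matching than a plain prefix check.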

Mike Fraser 14h ago

@drewdaniels I also keep them in their own VM. The local models are usually very good, but I supplement them with DeepSeek V4 because it's insanely cheap right now. Although I fully expect a rug pull.