Taking #selfhosting to the next level 😂 Built a small server for a locally hosted LLM, to be used as a brain for my #openclaw assistants. It might not be the smartest one compared to OpenAI and Anthropic models, but hey, self-hosting is not about getting the best experience, it's about freedom! So I challenged myself to squeeze as much as possible out of this setup with system prompts and instructions.

1/2

The same box runs my entire self-hosted stack, which today includes: #mastodon, #NextCloud, a #bluesky PDS, #Matrix (which also serves as the main channel in the OpenClaw gateway), a homepage web server, and a backend for a few non-AI Matrix bots. 15 containers in total. That's not including a Raspberry Pi 5 running #homeassistant and an #mqtt broker.

2/2

@m3211n

Do you mind sharing your specs? I'm running things on older hardware, and let me tell you that when it restarts and all those containers want to start up at the same time....

@StevenSaus

Sure!
- CPU: Ryzen 5 5500 (sitting in a B550-F mobo)
- 32GB DDR4 RAM
- 2x RTX 5060 Ti GPUs, 16GB VRAM each
- Tiny NVMe 128GB for system (Ubuntu Server 24.04 LTS)
- 1TB SSD for models
- 2TB HDD for NextCloud (planning to add one more as a RAID mirror)

It's actually quite a recent upgrade. Most of these containers, except for llama.cpp and Mastodon, were running on the Raspberry Pi 5, which is now comfortably chilling with only two services left 😂

@m3211n which model are you running?

@lobstermane

Qwen3.6 27B. I'm still setting things up and have Claude from my GH Pro subscription as a backup, but it looks very promising!

@m3211n @lobstermane How's your response time? I have a 12th-gen i9 with 64GB and one 5070 Ti card (8GB). Ollama is running gemma4 26B, and Home Assistant still takes 22 seconds or more to figure out how to turn on a light.

@targetdrone

Oh, I only use the Android app and the web UI on PC to steer HA remotely. A couple of light bulbs and a self-made weather station aren't really worth looping in any AI at this point.

Though I did ask my OpenClaw bot to turn on the lights over the HA API once, just for fun, and it took around 15-20 seconds too. I think that's because it had to "think", find the appropriate API endpoint, fire a request, etc. If I wrote a skill with clear instructions, it should be able to do it much faster.
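For anyone curious what such a skill could boil down to, here's a rough sketch of the single HTTP call involved. It uses Home Assistant's REST endpoint for calling a service (`POST /api/services/light/turn_on` with a bearer token). The host name, the token and the entity id are placeholders I made up, not my real setup, and the function names are just for illustration.

```python
import json
import urllib.request

# Assumptions: placeholder host and token, replace with your own values.
HA_URL = "http://homeassistant.local:8123"
HA_TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"


def build_turn_on_request(entity_id, base_url=HA_URL, token=HA_TOKEN):
    """Build (but do not send) a POST that calls the light.turn_on service."""
    body = json.dumps({"entity_id": entity_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/services/light/turn_on",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def turn_on_light(entity_id):
    """Send the request and return the HTTP status code."""
    request = build_turn_on_request(entity_id)
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

With the endpoint and payload spelled out like this, the model would only have to pick the entity id instead of discovering the API on every request, which is where I'd expect most of the time to be saved.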

@m3211n Thanks. We have built a decade-long habit of using Alexa to run lights and stuff, and now I'm always looking to trade all the Amazon stuff for local, so that's my go-to benchmark.

Right now it looks like I'd be better served by adding more regexp patterns to the HA sentence parser instead of trying to optimize a giant LLM that's clearly unsuited to the task.
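To show why pattern matching wins for this kind of command, here's a minimal sketch in plain Python. It is not HA's actual sentence format (HA uses its own template syntax for custom sentences), and the patterns and the `parse_command` name are made up for illustration. The point is only that a fixed phrase like "turn on the kitchen lights" can be mapped to a service call with a couple of regexes, with no model involved.

```python
import re

# Hypothetical patterns covering two common word orders:
#   "turn on the kitchen lights" and "switch the kitchen light off"
PATTERNS = [
    re.compile(
        r"^(?:please )?(?:turn|switch) (?P<state>on|off) "
        r"(?:the )?(?P<area>[a-z ]+?) lights?$"
    ),
    re.compile(
        r"^(?:please )?(?:turn|switch) (?:the )?(?P<area>[a-z ]+?) "
        r"lights? (?P<state>on|off)$"
    ),
]


def parse_command(text):
    """Return (service, area) for a recognized phrase, or None otherwise."""
    # Lowercase, collapse whitespace, and drop trailing punctuation.
    normalized = " ".join(text.lower().split()).rstrip(".!?")
    for pattern in PATTERNS:
        match = pattern.match(normalized)
        if match:
            return f"light.turn_{match['state']}", match["area"].strip()
    return None  # unrecognized: this is where an LLM fallback could go
```

Anything the patterns don't recognize returns `None`, so the slow LLM path would only be needed for the unusual requests.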