honestly there is something extremely cyberpunk about being able to message an LLM agent running off solar power over an encrypted local radio mesh network, even when the rest of the power grid is down along with the cell towers and the fiber backbones

RE: https://bsky.app/profile/did:plc:nbfjoeficjzf3pejpontvril/post/3meq2jkzbvc2m
running a local model?
yup
whaja pick? I'm currently evaluating glm-4.7-flash Q4_K_M on the framework desktop via llama-cpp-vulkan. ~45 t/s of output
minimax m2.5 UD-Q3_XL, ~20 tok/s
here's my nix flake config in case it's helpful; beware, it's slop. open to suggestions: gist.github.com/kumavis/0640...
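a rough sketch of the kind of flake being described here, not the actual gist contents; it assumes the nixpkgs llama-cpp package and its vulkanSupport override flag, so check the derivation args in your nixpkgs revision:

```nix
{
  # minimal sketch only: builds llama.cpp from nixpkgs with the Vulkan backend
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }:
    let
      pkgs = import nixpkgs { system = "x86_64-linux"; };
    in {
      packages.x86_64-linux.default =
        pkgs.llama-cpp.override { vulkanSupport = true; };
    };
}
```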
i don't use nix, but the info in the flake is pretty outdated: gfx1151 is supported by rocm now, and running llama.cpp in vulkan mode is much, much slower, so i recommend you look into that. just make sure the VRAM in bios is set to 512MB and set the GTT to something like 124GB for shared ram/vram
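a rough sketch of what that suggestion could look like on a NixOS host; the override flags, kernel parameter names, and sizes below are assumptions added for illustration, not taken from the thread, so verify them against your kernel and nixpkgs versions:

```nix
{ pkgs, ... }:
{
  # build llama.cpp against ROCm/HIP instead of Vulkan (assumed override flags)
  environment.systemPackages = [
    (pkgs.llama-cpp.override {
      rocmSupport = true;
      vulkanSupport = false;
    })
  ];

  # with BIOS VRAM left at 512 MB, expose most of system RAM to the GPU as GTT
  boot.kernelParams = [
    "amdgpu.gttsize=126976"     # GTT size in MiB (~124 GiB); assumed parameter
    "ttm.pages_limit=33554432"  # TTM limit in 4 KiB pages (~128 GiB); assumed parameter
  ];
}
```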