honestly there is something extremely cyberpunk about the idea of being able to message an LLM agent running off solar power over an encrypted local radio mesh network
even when the rest of the power grid is down, together with the cell towers and the fiber backbones
RE: https://bsky.app/profile/did:plc:nbfjoeficjzf3pejpontvril/post/3meq2jkzbvc2mwhaja pick? I'm currently evaluating glm-4.7-flash Q4_K_M on the Framework Desktop via llama-cpp-vulkan, ~45 t/s of output
minimax m2.5 UD-Q3_XL, ~20 tok/s
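for anyone wanting to reproduce numbers like these, a rough sketch of how throughput is usually measured with llama-bench (ships with llama.cpp) — the GGUF filename here is a placeholder, swap in your own:

```shell
# Rough sketch: measuring prompt/gen throughput with llama-bench.
# The model path is a placeholder; -ngl 99 offloads all layers to the GPU.
MODEL="./glm-4.7-flash-Q4_K_M.gguf"   # placeholder filename
NGL=99

# Print the command rather than running it (needs the actual GPU box):
echo "llama-bench -m $MODEL -ngl $NGL -p 512 -n 128"
```

-p and -n control prompt-processing and generation batch sizes, so you get separate pp/tg t/s columns per run.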
here's my nix flake config in case it's helpful, beware it's slop. open to suggestions
gist.github.com/kumavis/0640...
ai.nix
i don't use nix but the info in the flake is pretty outdated: gfx1151 is supported by rocm now, and running llama.cpp in vulkan mode is much, much slower. i recommend you look into that
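for reference, a hedged sketch of what a ROCm/HIP build of llama.cpp targeting gfx1151 looks like — flag names are from llama.cpp's CMake options, double-check them against your checkout:

```shell
# Sketch: a ROCm (HIP) build of llama.cpp for gfx1151 (Strix Halo).
# Printed rather than executed here, since it needs a ROCm install.
GPU_ARCH=gfx1151
echo "cmake -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=$GPU_ARCH -DCMAKE_BUILD_TYPE=Release"
echo "cmake --build build --config Release -j"
```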
just make sure the VRAM in BIOS is set to 512MB and set the GTT to something like 124GB for shared RAM/VRAM
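if you're setting the GTT cap from linux rather than firmware, it can be raised via the amdgpu kernel module parameter — a hedged sketch (value is in MiB, exact bootloader wiring depends on your setup):

```shell
# Sketch: raising the amdgpu GTT limit via a kernel parameter.
# amdgpu.gttsize takes MiB; -1 means the driver picks a default.
GTT_MIB=$((124 * 1024))          # 124 GiB expressed in MiB
echo "amdgpu.gttsize=$GTT_MIB"   # add this to the kernel command line
# The current value can be read from /sys/module/amdgpu/parameters/gttsize
```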