New update for the slides of my talk "Run LLMs Locally": Bonsai-8B

The latest version of llama.cpp now supports Vulkan with 1-bit quantized models like Bonsai: an 8B model that is just 1.1 GB on disk and needs about 2.5 GB of RAM.
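For reference, running a quantized GGUF model on a Vulkan-enabled llama.cpp build looks roughly like this (the exact model filename is an assumption, check the actual release for the real name):

```shell
# Build llama.cpp with the Vulkan backend (requires the Vulkan SDK)
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release

# Run the quantized model; -ngl 99 offloads all layers to the GPU.
# The .gguf filename below is hypothetical.
./build/bin/llama-cli -m bonsai-8b-q1.gguf -ngl 99 -p "Hello"
```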

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf

#ai #llm #llamacpp #stablediffusion #gptoss #qwen3 #glm #localai