Plugable TBT5-AI enclosure lets Windows laptops run local AI with a desktop GPU

https://fed.brid.gy/r/https://nerds.xyz/2026/03/plugable-tbt5-ai-enclosure/

One more update for the slides of my talk "Run LLMs Locally":

Now including text to speech with Qwen3-TTS and Model Context Protocol.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2025_ThomasBley.pdf

#llm #llamacpp #ollama #stablediffusion #gptoss #qwen3 #glm #opencode #localai #mcp

I updated the slides for my talk "Run LLMs Locally":

Now including image generation with Qwen3 and content classification from the Qwen3Guard Technical Report paper.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2025_ThomasBley.pdf

#llm #llamacpp #ollama #stablediffusion #gptoss #qwen3 #glm #opencode #localai

@john #ollama is garbage; #qwen3_5 has had many fixes in #llamacpp recently, but it is not fully ready yet

Just a NixOS Phone running a reasoning LLM locally...

#nixos #llm #llamacpp

Less than 2 weeks until Embedded World & I'll be roaming the expo floor! Let's chat about all things Go with microcontrollers, computer vision, & machine learning. Just look for Gopherbot.

#golang #tinygo #ew26 #embedded #computerVision #ml #openCV #llamacpp #yzma

New talk coming tomorrow: Run LLMs locally

I will present it at @phpugmrn in Mannheim.

#llm #llamacpp #stablediffusion #gptoss #qwen3 #glm #ocr #localai #security

金のニワトリ (@gosrum)

Evaluated the inference speed of Qwen3.5-27B-UD-Q4_K_XL with llama.cpp and confirmed that the RTX 5090 is very fast when the model fits in VRAM. RTX 5090 (image 1): Prefill ~2800 tps, Decode ~60 tps. M2 Ultra (image 2): Prefill ~256 tps, Decode ~18 tps.

https://x.com/gosrum/status/2026450569695830360

#qwen #llamacpp #benchmarking #rtx5090
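For context, prefill and decode throughput like this is what llama.cpp's bundled llama-bench tool measures (prompt processing vs. token generation). A minimal sketch of such a run, assuming a local GGUF of that quant; the filename and the -p/-n/-ngl values are illustrative, not the poster's actual settings:

llama-bench -m Qwen3.5-27B-UD-Q4_K_XL.gguf -p 512 -n 128 -ngl 99  # filename and flag values are illustrative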


GGML/llama.cpp joins Hugging Face, consolidating local-inference open source

GGML, the team behind llama.cpp, has joined Hugging Face. An introduction to the big shift in the local-AI open-source ecosystem as the integration of transformers and llama.cpp accelerates.

https://aisparkup.com/posts/9490

yzma 1.10 is out with improvements like:
- install info for @officialarduino.bsky.social UNO Q and @raspberrypi.com
- experimental 'VLM' type
- improved the yzma cmd so 'go install' works with the latest release

Go and get it!

https://github.com/hybridgroup/yzma

#golang #llama #llamacpp #ml #llm #vlm #cuda #vulkan