New week, more slides: Run LLMs Locally

Now with LFM 2 and new slides on using Transformers.js with WebGPU for Privacy Filter, Function Calling and Embeddings, running completely in your browser.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf
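
For anyone who wants to try the embeddings part right away, here is a minimal sketch of a Transformers.js feature-extraction pipeline running on WebGPU in the browser; the model ID is only an example, not necessarily the one used in the slides:

```typescript
import { pipeline } from "@huggingface/transformers";

// Load a small embedding model and run it on the GPU via WebGPU.
// Model ID is illustrative; any ONNX embedding model from the Hub works.
const extractor = await pipeline(
  "feature-extraction",
  "Xenova/all-MiniLM-L6-v2",
  { device: "webgpu" }
);

// Mean-pool and normalize to get one vector per input sentence.
const embeddings = await extractor(
  ["Run LLMs locally", "Everything stays in the browser"],
  { pooling: "mean", normalize: true }
);

console.log(embeddings.dims); // e.g. [2, 384]
```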

#ai #llm #llamacpp #stablediffusion #gptoss #qwen3 #glm #localai #gemma4 #nemotron #webgpu

TUTORIAL - Step by step: local AI on Linux with LM Studio

In this video, you will learn how to run artificial intelligence locally on your Linux machine using LM Studio.

If you want more control, more privacy, and to run AI directly on your own computer, this guide is for you.

Link: https://youtu.be/M7jR2BIuGyQ
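
Once LM Studio is running, it can also expose an OpenAI-compatible local server, which makes scripting against the local model straightforward. A minimal sketch, assuming the default port 1234 and whichever model you have loaded:

```typescript
// Query a locally running LM Studio server (OpenAI-compatible API).
// Port and model name are assumptions; adjust them to your local setup.
const response = await fetch("http://localhost:1234/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "local-model", // LM Studio serves whatever model you loaded
    messages: [
      { role: "user", content: "Summarize why local LLMs help privacy." },
    ],
    temperature: 0.7,
  }),
});

const data = await response.json();
console.log(data.choices[0].message.content);
```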

#debian #debian13 #lmstudio #ialocal #ibmgranite #gptoss #googlegemma #qwen #rx550 #linux

Petri Kuittinen (@KuittinenPetri)

A tweet with model selection advice: it rates Gemma-4-31B and Qwen3.6-27B as better choices than GLM-4.7-Flash and gpt-oss-120b, and notes that GLM-5.1 is hard to run on an M5 Max MacBook Pro given its 128 GB RAM limit.

https://x.com/KuittinenPetri/status/2050859130810695850

#glm #gemma #qwen #gptoss #llm

Petri Kuittinen (@KuittinenPetri) on X

@AiXsatoshi You have an otherwise very good list of models, but I would just skip GLM-4.7-Flash and gpt-oss-120b. Gemma-4-31B and Qwen3.6-27B are both better than those. And how are you going to fit GLM-5.1 into an M5 Max MacBook Pro, as that Apple device has at most 128 GB RAM?

New week, new slides: Run LLMs Locally

Now including Nemotron 3 Nano Omni from Nvidia, Llama.cpp built-in tools and new slides about using Transformers.js with WebGPU for Image Recognition and OCR.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf
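
As a rough sketch of the image-recognition part with Transformers.js on WebGPU (again, the model ID is illustrative and not taken from the slides):

```typescript
import { pipeline } from "@huggingface/transformers";

// Image classification in the browser, accelerated with WebGPU.
// Model ID is illustrative; any ONNX image-classification model works.
const classifier = await pipeline(
  "image-classification",
  "Xenova/vit-base-patch16-224",
  { device: "webgpu" }
);

// Accepts a URL, a Blob, or raw image data.
const results = await classifier("https://example.com/cat.jpg");
console.log(results); // e.g. [{ label: "tabby, tabby cat", score: 0.93 }, ...]
```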

#ai #llm #llamacpp #stablediffusion #gptoss #qwen3 #glm #localai #gemma4 #nemotron #webgpu

🔧 Fine-tunable on domain-specific data — adapts to medical, legal or enterprise environments where generic rules fail. Based on the open #gptoss model family. Available on #HuggingFace under Apache 2.0

🚨 Caveat: #PrivacyFilter is a redaction & data minimization aid — NOT a compliance guarantee. It should be one layer in a holistic #privacybydesign approach. Always combine with human review for high-stakes use cases
https://openai.com/index/introducing-openai-privacy-filter/
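
One way to slot such a redaction model into a local pipeline is to serve it with llama.cpp or LM Studio and call the OpenAI-compatible endpoint before any text leaves your machine. A sketch only: the endpoint, model name, and prompt below are assumptions, not the official usage:

```typescript
// Hypothetical pre-processing step: ask a locally served redaction model
// to strip PII before the text is passed on to any other system.
// Endpoint, model name, and prompt are assumptions, not an official API.
async function redactPII(text: string): Promise<string> {
  const res = await fetch("http://localhost:8080/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "privacy-filter", // placeholder name for the locally loaded model
      messages: [
        {
          role: "system",
          content:
            "Redact all personally identifiable information and replace it with [REDACTED].",
        },
        { role: "user", content: text },
      ],
      temperature: 0,
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}

console.log(
  await redactPII("Contact Jane Doe at jane@example.com or +49 171 1234567.")
);
```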

Introducing OpenAI Privacy Filter

OpenAI Privacy Filter is an open-weight model for detecting and redacting personally identifiable information (PII) in text with state-of-the-art accuracy

New week, new update for the slides of my talk "Run LLMs Locally":

Now including Gemma4 and Qwen3-Omni with vision and audio support, and new slides describing Llama.cpp server parameters.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf
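
For context on the server side: llama-server speaks the OpenAI-compatible chat API, and with a vision-capable model (plus its mmproj file) loaded you can send images in the standard multimodal message format. A sketch where the port, model alias, and image data are placeholders:

```typescript
// Send an image to a locally running llama-server with a vision-capable
// model loaded. Port, model alias, and image data are placeholders.
const res = await fetch("http://localhost:8080/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "local-vision-model",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "Describe this image in one sentence." },
          // Replace with a real base64-encoded image or a reachable URL.
          { type: "image_url", image_url: { url: "data:image/png;base64,..." } },
        ],
      },
    ],
  }),
});

console.log((await res.json()).choices[0].message.content);
```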

#ai #llm #llamacpp #stablediffusion #gptoss #qwen3 #glm #localai #gemma4

New update for the slides of my talk "Run LLMs Locally": Bonsai-8B

The latest version of Llama.cpp now supports Vulkan with 1-bit quantized models like Bonsai: an 8B model that is 1.1 GB on disk and uses 2.5 GB of RAM.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf
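
The size figure follows roughly from the quantization arithmetic: at about 1.1 bits per weight, 8B parameters land near 1.1 GB on disk. A back-of-the-envelope sketch (it ignores metadata and any higher-precision layers, so treat it as a ballpark only):

```typescript
// Rough model file size estimate: parameters * bits-per-weight / 8 bits per byte.
// Ignores metadata and mixed-precision layers, so this is only a ballpark.
function estimateSizeGB(paramsBillions: number, bitsPerWeight: number): number {
  return (paramsBillions * 1e9 * bitsPerWeight) / 8 / 1e9;
}

console.log(estimateSizeGB(8, 1.1).toFixed(1)); // "1.1" GB, matching Bonsai-8B
```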

#ai #llm #llamacpp #stablediffusion #gptoss #qwen3 #glm #localai

New update for the slides of my talk "Run LLMs Locally": WebGPU

Now models can run completely inside the browser using Transformers.js, Vulkan and WebGPU (slower than llama.cpp, but already usable).

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf
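
Since not every browser exposes WebGPU yet, the usual pattern is to feature-detect and fall back to WASM. A minimal sketch with Transformers.js; the model ID is illustrative, pick any ONNX text-generation model from the Hub:

```typescript
import { pipeline } from "@huggingface/transformers";

// Prefer WebGPU when the browser exposes it, otherwise fall back to WASM.
const device: "webgpu" | "wasm" = "gpu" in navigator ? "webgpu" : "wasm";

// Model ID is illustrative; substitute any ONNX text-generation model.
const generator = await pipeline(
  "text-generation",
  "onnx-community/Qwen2.5-0.5B-Instruct",
  { device }
);

const output = await generator("Local LLMs are useful because", {
  max_new_tokens: 64,
});
console.log(output);
```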

#ai #llm #llamacpp #stablediffusion #gptoss #qwen3 #glm #localai #webgpu

Don't expect LLM-generated code to be correct ↓

New update for the slides of my talk "Run LLMs Locally":

Now including music generation with ACE-Step and OCR using LightOnOCR.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf

#ai #llm #llamacpp #stablediffusion #gptoss #qwen3 #glm #localai

Best local LLMs of 2026: use them with Ollama or LM Studio

https://www.risposteinformatiche.it/migliori-modelli-llm-locali-2026-ollama-lm-studio/