AMD just went Day Zero on Gemma 4.
Every GPU. Every CPU. Every major AI tool, ready NOW.
Most people don't know what this means for local AI on your PC.
Read this:
https://geekrealmhub.com/amd-gemma-4-support-gpus-cpus/
ΩWNΓTHER - The Future of Enterprise AI Is Here: Introducing the ΩWNΓTHER Enterprise AI Operating System…
#AI #EnterpriseAI #BusinessAI #AgenticAI #AIOS #AIApps #LLMs #AIModels #AIInfrastructure #AIOrchestration #AIAgents #AIConstellations #AIMarketplaces #AITools #AICompute #LocalAI #AIDevices #AIHardware #AIServices #AINetworks #AIPlatforms #AIEconomies #AIEcosystems #A2A #MCP #RAG #AIBlockchain #DePin #AITokens #X402 #MPP #PreLaunch
Benchmarking Gemma 4 (e4b): Linux vs. Mac
I tested the Gemma 4 e4b variant on a 32GB Linux setup vs. a 16GB Mac.
The Mac was 4.5x faster (44s vs 199s) and nailed a complex poem constraint.
Find more details about _why_ the Linux results were different at:
https://www.lotharschulz.info/2026/04/04/gemma-4-performance-showdown-linux-vs-mac-benchmarks/
Also, experimented with Ollama MLX preview support using a qwen model.
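If you want to reproduce this kind of timing, Ollama's HTTP API already reports token counts and durations with each response, so a short script is enough to derive tokens/sec. A minimal sketch, assuming a local server on the default port 11434; the model tag and prompt are placeholders, not the exact setup from the post:

```typescript
// Time one non-streaming generation against a local Ollama server and compute
// decode speed from the eval_count / eval_duration fields in the response.
interface OllamaGenerateResponse {
  response: string;
  total_duration?: number; // nanoseconds, end to end
  eval_count?: number;     // generated tokens
  eval_duration?: number;  // nanoseconds spent generating
}

async function benchmark(model: string, prompt: string): Promise<void> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt, stream: false }),
  });
  const data = (await res.json()) as OllamaGenerateResponse;

  const totalSec = (data.total_duration ?? 0) / 1e9;
  const tokPerSec =
    data.eval_count && data.eval_duration
      ? data.eval_count / (data.eval_duration / 1e9)
      : NaN;
  console.log(`${model}: ${totalSec.toFixed(1)}s total, ${tokPerSec.toFixed(1)} tok/s decode`);
}

// Placeholder tag -- swap in whichever variant you actually pulled with `ollama pull`.
benchmark("gemma3:4b", "Write a short poem about horses.").catch(console.error);
```

Running the same script on both machines gives directly comparable end-to-end and decode numbers, which is essentially what the 44s-vs-199s comparison above boils down to.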
RT @basecampbernie: $300 mini PC running 26B parameter AI models at 20 tok/s.
Minisforum UM790 Pro ($351) + AMD Radeon 780M iGPU + 48GB DDR5-5600 + 1TB NVMe.
The secret: the 780M has no dedicated VRAM. It shares your DDR5 via unified memory. The BIOS says "4GB VRAM" but Vulkan sees the full pool. I'm allocating 21+ GB for model weights on a GPU with "4GB VRAM."
The iGPU reads weights directly from system RAM at DDR5 bandwidth (~75 GB/s). MoE only activates 4B params per token = 2-4 GB of reads. That's why 20 tok/s works.
What it runs:
- Gemma 4 26B MoE: 19.5 tok/s, 110 tok/s prefill, 196K context
- Gemma 4 E4B: 21.7 tok/s, faster than some RTX setups
- Qwen3.5-35B-A3B: 20.8 tok/s
- Nemotron Cascade 2: 24.8 tok/s
Dense 31B? 4 tok/s, reads all 18GB per token, bandwidth wall. MoE same quality? 20 tok/s.
Full agentic workflows via @NousResearch Hermes agent with terminal, file ops, web, 40+ tools, all against local models. No API keys. Just a box on your desk.
The RAM is the pain right now. DDR5 prices 3-4x what they were a year ago. But the compute is free forever after you buy it.
@Hi_MINISFORUM @ggerganov llama.cpp + Vulkan + @UnslothAI GGUFs + @AMDRadeon RDNA 3. Fits in your hand.
#LocalLLM #Gemma4 #llama_cpp #AMD #Radeon780M #MoE #LocalAI #AI #OpenSource #GGUF #HermesAgent #NousResearch #DDR5 #MiniPC #EdgeAI #UnifiedMemory #Vulkan #iGPU #RunItLocal #AIonDevice
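The arithmetic behind those numbers is easy to sanity-check: at decode time a bandwidth-bound iGPU can't generate tokens faster than it can stream the active weights out of RAM. A rough sketch, using the ~75 GB/s figure from the post and assuming roughly 0.6 bytes per parameter for a Q4-style GGUF (my assumption):

```typescript
// Back-of-envelope decode-speed ceiling for a bandwidth-bound iGPU:
// tok/s <= memory bandwidth / bytes of weights read per token.
const BANDWIDTH_GBS = 75;    // effective DDR5-5600 bandwidth in GB/s (from the post)
const BYTES_PER_PARAM = 0.6; // assumed ~4.8 bits/param for a Q4-style quant

function decodeCeiling(activeParamsBillions: number): number {
  const bytesPerToken = activeParamsBillions * 1e9 * BYTES_PER_PARAM;
  return (BANDWIDTH_GBS * 1e9) / bytesPerToken;
}

// MoE with ~4B active params per token vs. a dense ~31B model
console.log(`MoE, 4B active : ~${decodeCeiling(4).toFixed(1)} tok/s ceiling`);  // ~31 tok/s
console.log(`Dense 31B      : ~${decodeCeiling(31).toFixed(1)} tok/s ceiling`); // ~4 tok/s
```

The dense ceiling lands right on the ~4 tok/s reported above (31B × 0.6 ≈ 18-19 GB read per token), while the MoE ceiling of ~31 tok/s leaves room for the observed ~20 tok/s once compute and KV-cache traffic are counted.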
#agent #API #GGUF #llama #LocalAI #OpenSource #Qwen3535 #arint_info

A crucial insight you should read today.
"How to Run Local AI Agents: A Comprehensive Guide"
Access the repository/documentation: https://www.authorsvoice.net/kurasi-beracun-menguliti-kartel-data-di-balik-layar-2026/
Tested Cogito V1 8B on my Linux server. 83 t/s, 5.4GB VRAM, 131k context. The real story is that it deliberately wrote worse code because it decided a beginner needed simplicity over efficiency -- and admitted it! That's IDA self-reflection making a live call.
I guess a 5GB model with a conscience is worth more than a 70B model with none?
Read the full breakdown below.
New AI Battle: Gemma 4 on Linux!
I tested the new Gemma 4 (e4b) running locally via Ollama on Linux. How does it solve the "HORSE-EARTH" poem test?
Linguistics Grade: B
Gemma 4 nailed the complex acrostic/telestich constraints but had to invent a new word, "gleama", to make the rhyme work. A "beautiful mess" that shows real creative grit.
All technical details: https://www.lotharschulz.info/2026/04/03/gemma-4-on-linux/
#Gemma4 #Linux #Ollama #OpenSource #AI #MachineLearning #LocalAI #SelfHosted
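For context, the "HORSE-EARTH" test asks for a poem that is an acrostic and a telestich at once: line-initial letters spell HORSE, line-final letters spell EARTH. A minimal checker sketch of that constraint as I read it (not the author's actual grading script, which presumably also judges rhyme and meter):

```typescript
// Verify that a poem's first letters spell `acrostic` and its last letters
// spell `telestich` (case-insensitive). Illustrative sketch of the constraint only.
function checkAcrosticTelestich(poem: string, acrostic: string, telestich: string): boolean {
  const lines = poem.split("\n").map((l) => l.trim()).filter((l) => l.length > 0);
  if (lines.length !== acrostic.length || lines.length !== telestich.length) return false;

  const firsts = lines.map((l) => l[0].toUpperCase()).join("");
  // Use the last alphabetic character so trailing punctuation doesn't break the check.
  const lasts = lines
    .map((l) => {
      const letters = l.match(/[a-z]/gi) ?? [];
      return (letters[letters.length - 1] ?? "").toUpperCase();
    })
    .join("");

  return firsts === acrostic.toUpperCase() && lasts === telestich.toUpperCase();
}

// A 5-line poem must start with H,O,R,S,E and end with E,A,R,T,H.
const poem = [
  "Hooves drum in the twilight haze",
  "Over fields of golden flora",
  "Racing winds that rise and stir",
  "Steady hearts that will not rest",
  "Earth beneath, and sky above the path",
].join("\n");
console.log(checkAcrosticTelestich(poem, "HORSE", "EARTH")); // true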
marmonitor - real-time tracking of AI coding agent sessions in the tmux status bar
marmonitor is an open-source tool that displays the real-time status of AI coding agent sessions in the tmux status bar, so you can check session status (phase, session count, token usage, CPU/MEM, etc.) without switching panes. Written in TypeScript, it observes local process information read-only and works without API keys or network communication. It primarily supports macOS and is regarded as a useful tool when working with multiple AI agents. Related open-source tools mentioned include `agent-of-empires`, `devglow`, and `so-agentbar`.

When you run several AI coding agents in tmux at once, you have to select panes one by one to check each session's status. marmonitor shows the overall session state in the tmux status bar in a form like `1 ⏳Cl my-project allow`, so you can see what's going on without switching panes. Per-agent session counts (Cl 12, Cx 2, Gm 1) and the current ph…
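The "read-only, no network" design is easy to picture, because tmux already exposes pane metadata on the command line: `tmux list-panes` can print the command running in every pane, and a monitor only needs to poll that and format a summary string. A rough sketch of the idea, assuming Node.js; this is my illustration, not marmonitor's actual code, and the output format is invented:

```typescript
// Count how many tmux panes are running each command and build a short status
// string, e.g. "node 3 | zsh 2", suitable for tmux's status-right.
// Illustrative only -- not marmonitor's implementation.
import { execSync } from "node:child_process";

function paneCommands(): string[] {
  // -a = every pane in every session; -F prints one chosen field per pane
  const out = execSync('tmux list-panes -a -F "#{pane_current_command}"', {
    encoding: "utf8",
  });
  return out.split("\n").filter((l) => l.length > 0);
}

function statusLine(): string {
  const counts = new Map<string, number>();
  for (const cmd of paneCommands()) {
    counts.set(cmd, (counts.get(cmd) ?? 0) + 1);
  }
  return [...counts.entries()].map(([cmd, n]) => `${cmd} ${n}`).join(" | ");
}

console.log(statusLine());
```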
New update for the slides of my talk "Run LLMs Locally": WebGPU
Now models can run completely inside the browser using Transformers.js and WebGPU (slower than llama.cpp with Vulkan, but already usable).
https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2026_ThomasBley.pdf
#ai #llm #llamacpp #stablediffusion #gptoss #qwen3 #glm #localai #webgpu
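For reference, the in-browser path described above comes down to very little application code with Transformers.js v3: create a pipeline with `device: 'webgpu'` and the library downloads an ONNX model and runs it on the GPU. A minimal sketch; the model id, dtype, and generation settings are illustrative choices, not taken from the slides:

```typescript
// In-browser text generation with Transformers.js on WebGPU (ES module).
// Model id and options below are example choices, not from the talk.
import { pipeline } from "@huggingface/transformers";

// The first call downloads and caches the ONNX weights in the browser.
const generator = await pipeline(
  "text-generation",
  "onnx-community/Qwen2.5-0.5B-Instruct", // small model that fits comfortably in a tab
  { device: "webgpu", dtype: "q4" },
);

const output: any = await generator(
  [{ role: "user", content: "Explain unified memory in one sentence." }],
  { max_new_tokens: 64 },
);

// The pipeline returns the chat history with the assistant reply appended last.
console.log(output[0].generated_text.at(-1).content);
```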
Don't expect LLM-generated code to be correct.