Google should have named the Gemma 4 QAT series Gemma 4.1. Most people use quantized models (for good reason!), and the QAT release, as verified by WebBrain’s benchmarks, is significantly superior to the original model, just as Qwen 3.6 is superior to Qwen 3.5.

https://www.webbrain.one/blog/gemma-4-31b-qat-planner-benchmark

#LocalLLM #AI #BrowserAutomation #WebBrain #OpenSource #LLM #Gemma4 #Gemma #Qwen

Gemma 4 31B QAT becomes the best local Gemma planner we have tested

The QAT w4a16 Gemma 4 31B run improves over the older Gemma 31B int4 result and narrowly beats Qwen 3.6 27B on strict first-action quality.

🔥 We just published our Q4 local planner benchmark comparing local AI models for browser automation:

• DiffusionGemma-26B-A4B-it: 0.35s median, 84% accuracy — fastest!
• Gemma 4 12B Coder: 0.40s median, 84% accuracy
• Cohere North-Mini-Code 1.0: 0.38s median, 84% accuracy

All three tied on accuracy, but DiffusionGemma was the fastest.
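
For anyone who wants to reproduce that ordering, here is a minimal sketch of the tie-break: sort by accuracy first, then by median latency. The numbers are the ones quoted in this post; the tuple layout and the `rank` helper are our own illustration, not WebBrain's benchmark harness.

```python
# Figures quoted in the post: (model, median latency in seconds, accuracy).
# The data layout is an assumption made for this example only.
results = [
    ("DiffusionGemma-26B-A4B-it", 0.35, 0.84),
    ("Gemma 4 12B Coder", 0.40, 0.84),
    ("Cohere North-Mini-Code 1.0", 0.38, 0.84),
]


def rank(rows):
    """Sort by accuracy (highest first), then by median latency (lowest first)."""
    return sorted(rows, key=lambda row: (-row[2], row[1]))


ranked = rank(results)
for name, median_s, accuracy in ranked:
    print(f"{name}: {median_s:.2f}s median, {accuracy:.0%} accuracy")
```

With accuracy tied at 84%, the ordering comes down to median latency alone.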

Full benchmark: https://www.webbrain.one/blog/local-planner-q4-june-2026

#LocalLLM #AI #BrowserAutomation #WebBrain #OpenSource #LLM

DiffusionGemma hits 0.35s median in the WebBrain local planner bench

Gemma 4 12B Coder, North Mini Code, and DiffusionGemma completed WebBrain's frozen local planner run; DiffusionGemma is fast but not yet reliable enough for WebBrain, and VibeThinker is not a tool-calling agent model.