Tested Google's Gemma 3 12B QAT on my home Linux server. Stable 97% GPU utilization, no spill to CPU, no logic errors. Mistral Nemo 12B beats it on speed & uses 2 GB less VRAM. Those extra 2 GB could fit a second small model on a 16 GB card.
Gemma 12B is correct, thorough and about as warm as a DMV waiting room.
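For anyone doing the same headroom math on their own card, a sketch of the budget check. The footprint numbers below are illustrative placeholders (consistent with "Nemo uses ~2 GB less"), not measured values from this test:

```python
# VRAM-budget sanity check for stacking models on one card.
# Footprints are hypothetical examples, not measured numbers.
CARD_VRAM_GB = 16.0

models_gb = {
    "gemma3-12b-qat": 10.0,    # assumed footprint incl. KV cache
    "mistral-nemo-12b": 8.0,   # assumed: ~2 GB less than Gemma
}

def fits(*names, budget=CARD_VRAM_GB, headroom=1.0):
    """True if the named models fit together, leaving `headroom` GB free."""
    return sum(models_gb[n] for n in names) + headroom <= budget

print(fits("gemma3-12b-qat"))                      # True: one 12B fits easily
print(fits("gemma3-12b-qat", "mistral-nemo-12b"))  # False: two 12Bs overflow 16 GB
```

With these assumed figures, a 12B pair doesn't fit, but an 8 GB Nemo plus a small ~4 GB model would.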

Full breakdown below.

#AI #LocalAI #OpenSource #Gemma #MachineLearning

https://goarcherdynamics.com/2026/03/13/aihome-gemma-3-12b/?utm_source=mastodon&utm_medium=jetpack_social

AI@Home – Gemma 3 12B


Archer Dynamics