Tested Google's Gemma 3 12B QAT on my home Linux server. Stable 97% GPU utilization, no CPU spill, no logic errors. Mistral Nemo 12B beats it on speed & uses 2 GB less VRAM. That spare 2 GB could run a second model on a 16 GB card.
Gemma 3 12B is correct, thorough, and about as warm as a DMV waiting room.
Full breakdown below.
