Tested Microsoft Phi-4 14B on my Linux server.
44.8 t/s avg. 9.4 GB VRAM. Three runs, three nearly identical speeds: one of the most consistent models I've tested. No drama, almost no variance, just a quiet workhorse. Most models get chattier as context builds. Not this one.
Turns out Microsoft was cooking at home while everyone assumed they just ordered carry-out from OpenAI.
Read the full breakdown below.
