I recently compared three different GPUs running the same LLM, Microsoft Phi-4.

AMD Instinct MI25 ~ 3 to 5 tokens per second

AMD 9060 XT ~ 29 to 31 tokens per second

NVIDIA 5070 ~ 29 to 34.5 tokens per second
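
For a rough sense of scale, here is a small sketch that turns the ranges above into relative speedups. Taking the midpoint of each range and using the MI25 as the baseline are my own assumptions for illustration, not how the numbers were measured.

```python
# Reported throughput ranges in tokens per second (low, high), copied from the post.
reported = {
    "AMD Instinct MI25": (3.0, 5.0),
    "AMD 9060 XT": (29.0, 31.0),
    "NVIDIA 5070": (29.0, 34.5),
}

def midpoint(low, high):
    """Midpoint of a range; an assumed summary, not a measured average."""
    return (low + high) / 2

# Assumption: use the MI25 midpoint as the baseline for comparison.
baseline = midpoint(*reported["AMD Instinct MI25"])
speedups = {name: midpoint(*rng) / baseline for name, rng in reported.items()}

for name, factor in speedups.items():
    print(f"{name}: ~{factor:.1f}x the MI25 midpoint")
```

By this rough measure, the two newer cards land within a few percent of each other and both sit at roughly 7 to 8 times the MI25.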