Mastodawn

BUT, through Accelerate, matrix operations will be executed on Apple’s AMX unit(s).*

*Or perhaps on the NPU (for lower precision) or on the GPU, as Accelerate presumably determines to be most “efficient.”