Mastodawn

Ollama is now powered by MLX on Apple Silicon in preview

Ollama is now powered by MLX on Apple Silicon in preview · Ollama Blog

Today, we're previewing the fastest way to run Ollama on Apple silicon, powered by MLX, Apple's machine learning framework.

Show thread

AugSun

"We can run your dumbed down models faster":

#The use of NVFP4 results in a 3.5x reduction in model memory footprint relative to FP16 and a 1.8x reduction compared to FP8, while maintaining model accuracy with less than 1% degradation on key language modeling tasks for some models.