Ollama is now accelerated in preview on Apple Silicon (M5/M5 Pro/M5 Max), powered by MLX, Apple's machine learning framework. Prefill and decode speeds improve substantially with Qwen3.5-35B-A3B, and NVFP4 quantization keeps quality on par with production inference. Cache reuse, smart checkpointing, and smart deletion improve responsiveness and memory efficiency. Ollama 0.19 is out (32GB of unified memory recommended).
Ollama 0.19: MLX support makes AI inference on Mac 2x faster
Ollama 0.19 ships with Apple's MLX framework, improving AI inference speed on Mac by up to 2x. This post covers the major update, including NVFP4 support and cache improvements.

🛠️ Ollama: Native MLX Backend for Apple Silicon
Ollama now runs natively on Apple MLX. On an M5 Max with Qwen3.5-35B-A3B: 1851 tok/s prefill, 134 tok/s decode. The release also adds NVFP4 quantization for production parity with NVIDIA inference and improved KV cache reuse for agentic workloads.
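The prefill and decode figures above can be reproduced from the timing fields Ollama reports with each completion. A minimal sketch, assuming a local server on the default port; the qwen3.5:35b-a3b tag is a placeholder, so substitute any model you have already pulled:

```python
# Sketch: estimate prefill/decode throughput from Ollama's /api/generate timing fields.
# Assumes a local Ollama server on http://localhost:11434; the model tag is a placeholder.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen3.5:35b-a3b",  # placeholder tag; use any model you have pulled
        "prompt": "Summarize the benefits of on-device inference.",
        "stream": False,  # return one JSON object that includes the timing stats
    },
    timeout=600,
).json()

# Durations are reported in nanoseconds.
prefill_tps = resp["prompt_eval_count"] / (resp["prompt_eval_duration"] / 1e9)
decode_tps = resp["eval_count"] / (resp["eval_duration"] / 1e9)
print(f"prefill: {prefill_tps:.0f} tok/s, decode: {decode_tps:.0f} tok/s")
```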
solomonneas.dev/intel
Ollama has been updated to run at its fastest on Apple silicon, powered by MLX, Apple's machine learning framework.
This change unlocks much faster performance, accelerating demanding work on macOS (a local wiring sketch follows the list):
- Personal assistants like OpenClaw
- Coding agents like Claude Code, OpenCode, or Codex

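These tools generally speak the OpenAI chat API, and Ollama exposes an OpenAI-compatible endpoint locally, so pointing an agent at the Mac itself is mostly a base-URL change. A minimal wiring sketch, assuming the `openai` Python package is installed and a chat-capable model has already been pulled (the model tag is a placeholder):

```python
# Sketch: talking to a local Ollama server through its OpenAI-compatible endpoint.
# Assumes a model has already been pulled; the tag below is a placeholder.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # required by the client, ignored by Ollama
)

reply = client.chat.completions.create(
    model="qwen3.5:35b-a3b",  # placeholder tag; any local chat model works
    messages=[
        {"role": "system", "content": "You are a concise coding assistant."},
        {"role": "user", "content": "Write a Go function that reverses a slice in place."},
    ],
)
print(reply.choices[0].message.content)
```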
Well, the kids have to learn about TekWar sooner or later
https://macsourceports.com/game/tekwar
#macgaming #tekwar #williamshatner #jftekwar #macOS #macbookneo #gaming #retrogaming #apple #applesilicon
Ollama just added MLX support for Apple Silicon chips. This is huge for anyone wanting to run AI models locally.
MLX is Apple's machine learning framework that's optimized for M-series chips. Your MacBook can now run models like Llama 3 faster while using less power.
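A minimal sketch of what running a model locally looks like in practice, assuming `ollama pull llama3` has already been run and the server is listening on the default port:

```python
# Sketch: one local chat turn against Ollama's native /api/chat endpoint.
# Assumes `ollama pull llama3` has been run and the server is on the default port.
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",
        "messages": [{"role": "user", "content": "Explain KV cache reuse in one sentence."}],
        "stream": False,  # return a single JSON object instead of a token stream
    },
    timeout=600,
).json()

print(resp["message"]["content"])
```

Everything stays on the machine, so the same call keeps working offline once the model weights are downloaded.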
Ollama is now powered by MLX on Apple Silicon in preview