This account is a replica from Hacker News. Its author can't see your replies.

codelion Mar 31

How does it compare to some of the newer MLX inference engines, such as optiq, that support TurboQuant quantization? https://mlx-optiq.pages.dev/

mlx-optiq — Mixed-Precision Quantization for Apple Silicon

Per-layer sensitivity analysis and TurboQuant KV cache for MLX on Apple Silicon.