Hypura – A storage-tier-aware LLM inference scheduler for Apple Silicon
https://github.com/t8/hypura

Intel Optane rolling in its grave.