Hypura – A storage-tier-aware LLM inference scheduler for Apple Silicon
https://github.com/t8/hypura
Run models too big for your Mac's memory.