The world's first inference server with Gemma 4 and TurboQuant:

brew tap ericcurtin/inferrs
brew install inferrs
inferrs run --turbo-quant google/gemma-4-E2B-it

https://github.com/ericcurtin/inferrs

Gemma 4 and TurboQuant are coming to Docker Model Runner soon. Gemma 4 is available on Docker Hub.
