Local LLM App by Ente

https://ente.com/blog/ensu/

Ensu - Ente's Local LLM app

Introducing Ensu, our first step toward a private, personal LLM app that runs on your device and grows with you over time.

Maybe I’m missing it but the page is really light on technical information. Is this a quantized / distilled model of a larger LLM? Which one? How many parameters? What quantization? What T/s can I expect? What are the VRAM requirements? Etc etc
I tried it on my iPhone 13 mini. I believe the model you get changes depending on your phone specs. For me it downloaded a ~1.3GB model which can speak in complete sentences but can’t do much beyond that. Can’t blame them though—that model is tiny, and my device wasn’t designed for this.
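As a rough sanity check on that ~1.3 GB download, a quantized model's file size is approximately parameter count × bits-per-weight / 8. This is only a back-of-the-envelope sketch (it ignores metadata and the tensors that are kept at higher precision), and the model names in the comments are the ones mentioned elsewhere in this thread:

```typescript
// Rough size estimate for a quantized model file:
//   size ≈ paramCount * bitsPerWeight / 8
// Ignores file metadata and mixed-precision tensors, so real
// files run somewhat larger than this estimate.
function approxSizeGB(paramsBillions: number, bitsPerWeight: number): number {
  return (paramsBillions * 1e9 * bitsPerWeight) / 8 / 1e9;
}

console.log(approxSizeGB(1.6, 4).toFixed(2)); // e.g. a 1.6B model at 4-bit ≈ 0.80 GB
console.log(approxSizeGB(2, 8).toFixed(2));   // e.g. a 2B model at 8-bit ≈ 2.00 GB
```

So a ~1.3 GB file is in the right ballpark for a model in the 1.5B–2B range at 4–6 bits per weight, once overhead is added.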

You can see what it uses here - https://github.com/ente-io/ente/blob/main/web/apps/ensu/src/...

It's one of LFM2.5-1.6B-4bit, Qwen3.5-2B-8bit, or Qwen3.5-4B-4bit.
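Picking between those three based on device specs could be as simple as a RAM-budget tier check. This is a hypothetical sketch, not the actual provider.ts logic; the thresholds, size figures, and the 50% headroom fraction are all made-up assumptions:

```typescript
// Hypothetical model-selection sketch. The model sizes and the
// RAM-budget heuristic are assumptions, not taken from ente's code.
interface ModelOption {
  name: string;
  sizeGB: number; // approximate download size, sorted ascending
}

const MODELS: ModelOption[] = [
  { name: "LFM2.5-1.6B-4bit", sizeGB: 1.0 },
  { name: "Qwen3.5-2B-8bit", sizeGB: 2.2 },
  { name: "Qwen3.5-4B-4bit", sizeGB: 2.5 },
];

// Pick the largest model that fits within a fraction of device RAM.
function pickModel(deviceRamGB: number): ModelOption {
  const budget = deviceRamGB * 0.5; // leave headroom for the OS and app
  const affordable = MODELS.filter((m) => m.sizeGB <= budget);
  return affordable.length > 0
    ? affordable[affordable.length - 1] // largest that fits
    : MODELS[0]; // fall back to the smallest model
}

console.log(pickModel(4).name); // a 4 GB device gets the smallest model
```

That would be consistent with the iPhone 13 mini (4 GB RAM) getting the smallest download while desktops get a larger one, but again, the real selection criteria aren't documented.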

ente/web/apps/ensu/src/services/llm/provider.ts at main · ente-io/ente


Hmm, the Mac app downloaded gemma-3-4b-it-Q4_K_M.gguf for me (on an Apple M4) - maybe the desktop apps download different models?

Though, I don't see any references to Gemma at all in the open source code...

I have the same questions. After installing the app, it downloads 2.5 GB of data. I presume this is the model.