Local LLM App by Ente

https://ente.com/blog/ensu/

Ensu - Ente's Local LLM app

Introducing Ensu, our first step toward a private, personal LLM app that runs on your device and grows with you over time.

Maybe I’m missing it but the page is really light on technical information. Is this a quantized / distilled model of a larger LLM? Which one? How many parameters? What quantization? What T/s can I expect? What are the VRAM requirements? Etc etc
I tried it on my iPhone 13 mini. I believe the model you get changes depending on your phone specs. For me it downloaded a ~1.3GB model which can speak in complete sentences but can’t do much beyond that. Can’t blame them though—that model is tiny, and my device wasn’t designed for this.

You can see what it uses here - https://github.com/ente-io/ente/blob/main/web/apps/ensu/src/...

Either LFM2.5-1.6B-4bit or Qwen3.5-2B-8bit or Qwen3.5-4B-4bit
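For a rough sense of what those names imply for download size, here is a back-of-the-envelope sketch. It assumes the naming scheme means "N billion parameters at k bits per weight", which the thread does not confirm, and it counts weights only. Real model files tend to be larger, since some tensors are usually kept at higher precision and quantization adds per-block metadata, so treat the numbers as lower bounds.

```python
import re

# Model names quoted in the thread. Assumption: "<family>-<N>B-<k>bit" means
# N billion parameters stored at k bits per weight.
MODELS = ["LFM2.5-1.6B-4bit", "Qwen3.5-2B-8bit", "Qwen3.5-4B-4bit"]


def weights_gb(name: str) -> float:
    """Lower-bound weight size in GB (10^9 bytes): params * bits / 8."""
    match = re.search(r"-(\d+(?:\.\d+)?)B-(\d+)bit$", name)
    if match is None:
        raise ValueError(f"unrecognized model name: {name}")
    params_billion = float(match.group(1))
    bits_per_weight = int(match.group(2))
    # billions of params * bits / 8 bits-per-byte = billions of bytes
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    for model in MODELS:
        print(f"{model}: at least ~{weights_gb(model):.1f} GB of weights")
```

This does not answer the T/s or memory questions above, which also depend on context length, KV cache size, and the runtime.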

ente/web/apps/ensu/src/services/llm/provider.ts at main · ente-io/ente


Hmm, the Mac app downloaded gemma-3-4b-it-Q4_K_M.gguf for me (on an Apple M4) - maybe the desktop apps download different models?

Though, I don't see any references to Gemma at all in the open source code...

I have the same questions. After installing the app, it downloads 2.5 GB of data. I presume this is the model.

There are dozens of local inference apps that basically wrap llama.cpp and someone else's GGUFs. The decentralized sync history part seems new? Not much else. But the advertising copy is insufferably annoying in how it presents this wrapper as a product.

Have a comparison chart to Ollama, LMStudio, LocalAI, Exo, Jan.AI, GPT4ALL, PocketPal, etc.

There are so many wrappers that are obviously wrappers. I wonder if part of the value proposition here is that it is “like a product.” I have no idea if they actually achieve that, though, and doubt it really could be proven on a site.

Given how the blog is presented, I assumed this was something novel that solved a unique problem, maybe a local multi-modal assistant for your device.

I installed it and it's none of that. It is a mere wrapper around small local LLM models. And, it's not even multi-modal! Anyone could've one-shotted this in Claude in an hour (I'm not exaggerating).

What's the target audience here? Your average person doesn't care about the privacy value proposition (at least not at the cost of severely sacrificing the chat model's quality). And users who do want that control can already install LMStudio/Llama.cpp (which is dead simple to set up).

The actual released product should've been what's described in the "What's next" section.

> Instead of general chat, we shape Ensu to have a more specialized interface, say like a single, never-ending note you keep writing on, while the LLM offers suggestions, critiques, reminders, context, alternatives, viewpoints, quotes. A second brain, if you will.

> A more utilitarian take, say like an Android Launcher, where the LLM is an implementation detail behind an existing interaction that people are already used to.

> Your agent, running on your phone. No setup, no management, no manual backups. An LLM that grows with you, remembers you, your choices, manages your tasks, and has long-term memory and personality.

> Anyone could've one-shotted this in Claude in an hour (I'm not exaggerating).

This probably could have been one-shotted with Sonnet, not even Opus. Given how over-indexed they are on LLM coding, Haiku might even be able to do it.

This is actually an interesting coding model benchmark task now that I think about it.

I would love to see a "distributed LLM" system, where people can easily set up a system to perform a "piece" of a "mega model" inference or training. Kind of like SETI@home but for an open LLM (like https://github.com/evilsocket/cake but massive).
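One way to picture the "piece" each participant would run is a contiguous block of the model's layers, sized to what that machine can hold. The sketch below is purely illustrative: the node names, the memory figures, and the idea of splitting by layer count in proportion to memory are all assumptions, not a description of how any existing project does it.

```python
def shard_layers(num_layers: int, node_memory_gb: dict[str, float]) -> dict[str, range]:
    """Assign contiguous layer ranges to nodes in proportion to their memory.

    Hypothetical helper: a real system would also account for layer sizes,
    network latency, and nodes joining or leaving.
    """
    total_memory = sum(node_memory_gb.values())
    if num_layers <= 0 or total_memory <= 0:
        raise ValueError("need at least one layer and some memory")

    shards: dict[str, range] = {}
    nodes = list(node_memory_gb.items())
    start = 0
    for index, (node, memory) in enumerate(nodes):
        if index == len(nodes) - 1:
            end = num_layers  # last node takes whatever is left
        else:
            end = min(num_layers, start + round(num_layers * memory / total_memory))
        shards[node] = range(start, end)
        start = end
    return shards


if __name__ == "__main__":
    plan = shard_layers(32, {"laptop": 8, "desktop": 16, "phone": 8})
    for node, layers in plan.items():
        print(f"{node}: layers {layers.start}..{layers.stop - 1}")
```

Every token still has to pass through all shards in order, so latency between participants, rather than raw GPU count, is likely the hard part of making this work at SETI@home scale.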

Ideally if you "participate" in the network, you would get "credits" to use it proportionally to how much GPU power you have provided to the network. Or if you can't, then buy credits (payment would be distributed as credits to other participants).
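The credit idea reduces to a proportional split. Here is a minimal sketch, assuming contribution is measured in GPU-hours and that the same rule is used both for handing out usage credits and for passing a buyer's payment on to participants. The function name, the unit, and the example numbers are made up for illustration.

```python
def allocate_credits(gpu_hours: dict[str, float], pool: float) -> dict[str, float]:
    """Split `pool` credits among participants in proportion to GPU-hours given.

    Hypothetical accounting rule; a real network would also need to verify
    that the reported work was actually done.
    """
    total_hours = sum(gpu_hours.values())
    if total_hours <= 0:
        # Nobody contributed, so nobody earns anything.
        return {name: 0.0 for name in gpu_hours}
    return {name: pool * hours / total_hours for name, hours in gpu_hours.items()}


if __name__ == "__main__":
    contributions = {"alice": 30.0, "bob": 10.0}

    # Credits earned for an epoch in which the network mints 100 credits.
    print(allocate_credits(contributions, pool=100.0))

    # A non-participant buys 20 credits; the payment is split the same way.
    print(allocate_credits(contributions, pool=20.0))
```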

That way we could build huge LLMs that are really open and are not owned by any network.

I would LOVE to participate in building that as well.

Oh yeah, and maybe call it "SkyNet" or something.

As someone who saw this and was interested, but also skeptical that it might be low effort: are there other open projects for running small models locally on Android / iOS?

I've found https://github.com/alichherawalla/off-grid-mobile-ai but haven't tried anything in this space yet.

GitHub - alichherawalla/off-grid-mobile-ai: The Swiss Army Knife of Offline AI. Chat, Speak, and Generate Images - Privacy First, Zero Internet. Download an LLM and use it on your mobile device. No data ever leaves your phone. Supports text-to-text, vision, text-to-image
