Gemma 4 on iPhone
https://apps.apple.com/nl/app/google-ai-edge-gallery/id6749645337
OP here. It is my firm belief that the only realistic use of AI in the future is either locally on-device for almost free, or in the cloud but way more expensive than it is today.
The latter option will only be used for tasks where humans are more expensive or much slower.
This Gemma 4 model gives me hope for a future Siri or similar assistant with iPhone and macOS integration, “Her” (as in the movie) style.
Did you really watch “Her” and think this is a future that should happen??
Seriously????
I don’t think OP’s point has anything to do with AI companions.
The big benefit of moving compute to edge devices is that it distributes the inference load across the grid. Powering and cooling phones is a lot easier than powering and cooling a datacenter.
> or in the cloud but way more expensive then it is today.
Why? It's widely understood that the big players are making a profit on inference. The only reason they still have losses is that training is so expensive, but you need to do that no matter whether the models are running in the cloud or on your device.
If you think about it, it's always going to be cheaper and more energy-efficient to have dedicated cloud hardware to run models. Running them on your phone, even if possible, is just going to suck up your battery life.
> It's widely understood that the big players are making profit on inference.
Are they? Or are they just saying that to make their offerings more attractive to investors?
Plus I think most people using agents for coding are on subscriptions, which are definitely not profitable.
Locally running models that are snappy and mostly as capable as current SOTA models would be a dream. No internet connection required, no payment plans, no relying on a third-party provider to do your job, no privacy concerns. Etc etc.
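To make the "no third party" part concrete, here is a minimal sketch of what that workflow looks like, assuming you already have an OpenAI-compatible local server (e.g. llama.cpp's `llama-server` or a similar local runtime) listening on localhost:8000 with some small model loaded. The port, model name, and prompt are just placeholders:

```python
# Minimal sketch: query a locally hosted model over loopback only.
# Assumes an OpenAI-compatible server is already running on localhost:8000.
import json
import urllib.request

payload = {
    "model": "local-model",  # placeholder; whatever checkpoint the server loaded
    "messages": [
        {"role": "user", "content": "Summarize why on-device inference matters."}
    ],
    "max_tokens": 128,
}

req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Everything stays on the machine: no API key, no account, no network egress.
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
    print(body["choices"][0]["message"]["content"])
```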
You can pick models that are snappy, or models that are as capable as SOTA. You don't really get both unless you spend extremely unreasonable amounts of money on what is essentially a datacenter-scale inference platform of your own, meant to service hundreds of users at once. (I don't care how many agent harnesses you spin up at once, you aren't going to get the same utilization as hundreds of concurrent users.)
This assessment might change if local AI frameworks start working seriously on support for tensor-parallel distributed inference; then you might get away with cheaper homelab-class hardware and only mildly unreasonable amounts of money.
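For what it's worth, the datacenter-side frameworks already treat this as a one-knob setting; the gap is that consumer/local stacks mostly don't. A rough sketch with vLLM, where the model name and GPU count are placeholders:

```python
# Rough sketch of tensor-parallel inference with vLLM.
# tensor_parallel_size shards each layer's weights across that many GPUs,
# so a model too big for one card can still be served from one box.
from vllm import LLM, SamplingParams

llm = LLM(
    model="some-open-model",  # placeholder; any locally available HF-format checkpoint
    tensor_parallel_size=2,   # number of GPUs to shard across
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain tensor parallelism in one paragraph."], params)
print(outputs[0].outputs[0].text)
```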
The big players are plausibly making profits on raw API calls, not subscriptions. These are quite costly compared to third-party inference from open models, but even setting that up is a hassle and you as an end user aren't getting any subsidy. Running inference locally will make a lot of sense for most light and casual users once the subsidies for subscription access cease.
Also, while datacenter-based scale-out of a model over multiple GPUs running large batches is more energy efficient, it ultimately creates a single point of failure you may wish to avoid.
> It's widely understood that the big players are making profit on inference.
This is most definitely not widely understood. We still don't know yet. There are tons of discussions where people disagree on whether it really is profitable. Unless you have proof, don't say "this is widely understood".
> It's widely understood that the big players are making profit on inference.
I love the whole “they are making money if you ignore training costs” bit. It is always great to see somebody say something like “if you look at the amount of money that they’re spending it looks bad, but if you look away it looks pretty good,” like it’s the money version of a solar eclipse.
A local model running on a phone owned and controlled by the vendor is still not really exciting, imho.
It may be physically "local" but not in spirit.