Mastodawn

anemll 5d ago

iPhone 17 Pro Demonstrated Running a 400B LLM

https://twitter.com/anemll/status/2035901335984611412

Anemll (@anemll) on X

Running 400B model on iPhone! 0.6 t/s Credit @danveloper @alexintosh @danpacary @anemll

_air 5d ago

This is awesome! How far away are we from a model of this capability level running at 100 t/s? It's unclear to me if we'll see it from miniaturization first or from hardware gains

Tade0

The only way for hardware to reach this sort of efficiency is to embed the model in the hardware itself.

This exists[0], but the chip in question is physically large and won't fit in a phone.

[0] https://www.anuragk.com/blog/posts/Taalas.html

How Taalas "prints" LLM onto a chip? - Anurag's Blog