Great keynote from @martin during MoodleMoot Global 2024. He made an interesting point: right now it's hard to embed open-source AI in mobile devices. However, just a couple of days ago Meta released lightweight Llama models that can run on mobile devices. The speed at which AI is developing is amazing.

https://ai.meta.com/blog/meta-llama-quantized-lightweight-models/

#moodlemootglobal24 #ai

Introducing quantized Llama models with increased speed and a reduced memory footprint

As our first quantized models in this Llama category, these instruction-tuned models retain the quality and safety of the original 1B and 3B models, while achieving 2-4x speedup.

Meta AI
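The memory savings behind models like these are easy to sanity-check with a back-of-the-envelope calculation. The sketch below is illustrative only: it assumes nominal 1B / 3B parameter counts and plain fp16 vs 4-bit weights, whereas Meta's actual quantized checkpoints use their own schemes and also store quantization scales, so real files are somewhat larger.

```python
# Back-of-the-envelope estimate of model weight memory at different
# precisions. Parameter counts and bit widths are illustrative assumptions,
# not the exact layout of Meta's quantized checkpoints.

def weight_memory_gb(num_params: int, bits_per_param: float) -> float:
    """Approximate memory for the model weights alone, in gigabytes."""
    return num_params * bits_per_param / 8 / 1e9

for params in (1_000_000_000, 3_000_000_000):
    fp16 = weight_memory_gb(params, 16)
    int4 = weight_memory_gb(params, 4)
    print(f"{params // 10**9}B params: fp16 ~ {fp16:.1f} GB, 4-bit ~ {int4:.2f} GB")
```

At 4 bits per weight, a 1B model drops from roughly 2 GB to about 0.5 GB, which is what makes fitting it alongside a running mobile OS plausible at all.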

@cesar_s Open model inference CAN run on mobile devices, sure. In fact, I made prototypes last year!

What I was referring to, though, was the efficiency needed to make them actually usable. It's not just about generating a little text or an image; it's about having it process everything you do on that device (reading emails, messages, docs, etc., watching your screen, and so on). To get the required speed (on today's devices) without draining the battery in 20 minutes, we're still dependent on highly optimised OS models integrated by manufacturers (e.g. Apple, Samsung, Google).

So what I was saying in my keynote was that I was pausing work on an app that included its own LLM … for the next few years it's going to make a lot more sense for an app to leverage the built-in operating system LLMs.

@martin Thanks for the clarification! It is indeed hard to use on-device LLMs for now. Even quantized versions are slow at inference on regular devices. Again, awesome keynote!