Just ran llama.cpp - with Facebook's LLaMA 7B large language model - on my M2 64GB MacBook Pro!

https://github.com/ggerganov/llama.cpp


It's now possible to run a genuinely interesting large language model on a consumer laptop

I thought it would be at least another year or two before we got there, if not longer

Here are detailed notes on how I got it to work, plus some examples of prompts and their responses https://til.simonwillison.net/llms/llama-7b-m2
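The rough shape of the process, per those notes, looked like this (a sketch only — the exact script names and flags in llama.cpp have changed since March 2023, and the LLaMA weights themselves have to be obtained separately):

```shell
# Build llama.cpp from source (tool names as of early 2023; since renamed)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Convert the downloaded LLaMA 7B PyTorch weights to ggml format,
# then quantize them down to 4-bit (assumes weights are in models/7B/)
python3 convert-pth-to-ggml.py models/7B/ 1
./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin 2

# Run inference against the quantized model with a prompt
./main -m ./models/7B/ggml-model-q4_0.bin -t 8 -n 128 \
  -p 'The first man on the moon was '
```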

Thanks to an update from llama.cpp author Georgi Gerganov, I have now successfully run the 13B model on my machine too! That's the one Facebook's research claims is competitive with the original GPT-3 on benchmarks. Notes on how I did that here: https://til.simonwillison.net/llms/llama-7b-m2#user-content-running-13b

@simon I’m just diving back into AI/ML. How RAM-constrained are these models? I’ve an M1 Pro Mac with 32 GB RAM and a Linux machine with 32 GB as well. Wondering whether I should even attempt this.
@is Running the 7B model only seems to use about 4GB of RAM on my M2 MacBook Pro, so 32GB should be easily enough
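That ~4GB figure lines up with back-of-envelope arithmetic for 4-bit quantization (a sketch; the ~4.5 bits/weight figure is an approximation for llama.cpp's q4_0 format, which stores per-block scale factors alongside the 4-bit weights):

```python
# Rough RAM estimate for a 4-bit-quantized 7B-parameter model.
# Assumptions: the weights dominate memory use, and q4_0 costs roughly
# 4.5 bits per weight once scale factors are included (approximate).
params = 7_000_000_000
bits_per_weight = 4.5
gib = params * bits_per_weight / 8 / 2**30
print(f"~{gib:.1f} GiB")  # ≈ 3.7 GiB for the weights alone
```

The KV cache and activations add a bit on top, which is why observed usage comes out near 4GB rather than 3.7.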