Just ran llama.cpp - with Facebook's LLaMA 7B large language model - on my M2 64GB MacBook Pro!

https://github.com/ggerganov/llama.cpp


It's now possible to run a genuinely interesting large language model on a consumer laptop

I thought it would be at least another year or two before we got there, if not longer

Here are detailed notes on how I got it to work, plus some examples of prompts and their responses https://til.simonwillison.net/llms/llama-7b-m2

Thanks to an update from llama.cpp author Georgi Gerganov I have now successfully run the 13B model on my machine too! That's the one Facebook's research claims is competitive with the original GPT-3 on benchmarks. Notes on how I did that here: https://til.simonwillison.net/llms/llama-7b-m2#user-content-running-13b

@simon If I understand it correctly it's only the conversion and quantization step that needs a lot of RAM? Would it be possible for people to share those converted models? Are these platform dependent?
@simon thanks for posting about this! I just got the 7B model working on my i9 (after an update that fixed some issues with Intel processors, hah)
@simon Beavers Are Friends But They Are Not Friends With Cat Coffee might not have been what you were after but it is exactly what I was after
@simon I’m just diving back into AI/ML. How RAM constrained are these models? I’ve a M1 Pro Mac with 32 GB RAM and a Linux machine with 32 GB as well. Wondering whether I should even attempt this.
@is Running the 7B model only seems to use about 4GB of RAM on my M2 MacBook Pro, 32GB should be easily enough
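For context, the rough memory math behind that ~4GB figure can be sketched like this — assuming llama.cpp's 4-bit quantization and ignoring the per-block scale factors the real ggml files also store, so actual sizes run a little larger:

```python
# Back-of-envelope memory estimate for a 4-bit quantized model.
# n_params: number of model weights; bits_per_weight: quantized width.
def quantized_size_gb(n_params, bits_per_weight):
    return n_params * bits_per_weight / 8 / 1e9  # bits -> bytes -> GB

print(quantized_size_gb(7e9, 4))   # roughly 3.5 GB for LLaMA 7B
print(quantized_size_gb(13e9, 4))  # roughly 6.5 GB for LLaMA 13B
```

That's why even a 32GB machine has plenty of headroom for the 7B and 13B models once they're quantized.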
@simon So cool. Keeping my eye on this for sure.
@simon Thanks for this. I’ll be trying out your recipe on my M1 Pro Max.

@simon It lives!

"The first president of the USA was 38 years old when he became a “citizen” and then later a President.
He had been raised in New York, but his father had moved to Virginia before George Washington’s birth (1746) because there were too few farms for all the children they wanted – …”

@simon how is it at writing code?

@numist Pretty impressive considering this is the smallest of the LLaMA models (I'm running 7B but they also released 13B, 30B and 65B)

Got this result for a prompt of "def open_and_return_content(filename):"
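For reference, a straightforward hand-written version of what that prompt is fishing for (my own sketch for comparison, not the model's actual completion) would be:

```python
def open_and_return_content(filename):
    # Open the file, read its full contents, and return them as a string.
    with open(filename) as f:
        return f.read()
```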