Based on real-world usage data from whatcani.run (22,914,944 tokens, 4,479 runs, 191 users), here is a summary of which models can run locally on an M1 Max (64GB). Measured with llama.cpp, mlx_lm, and similar tools: 1B–4B models use 0.6–4.6GB of memory and 'run great/well'; models needing 4–13GB 'run well/ok'; the 20–26B class (e.g., gpt-oss-20b, Gemma 26B) takes 11–13GB and runs only intermittently. The Qwen family and Liquid AI models were especially strong in the small-model range.
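For anyone wanting to reproduce a data point, here is a minimal sketch of an mlx_lm run on Apple Silicon. The model repo below is an illustrative choice, not necessarily one of the models measured above:

```python
# Minimal mlx_lm generation sketch (Apple Silicon only).
# Assumes: pip install mlx-lm. The Qwen repo below is an
# illustrative pick, not a model from the dataset above.
from mlx_lm import load, generate

# Downloads the quantized weights from Hugging Face on first run.
model, tokenizer = load("mlx-community/Qwen2.5-3B-Instruct-4bit")

# verbose=True prints tokens-per-second and peak memory, which is
# roughly how per-model memory figures like the ones above are read off.
text = generate(
    model,
    tokenizer,
    prompt="Write a haiku about local inference.",
    max_tokens=128,
    verbose=True,
)
print(text)
```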
So I have been trying the new #Gemma4 models on my M1 MacBook Pro, specifically gemma4:26b, which is 17GB in size.
Obviously not the most demanding coding tasks, but...
Much, much faster response times than local models from 6-12 months ago. Previously, Qwen, DeepSeek, and even Gemma3 simply took too long to be practical.
I find it incredible this can run on just my 5.5-year-old laptop.
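Since the point here is response time, a quick way to put a number on it is to time a single prompt. A minimal sketch, assuming the model is served through Ollama (the gemma4:26b tag format suggests that, but it is an assumption) and the ollama Python package is installed:

```python
# Rough latency check for a locally served model via Ollama.
# Assumes: pip install ollama, the Ollama daemon is running, and a
# model tagged "gemma4:26b" (as in the post) has been pulled.
# Note: the first request after startup also pays model-load time.
import time

import ollama

start = time.perf_counter()
response = ollama.generate(
    model="gemma4:26b",
    prompt="Write a Python function that reverses a string.",
)
elapsed = time.perf_counter() - start

print(response["response"])
print(f"Total wall-clock time: {elapsed:.1f}s")
```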
Just so we are clear: #LocalLLMs are an asset if trained and used well. But please be aware that many projects present themselves as open source while their releases contain closed-source components, and it's not transparent what is going on.
Go to the source: llama.cpp, PyTorch, etc.
If you are running #LocalLLMs you may be using LM Studio. Just a fair warning... while it's practical, it also proxies everything through their infrastructure. It's a privacy nightmare.
Find out which AI models your machine can actually run.
CanIRun.ai — Can your machine run AI models? https://www.canirun.ai/
ht @researchbuzz.bsky.social
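Tools like this mostly apply simple arithmetic: weight memory is roughly parameter count times bytes per weight for the chosen quantization, plus overhead for the KV cache and runtime. A back-of-the-envelope sketch; the bytes-per-weight values are standard for these formats, but the 20% overhead factor is a loose assumption, not a figure from the site:

```python
# Back-of-the-envelope memory estimate for a quantized local model.
# The 20% overhead for KV cache and runtime buffers is a rough
# assumption, not a measured figure.
BYTES_PER_WEIGHT = {
    "F16": 2.0,
    "Q8_0": 1.0,     # ~8 bits per weight
    "Q4_K_M": 0.56,  # ~4.5 bits per weight, a common GGUF default
}

def estimate_gb(params_billion: float, quant: str, overhead: float = 1.2) -> float:
    """Approximate resident memory in GB for a quantized model."""
    weights_gb = params_billion * BYTES_PER_WEIGHT[quant]
    return weights_gb * overhead

# e.g. a 4B model at Q4_K_M comes out around 2.7GB, comfortably
# inside the 0.6-4.6GB band reported above for 1B-4B models.
print(f"{estimate_gb(4, 'Q4_K_M'):.1f} GB")
```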
This week on The Servitor:
Getting set up for local roleplay with Silly Tavern! These folks take their roleplay companion AI seriously.
https://mastodon.social/@silentexception/116073910238301254
Is there any effort to pool the various initiatives across European academia and public research to come up with a common #ecosystem of #europeandata and #LocalLLMs?
Curious if others have found different winners or have tips I missed.