Jun Kim (@jundotkim)

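oMLX 0.3.9rc1 has been released. The highlights: better stability on low-memory Macs so the OS no longer kills the process, DFlash updated to v0.1.7, Qwen thinking/GDN fixes, and chunked prefill, which keeps a long prompt from blocking decode for other requests. A practical update for anyone running local LLMs on MLX.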

https://x.com/jundotkim/status/2056666955466383521

#omlx #mlx #localllm #mac #inference
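To make the chunked prefill point concrete, here is a toy scheduling model. It is not oMLX's scheduler: the `Request` class, the `schedule` function, the one-prefill-chunk-per-step rule, and the token counts are all assumptions made for illustration. It only shows why capping the prefill work per step bounds how long an already-decoding request waits for its next token.

```python
from dataclasses import dataclass, replace


@dataclass
class Request:
    name: str
    prompt_tokens: int  # prompt tokens still waiting for prefill
    decode_tokens: int  # output tokens still to generate


def schedule(requests, chunk_size):
    """Simulate batched steps (hypothetical model, not oMLX's code).

    Returns a list of steps; each step is a list of
    (request name, phase, tokens processed) tuples.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    pending = [replace(r) for r in requests]  # copy so inputs stay reusable
    steps = []
    while any(r.prompt_tokens or r.decode_tokens for r in pending):
        step = []
        prefill_used = False  # assumption: at most one prefill chunk per step
        for r in pending:
            if r.prompt_tokens > 0:
                if not prefill_used:
                    n = min(chunk_size, r.prompt_tokens)
                    r.prompt_tokens -= n
                    step.append((r.name, "prefill", n))
                    prefill_used = True
            elif r.decode_tokens > 0:
                r.decode_tokens -= 1
                step.append((r.name, "decode", 1))
        steps.append(step)
    return steps


def worst_step(steps):
    """Largest number of tokens processed in any single step."""
    return max(sum(n for _, _, n in step) for step in steps)


# One request is mid-generation when a long prompt arrives.
requests = [Request("chat", 0, 5), Request("long_doc", 8192, 2)]

unchunked = schedule(requests, chunk_size=8192)  # whole prompt in one step
chunked = schedule(requests, chunk_size=512)     # prompt split into chunks

print(worst_step(unchunked))  # 8193
print(worst_step(chunked))    # 513
```

In this model the total work is the same either way. What changes is the largest single step: the "chat" request shares a step with 8192 prefill tokens in the unchunked run, and with at most 512 in the chunked run.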

Jun Kim (@jundotkim) on X

oMLX 0.3.9rc1 released. Highlights:
- Low-memory Macs stay stable instead of getting killed by the OS
- DFlash bumped to v0.1.7 (thanks to @bstnxbt's dflash-mlx). Qwen thinking/GDN fix, etc.
- Chunked prefill. A long prompt no longer blocks decode for everyone else
-


Did you know local LLMs can cost up to $25,000 annually in hidden expenses? The truth isn't in the code — it's in the bills. Full piece: https://jindoprompt.com/lock/how-to-run-local-llms-with-zero-data-leaks-1

#ai #localllm #privacy

12 tokens/sec on a 2024 MacBook Air. That's faster than you can read this sentence. Local AI is not the slow option anymore — that was 2023. The new constraint isn't speed. It's whether you trust your laptop more than you trust a server farm in Iowa.

#ai #localllm #privacy

LockStack installs in 90 seconds. No Docker. No Python. No CUDA drivers. Double-click the .exe, pick a folder, you're generating local AI content. The setup is the boring part — and we made it short on purpose.

#ai #localllm #productivity

What runs LockStack on your laptop: 8GB RAM (16GB comfy), 4-core CPU from 2018+, 5GB disk, no GPU required. 12 tokens/sec on a 2024 MacBook Air — faster than you can read. Llama 3.2 3B Q8, 100% on-device. lockstack.net

#ai #localllm #llama

Honest take: local AI isn't always the answer. GPT-4 reasoning → cloud wins. 200K-context legal review → cloud wins. Marketing copy on NDA'd decks → local wins. Drafts referencing private client info → local wins. LockStack is built for the second list.

#ai #localllm #privacy

Cancelled $80/mo in AI subs last month. The math: 3 ChatGPT Plus = $60, 1 Claude Pro = $20. Replaced with LockStack at $147 once. Llama 3.2 3B Q8 handles marketing copy fine. Pays back month 2. Local AI isn't always cheaper — but when it is, it really is.

#ai #localllm #saas
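The payback claim in the post above can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures quoted in the post ($20 per subscription, $147 one-time); the function name `payback_month` is made up for illustration, and the first-year figure assumes subscription prices stay flat.

```python
import math


def payback_month(one_time_cost, monthly_savings):
    """First month in which cumulative savings cover the one-time cost."""
    if monthly_savings <= 0:
        raise ValueError("monthly_savings must be positive")
    return math.ceil(one_time_cost / monthly_savings)


# Figures quoted in the post: 3 ChatGPT Plus + 1 Claude Pro at $20 each.
monthly_subs = 3 * 20 + 1 * 20   # 80 dollars per month
license_cost = 147               # one-time price quoted in the post

month = payback_month(license_cost, monthly_subs)
first_year_net = 12 * monthly_subs - license_cost  # assumes flat prices

print(month)           # 2
print(first_year_net)  # 813
```

Since 147 / 80 is about 1.84, the purchase is covered partway through the second month, which matches "pays back month 2".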
