cocktail peanut (@cocktailpeanut)

A minimal web UI called DS4-WebUI has been introduced. It wraps @antirez's ds4.c server, and the author stresses that the reasoning is impressive even on the smaller model. The catch: you need an Apple Silicon Mac with at least 128GB of RAM.
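As a rough illustration of what "a minimal web UI wrapping a local inference server" can look like, here is a hypothetical Python sketch. The port, endpoint path, and JSON shape of the ds4.c server are assumptions, not its documented API:

# Hypothetical sketch of a single-file web UI in front of a local
# inference server. The endpoint and response key are assumptions.
import json
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

SERVER = "http://127.0.0.1:8080/completion"  # assumed endpoint; check ds4.c's docs

PAGE = b"""<!doctype html>
<form method="post">
  <textarea name="prompt" rows="4" cols="60"></textarea>
  <button>Ask</button>
</form>"""

class UI(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve the bare prompt form.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(PAGE)

    def do_POST(self):
        # Forward the submitted prompt to the local inference server.
        length = int(self.headers["Content-Length"])
        body = self.rfile.read(length).decode()
        prompt = urllib.parse.unquote_plus(body.split("=", 1)[1])
        req = urllib.request.Request(
            SERVER,
            data=json.dumps({"prompt": prompt}).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            answer = json.load(resp).get("content", "")  # response key is assumed
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(answer.encode())

HTTPServer(("127.0.0.1", 3000), UI).serve_forever()

Run it, open http://127.0.0.1:3000 in a browser, and the form round-trips through the local server; nothing leaves the machine.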

https://x.com/cocktailpeanut/status/2053193902694256758

#localai #webui #offline #reasoning #macos

cocktail peanut (@cocktailpeanut) on X

Introducing DS4-WebUI I made a minimal Web UI that wraps the ds4.c server from @antirez --- and the reasoning is...just watch this video (and this is the smaller model) DS4.c probably is better at reasoning than us. Caveat: you need at least 128GB RAM Apple Silicon Macs.

X (formerly Twitter)

Modly is an open-source, local-GPU-only desktop app by Lightning Pixel that generates 3D meshes from photos. Windows and Linux are supported (macOS coming soon). Its extensible model system lets you install external models such as Hunyuan3D, TripoSG, and Trellis2; installers are in Releases, or clone the repo and run from source. MIT license (attribution required); the community lives on Discord.

https://github.com/lightningpixel/modly

#ai #3d #opensource #localai #desktopapp

GitHub - lightningpixel/modly: Desktop app to generate 3D models from images using local AI — runs entirely on your GPU

Desktop app to generate 3D models from images using local AI — runs entirely on your GPU - lightningpixel/modly

GitHub

Show HN: Local AI search for your video library (local, open source)

Edit Mind is an open-source desktop app that analyzes your video library locally with AI, making every scene, object, and spoken word searchable. It supports Apple Silicon and NVIDIA GPUs, analyzes every frame of your footage, and stores the results as multimodal vectors that fuse text, visual, and audio signals, so your data stays private. The desktop app integrates with Final Cut Pro and DaVinci Resolve, letting you send clips straight to the timeline without breaking your editing flow. For video editors, it is an immediately usable tool for finding the right moment in large footage collections and speeding up editing.
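As a sketch of the general technique (not Edit Mind's actual code): embed sampled frames and a natural-language query into one vector space, then rank by cosine similarity. The frames/ directory and query below are made up; the CLIP model is the open one shipped with sentence-transformers:

# A sketch of multimodal embedding search, NOT Edit Mind's code.
# Assumes frames were already sampled from a video into frames/,
# e.g. one JPEG per second.
from pathlib import Path

from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")  # joint image/text embedding space

frame_paths = sorted(Path("frames").glob("*.jpg"))
frame_vecs = model.encode(
    [Image.open(p) for p in frame_paths],
    convert_to_tensor=True,
    normalize_embeddings=True,
)

# A natural-language query goes through the same model's text tower.
query_vec = model.encode(
    "a person opening a laptop on a desk",
    convert_to_tensor=True,
    normalize_embeddings=True,
)

# Rank frames by cosine similarity and print the top five matches.
scores = util.cos_sim(query_vec, frame_vecs)[0]
for i in scores.argsort(descending=True)[:5].tolist():
    print(frame_paths[i], round(float(scores[i]), 3))

A production tool would add transcript and audio embeddings alongside the visual ones and persist the vectors in an index instead of recomputing them per query.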

https://edit-mind.com

#localai #videosearch #opensource #aieditor #multimodal

Edit Mind - Your Video Knowledge Base and Companion

Local-first AI video knowledge base. Search every frame with natural language, send clips to your NLE, and never lose track of your footage again. Designed for video editors, content creators, and visual storytellers.

This post is an exploration of that mess, and hopefully one small step toward making the boring version possible.

https://gurupanguji.com/blog/2026/05/09/local-models-inference-incantations-and-pi-extensions/

#AI #LocalAI
2/2

Local models, inference incantations and pi extensions // @gurupanguji

Putting an API key into Pi and using a hosted model is a very boring operation. You select the provider, paste the key and then you are done thinking abou...

Ars Technica: Chrome’s 4GB AI model isn’t new, but you’re not wrong for being confused. “Some desktop Chrome users have also noted that the browser appears to suddenly want more storage space for AI. This is true—Chrome does download a 4GB AI model for on-device processing. It’s been doing that for years, though. Google hasn’t actually changed anything about Chrome’s on-device AI, […]

https://rbfirehose.com/2026/05/09/ars-technica-chromes-4gb-ai-model-isnt-new-but-youre-not-wrong-for-being-confused/

Ars Technica: Chrome’s 4GB AI model isn’t new, but you’re not wrong for being confused

Ars Technica: Chrome’s 4GB AI model isn’t new, but you’re not wrong for being confused. “Some desktop Chrome users have also noted that the browser appears to suddenly want more storage space…

ResearchBuzz: Firehose

How to Replace Siri with a Free Local Model

Explain the difference between local AI and cloud AI in simple terms

#LocalAI is processed on your device, keeping all data private.
#CloudAI is processed on a server and requires internet access.
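To make the "local" half concrete, here is a minimal sketch of asking that exact question to a model running on your own machine. It assumes an OpenAI-compatible local server such as Ollama on its default port, with a Gemma model already pulled (the gemma3 tag is an assumption; use whatever you have):

# Same prompt, answered entirely on-device. Assumes `ollama serve` is
# running locally with a Gemma model pulled; "gemma3" is an assumed tag.
import json
import urllib.request

payload = {
    "model": "gemma3",
    "messages": [{
        "role": "user",
        "content": "Explain the difference between local AI and cloud AI "
                   "in simple terms",
    }],
}
req = urllib.request.Request(
    "http://localhost:11434/v1/chat/completions",  # Ollama's OpenAI-compatible API
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# No API key header: the model runs on this machine, so nothing is sent out.
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])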

https://app.therundown.ai/guides/how-to-replace-siri-with-a-free-local-model

#LocallyAI #gemma #gemma4 #llm #ai

How to Replace Siri with a Free Local Model | Rundown Guides

https://insiderllm.com/guides/best-local-llms-translation/

An interesting, down-to-earth, practical guide to local LLM workflows, here applied to translation.

#localAI #localLLM #selfhosting #aitranslation

A $1,999 Mac mini runs a 70B parameter model that a $4,000 Windows workstation physically cannot.
The reason: Apple Silicon's unified memory. No separate VRAM pool. No PCIe bottleneck. Just one shared memory for CPU, GPU, and Neural Engine.
Full breakdown: https://www.buysellram.com/blog/why-mac-mini-is-the-surprising-frontrunner-for-local-ai-agents/

#ArtificialIntelligence #AI #LocalAI #MacMini #AppleSilicon #LLM #AIAgents #MachineLearning #EdgeAI #TechInfrastructure #DataPrivacy #Automation #AIHardware

Why Mac mini Is the Surprising Frontrunner for Local AI Agents

Why does a $1,999 Mac mini outrun a $4,000 Windows workstation for local AI agents? Apple Silicon's unified memory changes the math. A practical hardware guide for 2026.

BuySellRam

A $1,999 Mac mini vs. a $4,000 Windows workstation for local AI — and the Mac wins. Here's why.

Most people running AI today are using cloud APIs — ChatGPT, Claude, Gemini. The model lives on someone else's server, and your device barely works up a sweat. That's still the dominant model, and it works well.

But a growing number of developers and businesses are experimenting with something different: running the language model locally, on their own hardware. Always-on, private, no per-token cost.

When you go down that path, the hardware question gets interesting fast.

On a conventional Windows workstation, the CPU and GPU have separate memory pools. A 70B parameter model needs ~40 GB just to load — more than most consumer GPUs have. So you either cap out at smaller models, or spend $6,000+ on a professional GPU card.
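A quick back-of-the-envelope check on that ~40 GB figure (our sketch, not the article's): weights-only memory for 70 billion parameters at common quantization levels.

# Weights-only memory for a 70B-parameter model at common precisions.
# KV cache and activations add several more GB on top of these numbers.
PARAMS = 70e9

for name, bits in [("FP16", 16), ("8-bit", 8), ("4-bit", 4)]:
    gb = PARAMS * bits / 8 / 1e9  # total bytes, expressed in GB
    print(f"{name:>5}: {gb:6.1f} GB")

# FP16 : 140.0 GB  (far beyond any single consumer GPU)
# 8-bit:  70.0 GB  (still almost 3x a 24 GB RTX 4090)
# 4-bit:  35.0 GB  (roughly the ~40 GB cited, once runtime overhead is added)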
Apple Silicon does it differently. The M4 Pro chip puts the CPU, GPU, and Neural Engine on the same die, sharing one unified memory pool. A 48 GB Mac mini hands the full 48 GB to its GPU cores directly — no transfer, no bottleneck, no VRAM ceiling.

Result: a $1,999 Mac mini runs Llama 3.1 70B. A $3,500 Windows workstation with an RTX 4090 cannot.

We wrote up the full comparison — including where Windows workstations still win (CUDA toolchains, training, high-concurrency serving) and a practical hardware decision guide for 2026.

https://www.buysellram.com/blog/why-mac-mini-is-the-surprising-frontrunner-for-local-ai-agents/

#LocalAI #MacMini #AppleSilicon #AIAgents #LLM #AIInfrastructure #OpenSource #Ollama #AIHardware #TechForBusiness

Why Mac mini Is the Surprising Frontrunner for Local AI Agents

Why does a $1,999 Mac mini outrun a $4,000 Windows workstation for local AI agents? Apple Silicon's unified memory changes the math. A practical hardware guide for 2026.

BuySellRam