Mastodawn

My #employer is droning on right now in a #MicrosoftTeams call for all staff about #digitalsovereignty, and then I'm supposed to do MORE with #microsoft #github #copilot. And of course EVERYTHING is becoming #ai. Even our TLD is switching from .net to .ai.
We are explicitly asked to build #ai into our daily #work, and sales is supposed to rave about AI. But only use "ours", because of the data. I need to check right away whether #localLLMs are allowed.
https://hessen.social/@Moonstone2487/116082676681677166

Hacker News May 2

Mini PC for local LLMs in 2026

https://terminalbytes.com/best-mini-pc-for-local-llm-2026/

#HackerNews #MiniPC #LocalLLMs #2026 #Technology #AI

Best mini PC for local LLMs in 2026 (Strix Halo era) | TerminalBytes

Strix Halo mini PCs doubled in price in six months. Here's what's worth buying for local LLMs in 2026, what to skip, and the 120W gotcha nobody mentions.

sammii Apr 28

Switched qwen3.5:4b from cloud to the Mac Mini and cut Spellcast API calls by ninety percent. Tinyvision tasks that once bled credits now run locally in seconds.

#SelfHosting #LocalLLMs #TinyVision #APIoptimization

Hacker News Apr 27

Running Local LLMs Offline on a Ten-Hour Flight

https://deploy.live/blog/running-local-llms-offline-on-a-ten-hour-flight/

#HackerNews #LocalLLMs #Offline #Flight #Technology #AIApplications #TravelTech

I flew from London to Google Cloud Next 2026 in Las Vegas. Ten hours with no in-flight wifi. I used the time to test how far a modern MacBook can carry engineering work on local LLMs alone. Setup: a week-old MacBook Pro M5 Max, 128GB unified memory, 40-core GPU. Gemma 4 31B and Qwen 4.6 36B via LM Studio. The top 100 most common Docker images and the top programming languages, along with enough dependencies to build functional sites with rich visualisations.

Dmitri Lerko

Nicolas Fränkel 🇪🇺🇺🇦🇬🇪 Apr 18

How to Run #LocalLLMs with #ClaudeCode

https://unsloth.ai/docs/basics/claude-code

How to Run Local LLMs with Claude Code | Unsloth Documentation

Guide to use open models with Claude Code on your local device.

sayzard Apr 17

Paul (@paulyoung)

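The post reports that, after getting macOS and Linux to work together, the author is getting ready to load models. It is a short update showing how heterogeneous systems can be connected with Exolabs and how a local model execution environment is put together.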

https://x.com/paulyoung/status/2044765172930322847

#macos #linux #localllms #exolabs #modelserving

Paul 🇦🇺 (@paulyoung) on X

Finally got MacOS and Linux to talk to each other. Now to load some models....@exolabs

X (formerly Twitter)

sayzard Apr 17

Peter Corbett (@corbett3000)

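A hands-on report of running Qwen3.5-35B-A3B-4B locally with Exolabs on an M5 plus a Mac mini M3, reaching 48.6 tok/s. There is no RDMA yet, but it shows the multi-device inference performance of Apple silicon in a local LLM setup and the potential of exolabs.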

https://x.com/corbett3000/status/2044838754335223824

#qwen #localllms #macos #m5 #exolabs

Peter Corbett (@corbett3000) on X

It's @exolabs day! Trying it out with my M5 and mac mini m3. Getting 48.6 tok/s with Qwen3.5-35B-A3B-4B. Is this good? #qwen #localllms No RDMA yet as I need a new cable.

Daniel S. Reichenbach Apr 13

Just a note for parallel universe me: flash attention is bad for #LocalLLMs

sayzard Apr 13

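Based on real-world usage data from whatcani.run (22,914,944 tokens, 4,479 runs, 191 users), this summarizes how well models run locally on an M1 Max (64GB). Measured with llama.cpp, mlx_lm, and others: 1B–4B-class models use 0.6–4.6GB of memory and rate 'runs great/well', models in the 4–13GB range rate 'runs well/ok', and 20–26B-class models (e.g. gpt-oss-20b, Gemma 26B) use 11–13GB and run only intermittently. Qwen-family and Liquid AI models did especially well in small environments.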

https://www.whatcani.run/

#macos #m1max #localllms #benchmarks #qwen
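
A rough back-of-the-envelope sketch for the memory figures quoted in the whatcani.run post above: the weights of a model take about (parameter count × bits per weight ÷ 8) bytes. The function name and the 4-bit quantization level are assumptions made for illustration; the post does not say which quantization was measured, and the estimate ignores KV cache and runtime overhead.

```python
def estimate_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes).

    Hypothetical helper: counts only the weights, not the KV cache,
    activations, or any runtime overhead.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # 4 bits per weight is an assumed quantization level, not taken from the post.
    for size in (1, 4, 20, 26):
        print(f"{size}B at 4-bit: ~{estimate_weight_gb(size, 4):.1f} GB")
```

Under that assumed 4-bit setting, a 20B to 26B model comes out at roughly 10 to 13 GB of weights, which is in the same range as the 11–13GB reported in the post.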