Ivan Fioravanti ᯅ (@ivanfioravanti)

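Through a PR to vllm-metal, the author was able to run Qwen3-0.6B tests with up to 32K context on an M3 Ultra. He notes that mlx-lm still performs better, that the TTFT figure reported there is not correct, and that improvements are in progress. The heatmap and Prefill/Decode charts have also been updated.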

https://x.com/ivanfioravanti/status/2036156678035419263

#vllm #mlxlm #qwen3 #llm #opensource

Ivan Fioravanti ᯅ (@ivanfioravanti) on X

I did my first PR on vllm-metal and I was able to run tests with up to 32K context on M3 Ultra with Qwen3-0.6B. mlx-lm is still the winner and TTFT is not the correct one there, improvement is WIP. Mega heatmap and Prefill/Decode charts updated. ctx rows leverages caching.

X (formerly Twitter)
Running a local #qwen3 model over #opencode produced the below vibe coding results - an Ansible reference. It's not much but it was honest work :) https://git.wtf.lt/simonas/ansible/src/branch/master/README.md

Petal for #macOS: a simple dictation app for the menu bar with #Qwen3, #Parakeet, and #Voxtral. #OpenSource, with no cloud and no subscription 🎙️🖥️🔓 https://tchgdns.de/?p=162724
HomeSec-Bench — Local AI vs Cloud Benchmark | SharpAI Aegis

Qwen3.5-9B scores 93.8% on 96 real security AI tests — within 4 points of GPT-5.4 — running entirely on Apple Silicon. Full benchmark results and methodology.

This week on #openSUSE Planet, find out about a new #Cockpit launcher that simplifies #sysadmin tasks, Cavil #Qwen3.5-4B brings #AI-powered #legal compliance, and #LogAI lets you query system logs for overnight answers. #Linux https://news.opensuse.org/2026/03/20/planet-roundup/
Planet News Roundup

This is a roundup of articles from the openSUSE community listed on planet.opensuse.org. The community blog feed aggregator lists the featured highlights bel...

openSUSE News

Sebastian Raschka (@rasbt)

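Mamba-3 has been released. The author considers the recent transformer-attention hybrid architectures (Qwen3.5, Kimi Linear, etc.) the most interesting use case for Mamba and similar models, and suggests an experiment for next-generation hybrids: swapping Gated DeltaNet for Mamba-3, which now also includes RoPE.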

https://x.com/rasbt/status/2034088726997893168

#mamba3 #transformer #qwen3.5 #gateddeltanet #rope

Sebastian Raschka (@rasbt) on X

Oh wow, Mamba-3 is here! For me, the most interesting use case of Mamba and Mamba-likes are the recent transformer attention hybrid architectures (Qwen3.5, Kimi Linear, etc.) Would be interesting to swap Gated DeltaNet with Mamba-3 (which now also has RoPE) in next gen hybrids.


Prince Canuma (@Prince_Canuma)

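mlx-embeddings v0.1.0 has been released. New models include Alibaba's Qwen3 VL Embedding and Reranker, plus ColDefics3 (with LoRA adapters and a ColVision processor). The release adds NVFP4, MXFP4, and MXFP8 quantization support and includes an embedding quality fix for Gemma3 bidirectional models.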

https://x.com/Prince_Canuma/status/2032890809847029896

#mlxembeddings #embeddings #qwen3 #coldefics3 #quantization

Prince Canuma (@Prince_Canuma) on X

mlx-embeddings v0.1.0 is out! 🔥 New models: → Qwen3 VL Embedding and Reranker by @Alibaba_Qwen → ColDefics3 with LoRA adapters & ColVision processor b New features: → NVFP4, MXFP4 and MXFP8 quantization support → Gemma3 embedding quality fix for bidirectional models →

Here's an excerpt of the #Qwen3.5 35B summary of Robin Wright after I made it correct everything:
"Robin Wright is an acclaimed American actress, producer, and director. Born in 1966, she gained early fame as Princess Buttercup in The Princess Bride (1987) and earned a Golden Globe nomination for Forrest Gump (1994). She achieved global acclaim as Claire Underwood in the #Netflix series House of Cards (2013–2018), winning a Golden Globe .."
Not a word about the Santa Barbara #TV show 🤷‍♂️
#80s

New update for the slides of my talk "Run LLMs Locally":

Now including Reranking, Qwen 3.5 (slower than Qwen 3, but includes Vision) and loading models with Direct I/O.

https://codeberg.org/thbley/talks/raw/branch/main/Run_LLMs_Locally_2025_ThomasBley.pdf

#llm #llamacpp #ollama #stablediffusion #gptoss #qwen3 #glm #opencode #localai #mcp

Old vs. new MBP local LLM showdown | M2 Max vs M5 Max | Stress-testing large LLMs
When the M5 Max MBP, Apple's most powerful mobile platform of 2026, meets a locally deployed Qwen 3. […]
#unwire TV #AI #Apple #M2 Max
https://unwire.hk/2026/03/13/m5-max-macbook-pro/unwire_podcast/?utm_source=rss&utm_medium=rss&utm_campaign=m5-max-macbook-pro