Rohan Paul (@rohanpaul_ai)

Qwen 3.7 Max is rated as very close to frontier models in coding and agentic abilities, and it is now available on AI/ML API. Agent reliability is the key point, and it is cited as ranking 5th on Artificial Analysis, roughly on par with GPT-5.4 (xhigh).

https://x.com/rohanpaul_ai/status/2057541420073062575

#qwen #coding #agents #llm #benchmark

Rohan Paul (@rohanpaul_ai) on X

Qwen 3.7 Max is super close to the frontier models for coding and agentic abilities. And it’s now available on AI/ML API. Agent reliability is the center of the story, and on Artificial Analysis it's sitting at 5th, pretty much on par with GPT 5.4 (xhigh) and a notch above

X (formerly Twitter)

How our documentation department built an LLM agent for automated translation from English into other languages

We break down how the documentation department built an LLM agent for automated translation of Markdown documentation: architecture, pipeline, validation, working with Ollama, OpenWebUI and Qwen, and the strengths and limitations of the approach.

https://habr.com/ru/companies/hostkey/articles/1037646/

#LLM #автоматизация_перевода #техническая_документация #Python #валидация #Markdown #OpenWebUI #Qwen #оркестрация #hostkey

Author: Alexander Kazantsev, head of the documentation and content department. Imagine you maintain a large project with documentation in several languages. Every time the English version...

Habr
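
The post names validation as one stage of the pipeline but does not describe it, so the following is only a minimal sketch of what a structural check on a translated Markdown file might look like. It assumes that fenced code blocks, link URLs and heading levels should survive translation unchanged. The names `extract_structure` and `validate_translation` are hypothetical and are not taken from the article.

```python
import re

# A fenced block: a line starting with ``` up to the next line starting with ```.
FENCE_RE = re.compile(r"^```.*?^```", re.MULTILINE | re.DOTALL)
# Inline links of the form [text](url); only the URL is captured.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# ATX headings: one to six '#' characters at the start of a line.
HEADING_RE = re.compile(r"^(#{1,6})\s", re.MULTILINE)


def extract_structure(markdown):
    """Collect the parts of a Markdown document that translation should not change."""
    code_blocks = FENCE_RE.findall(markdown)
    # Drop code blocks first so '#' comments inside code are not counted as headings.
    prose = FENCE_RE.sub("", markdown)
    return {
        "code_blocks": code_blocks,
        "link_urls": LINK_RE.findall(prose),
        "heading_levels": [len(marks) for marks in HEADING_RE.findall(prose)],
    }


def validate_translation(source, translated):
    """Return a list of problems; an empty list means the structure matches."""
    src = extract_structure(source)
    dst = extract_structure(translated)
    return [
        f"{key} differ between source and translation"
        for key in src
        if src[key] != dst[key]
    ]
```

A check like this only catches structural damage (a translated code comment, a dropped link, a changed heading level); it says nothing about translation quality, which would need a separate review step.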

AshutoshShrivastava (@ai_for_success)

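A case where applying Multi Token Prediction (MTP) to local inference made Qwen 3.6 27B 2.5x faster. atomic.chat boosted performance on a 2x RTX 5090 setup by using a patched LLaMA.cpp to predict several tokens ahead, which makes it a direct reference for optimizing local inference without the cloud or an API.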

https://x.com/ai_for_success/status/2057304869489594771

#mtp #llamacpp #qwen #localinference #rtx5090

AshutoshShrivastava (@ai_for_success) on X

MTP is making a big difference. atomic[.]chat just hit 2.5x speed on Qwen 3.6 27B locally using Multi Token Prediction (MTP). no cloud. no API. just raw local inference getting faster. running on 2x RTX 5090, patched LLaMA[.]cpp with MTP drafting several tokens ahead and

X (formerly Twitter)
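
To illustrate why drafting several tokens ahead can speed up decoding, here is a toy draft-and-verify loop. It is a sketch under stated assumptions, not llama.cpp's implementation: the "models" are plain Python functions (`toy_target`, `toy_draft`, both hypothetical), decoding is greedy, and the single batched verification pass of a real system is simulated by per-position calls that are counted as one pass.

```python
def greedy_decode(target_next, prompt, n_new):
    """Baseline: one target-model pass per generated token."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(target_next(tokens))
    return tokens


def draft_and_verify(target_next, draft_next, prompt, n_new, k):
    """Greedy draft-and-verify decoding. Returns (tokens, verify_passes)."""
    tokens = list(prompt)
    passes = 0
    while len(tokens) - len(prompt) < n_new:
        # Cheaply draft k tokens ahead.
        draft = []
        for _ in range(k):
            draft.append(draft_next(tokens + draft))
        # One verification pass: a real system scores all draft positions
        # in a single forward pass; here that is simulated position by position.
        passes += 1
        accepted = []
        for tok in draft:
            expected = target_next(tokens + accepted)
            if tok != expected:
                # First mismatch: keep the target's token and discard the rest.
                accepted.append(expected)
                break
            accepted.append(tok)
        else:
            # Every draft token matched; the same pass yields one extra token.
            accepted.append(target_next(tokens + accepted))
        tokens.extend(accepted)
    return tokens[: len(prompt) + n_new], passes


def toy_target(ctx):
    """Hypothetical 'large model': counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx):
    """Hypothetical drafter: agrees with the target except after some tokens."""
    return 0 if ctx[-1] % 3 == 2 else (ctx[-1] + 1) % 10
```

Each verification pass emits between 1 and `k + 1` tokens, and the output is identical to plain greedy decoding with the target alone. The speedup therefore depends on how often the drafts are accepted, which is why reported gains such as 2.5x vary with model, hardware and workload.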

RT @loktar00: My timeline today is card after card lighting up with MTP.... 4070 Ti Super at 86 t/s, 5090 at 110, 7900 XTX humming along on Vulkan, all running Qwen 3.6 27B or 35B-A3B since llama.cpp merged the PR.

more at Arint.info

#AI #Hardware #LLM #Performance #Qwen #Software #arint_info

https://x.com/loktar00/status/2056384296319930717#m

Arint - SEO+KI (@[email protected])

Mastodon Glitch Edition

NobodyWho now supports #Swift 🎉

Run #LLMs fully on-device in your #iOS, #macOS, #watchOS & #visionOS apps. No internet. No API keys. No usage fees.

#Gemma4, #Qwen & more (.gguf)
→ Hardware acceleration
→ Tool calling, #RAG, vision & audio ingestion
→ Open-source & free

https://github.com/nobodywho-ooo/nobodywho

I wonder if using a "dumber" local AI model might help mitigate the cognitive decline some researchers are starting to observe.

I am currently running Qwen 3.6 26B via Llama.cpp and it's obviously not as good as Claude or Gemini. It requires some hand-holding. But that doesn't mean it's useless. It can still provide insights, but it's up to you to steer it to get you the results you desire.

#llm #ai #qwen #localai

Censorship in Qwen: the circuit that hides Tiananmen

How does Qwen hide information about Tiananmen? A study revealed the exact circuit for AI political censorship in its weights, and how to (almost) disable it.

https://blog.donweb.com/censura-politica-ia-modelos-lenguaje-qwen/

#qwen #censuraia #interpretabilidadmecanistica #modelosdelenguaje #alianzaiachina

AI political censorship: how Qwen hides it in its weights

Blog Donweb

金のニワトリ (@gosrum)

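A note that Qwen3.7 has been released. It gives no specific performance figures or changes, but as news of the latest model in the Qwen family it is a useful update for developers tracking local and open-weight LLMs.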

https://x.com/gosrum/status/2056507655422923086

#qwen #llm #modelrelease #openweights #ai

金のニワトリ (@gosrum) on X

It's not local yet, but Qwen3.7 has arrived https://t.co/x9LH0wtkGc

X (formerly Twitter)

qwant news | Claude is still the best agentic coding tool, but Anthropic's tightening grip is the best argument yet for going local

AI-generated summary. Read the full article for complete information.

Claude remains the premier agentic coding tool, yet Anthropic’s recent rollout of tighter limits—weekly usage caps, extended data‑retention policies, reduced prompt‑cache lifetimes, hidden server‑side setting changes, and the disabling of third‑party harnesses—has turned its cloud‑based service into a moving target that developers can’t control. This volatility highlights a fundamental risk of relying on hosted AI: the provider can alter access, pricing, and capabilities at any time. At the same time, open‑weight models such as Gemma 4, Alibaba’s Qwen 3.6‑27B/35B, and Zyphra’s ZAYA1‑8B have matured to the point where they run on consumer‑grade hardware and deliver competitive performance for many coding tasks, albeit with higher upfront hardware costs and setup effort. While Claude will likely stay the best‑performing option for complex, multi‑step work, the growing capabilities of local models make them a practical, stable alternative for everyday development, underscoring the argument for moving toward self‑hosted AI.

Read more: https://www.xda-developers.com/claude-still-best-agentic-coding-tool-anthropic-tightening-best-argument-local/

#Claude #Anthropic #Gemma #Qwen #GPU

Anthropic has been tightening its grip on Claude for a long time now, and local models are finally getting good.

XDA

RT @KuittinenPetri: Many people say the Nvidia DGX Spark is too slow and not worth the money. I'm getting crazy speeds on my ASUS Ascent GX10 with qwen3.5-35b-a3b-nvfp4: over 200 tokens/s, 495k prefill. Real-world performance is lower, but still over 100 tokens/s. sparkrun run @atlas/qwen3.5-35b-a3b-nvfp4

more at Arint.info

#AI #ASUS #GPU #Nvidia #Performance #Qwen #arint_info

https://x.com/KuittinenPetri/status/2056273965828448284#m

Arint - SEO+KI (@[email protected])

Mastodon Glitch Edition