It's amusing that #Qwen35 has been particularly sensitive to dates set in "the future" of its training set. It even described a bunch of recent MCU movies referenced in one particular article as "imaginary" and "just for fun". 😏

#llm #ai

That's an answer I would never have thought of in a million years, yet it's as plain as the nose on your face.

 > Why is Monday so far from Friday, and Friday so close to Monday?
🤖 > Emotional weight

#weights #AI #Qwen35 #working #anxiety

Alibaba releases four lightweight "Qwen 3.5" AI models; even Elon Musk praises them as "phenomenal"

https://fed.brid.gy/r/https://36kr.jp/461736/

It seems 27b is the model with the best balance of speed and performance; it can run on a GPU with 24 GB of VRAM.

https://www.reddit.com/r/LocalLLaMA/comments/1ro7xve/qwen35_family_comparison_on_shared_benchmarks/

#Qwen35

After all the hype around the Qwen 3.5 release, I ran a test: developing a POC for a client in the "log collection" space.
Long story short: I had it produce a .md document covering the whole POC, and then I tested it.

Outcome:
- plenty of errors
- instructions ignored
- it invents commands despite having read the official docs
- hundreds of reiterations

IMHO it may well run on anything, "as some say", but I lose too much time constantly correcting it.

#qwen35 #ia #logcollection #uno

It's quite obvious this Qwen3.5 release targets specific hardware: GPUs with 8 GB of VRAM, the AMD Strix Halo CPU, and Apple M chips, for example.

#qwen35

On X, people saw a five-word post from Junyang Lin, the man who built Qwen from the ground up: "bye my beloved qwen."

That was it. No explanation, just a goodbye.

Within hours the replies were flooding in. Developers, researchers, open-source contributors, all asking the same thing: what just happened?

#qwen35 #alibaba #ai #qwen

https://firethering.com/qwen-core-team-resigned-future-of-qwen-models/

Just After Launching Qwen3.5, Qwen's Core Team Walked Out. Is This the Last Great Qwen Model?

Yesterday I was testing Qwen3.5-4B on my machine, genuinely impressed by what a 4B model was doing with images and reasoning. Then I opened X and saw a five-word post from Junyang Lin, the man who built Qwen from the ground up: "bye my beloved qwen." That was it. No explanation, no drama, just a goodbye.

Within hours the replies were flooding in. Developers, researchers, open-source contributors, all asking the same thing: what just happened? And then Elon Musk's comment on Qwen3.5 calling it "impressive intelligence density" surfaced, and Lin replied with a simple "thx elon." People in the comments started connecting the dots: was he already gone when he replied? Did he know? Nobody is quite sure what to make of that exchange, but it made the whole thing feel even stranger.

Lin wasn't alone. Yu Bowen, who led post-training for Qwen, resigned the same day. Hui Binyuan, a core contributor focused on coding, had already left in January. Three of the most important people behind one of the best open-source AI model families in the world, gone within months of each other.

I had just tested the model. I had just written about why it was worth your attention. And now the people who built it had walked out.

Firethering
https://firethering.com/qwen3-5-4b-local-ai-model/
Alibaba just dropped #Qwen35, and the 4B version is the one worth paying attention to. It thinks before it answers, reads images and video, handles 201 languages, and sits on a context window of 262,144 tokens, longer than most models ten times its size. #opensource
Qwen3.5-4B: The Small AI Model That Thinks, Sees, and Runs on Your Machine

Most small AI models are a compromise. You give up reasoning for size, or vision for speed. Qwen3.5-4B doesn't seem to have gotten that memo. Alibaba just dropped Qwen3.5, and the 4B version is the one worth paying attention to. It thinks before it answers, reads images and video, handles 201 languages, and sits on a context window of 262,144 tokens, longer than most models ten times its size. All of that in something small enough to run on your own machine. I went through the benchmarks and architecture docs so you don't have to. Here's what actually matters.


It doesn't work out of the box: getting fresh large LLMs running

Lately, open super-large models have multiplied at an incredible rate, and not just models but vendors too. Variants of GLM, Kimi, and DeepSeek each occupy several rows in the top 5-10-20. I needed to go through the main LLMs to test and pick a "workhorse", which meant a bit of digging around the internet. I'm leaving this here as a note in case it's useful to someone. Everything was done on top of the vllm-openai images, B200/H200 platforms, and driver 590.48.01. When the experiments started, roughly two weeks ago, vLLM 0.16 wasn't out yet, but as it turned out, that didn't change the situation much. The main workarounds stayed the same; it's just that image customization is no longer needed for every model. There's no rocket science in any of this, of course (especially once you've read the Chinese forums for the nuances). But if someone had sat down beforehand and collected the tips in one place, life would have been a bit easier )) so I'm sharing. Here we go.

https://habr.com/ru/articles/1006202/

#KimiK25 #DeepSeekv32 #GLM5 #Qwen35 #vllm #B200 #H200
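The kind of setup the post describes can be sketched roughly like this, assuming the stock `vllm/vllm-openai` image from the vLLM docs; the model id `Qwen/Qwen3.5-27B` and the parallelism/context values here are placeholders, not confirmed settings from the article:

```shell
# Serve a model with an OpenAI-compatible API via the vllm-openai image.
# --ipc=host gives PyTorch enough shared memory; mounting the HF cache
# avoids re-downloading weights on every container start.
docker run --runtime nvidia --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -p 8000:8000 \
  --ipc=host \
  vllm/vllm-openai:latest \
  --model Qwen/Qwen3.5-27B \
  --tensor-parallel-size 2 \
  --max-model-len 32768
```

Once the server is up, any OpenAI-compatible client can point at `http://localhost:8000/v1`.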

After half a day with it: Qwen3.5 9b's capability is close to the old GPT-4o, and the programs it writes are entirely usable. Crucially, it runs on a GPU with 8 GB of VRAM; writing code and writing articles are both no problem. The 4b is like previous locally deployed LLMs: usable, but you have to run it a few more times, keep each output short, and proceed step by step. I now use 4b for skills, or for jobs that only need a small amount of output, and 9b as an ordinary, slightly dumber AI.

#Qwen35
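The VRAM claims in posts like these can be sanity-checked with back-of-the-envelope arithmetic: weight memory is roughly parameter count times bytes per weight. This is an illustrative estimate only; real usage adds KV cache and runtime overhead on top, and quantized formats carry some metadata.

```python
def weight_vram_gb(params_billion: float, bits_per_weight: int) -> float:
    """Estimate GiB needed to hold model weights alone at a given precision."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 2**30

# A 9B model at fp16 needs ~16.8 GiB for weights, which is why an 8 GB
# card implies a quantized build; at 4-bit the weights drop to ~4.2 GiB.
for params in (4, 9):
    for bits in (16, 8, 4):
        print(f"{params}B @ {bits}-bit: {weight_vram_gb(params, bits):.1f} GiB")
```

The same arithmetic explains the 27b-on-24 GB observation above: 27B weights at 4-bit come to roughly 12.6 GiB, leaving headroom for the KV cache.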