https://firethering.com/qwen3-5-4b-local-ai-model/
Alibaba just dropped #Qwen35 and the 4B version is the one worth paying attention to. It thinks before it answers, reads images and video, handles 201 languages, and sits on a context window of 262,144 tokens, longer than most models ten times its size. #opensource
Qwen3.5-4B: The Small AI Model That Thinks, Sees, and Runs on Your Machine
Most small AI models are a compromise: you give up reasoning for size, or vision for speed. Qwen3.5-4B doesn't seem to have gotten that memo. Alibaba just dropped Qwen3.5, and the 4B version is the one worth paying attention to. It thinks before it answers, reads images and video, handles 201 languages, and carries a 262,144-token context window, longer than what most models ten times its size offer. All of that in something small enough to run on your own machine. I went through the benchmarks and architecture docs so you don't have to. Here's what actually matters.
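Since the pitch is that a 4B model fits on your own machine, a quick back-of-the-envelope check is worth doing: weights-only memory is roughly parameter count times bytes per weight. The sketch below assumes a round 4 billion parameters (the exact count varies by release) and ignores KV cache and activations, which add more on top, especially at long context lengths.

```python
# Weights-only memory estimate for a ~4B-parameter model.
# Assumption: "4B" means roughly 4e9 weights; real counts differ slightly.
PARAMS = 4_000_000_000

BYTES_PER_PARAM = {
    "fp16/bf16": 2.0,  # half-precision, the usual release format
    "int8": 1.0,       # 8-bit quantization
    "int4": 0.5,       # 4-bit quantization, common for local runs
}

def weight_footprint_gb(params: int, bytes_per_param: float) -> float:
    """Memory for the weights alone, in GiB (no KV cache, no activations)."""
    return params * bytes_per_param / 1024**3

for fmt, bpp in BYTES_PER_PARAM.items():
    print(f"{fmt:10s} ~{weight_footprint_gb(PARAMS, bpp):.1f} GB")
```

In half precision the weights land around 7.5 GB, and a 4-bit quantized copy under 2 GB, which is why a model this size is realistic on an ordinary laptop while a 40B-class model is not.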