Sebastian Raschka (@rasbt)
Mamba-3์ด ์ถ์๋์์ผ๋ฉฐ, ์์ฑ์๋ Mamba ๋ฐ ์ ์ฌ ๋ชจ๋ธ๋ค์ด ํธ๋์คํฌ๋จธ ์ดํ ์ ํ์ด๋ธ๋ฆฌ๋ ์ํคํ ์ฒ(Qwen3.5, Kimi Linear ๋ฑ)์์ ํฅ๋ฏธ๋ก์ด ํ์ฉ์ฒ๋ผ๊ณ ํ๊ฐํฉ๋๋ค. ๋ค์ ์ธ๋ ํ์ด๋ธ๋ฆฌ๋์์ Gated DeltaNet ๋์ RoPE๊ฐ ์ถ๊ฐ๋ Mamba-3์ ๊ต์ฒดํด๋ณด๋ ์คํ์ ์ ์ํ๊ณ ์์ต๋๋ค.

Oh wow, Mamba-3 is here! For me, the most interesting use case of Mamba and Mamba-likes are the recent transformer attention hybrid architectures (Qwen3.5, Kimi Linear, etc.) Would be interesting to swap Gated DeltaNet with Mamba-3 (which now also has RoPE) in next gen hybrids.