Sebastian Raschka (@rasbt)

Mamba-3 has been released, and the author considers the recent transformer-attention hybrid architectures (Qwen3.5, Kimi Linear, etc.) to be the most interesting use case for Mamba and similar models. He suggests trying Mamba-3, which now includes RoPE, in place of Gated DeltaNet in next-generation hybrids.

https://x.com/rasbt/status/2034088726997893168

#mamba3 #transformer #qwen3.5 #gateddeltanet #rope

Sebastian Raschka (@rasbt) on X

Oh wow, Mamba-3 is here! For me, the most interesting use case of Mamba and Mamba-likes are the recent transformer attention hybrid architectures (Qwen3.5, Kimi Linear, etc.) Would be interesting to swap Gated DeltaNet with Mamba-3 (which now also has RoPE) in next gen hybrids.
