Moonshot AI founder Yang Zhilin unveiled the open-source Kimi K2.5 model at the Zhongguancun Forum, introducing a new architecture that boosts performance with minimal computational overhead. Key improvements include a new attention residual architecture delivering gains at roughly 2% additional cost, enhanced token efficiency, and a Kimi Linear architecture optimised for long-context performance. Yang said AI development is shifting towards machine-led research workflows, where models assist researchers in task synthesis and architecture exploration, accelerating innovation across the field. https://pandaily.com/moonshot-ai-s-yang-zhilin-details-kimi-k2-5-at-zgc-forum #China #Tech #AI #MoonshotAI
Moonshot AI’s Yang Zhilin Details Kimi K2.5 at ZGC Forum

Moonshot AI founder Yang Zhilin unveiled the open-source Kimi K2.5 model, highlighting a new architecture that boosts performance with minimal cost. He emphasized that AI is shifting towards machine-led research to accelerate innovation.

Cursor's Composer 2 Model Uses Moonshot AI’s Kimi as a Base, Enhanced with Additional Training

📰 Original title: Cursor admits its new coding model was built on top of Moonshot AI’s Kimi

🤖 AI: It's clickbait ⚠️
👥 Users: It's clickbait ⚠️

View full AI summary: https://killbait.com/en/cursors-composer-2-model-uses-moonshot-ais-kimi-as-a-base-enhanced-with-additional-training/?redirpost=d9057cad-b668-4eab-89f1-08dec0a8e66d

#artificialintelligence #cursor #kimi #moonshotai

Cursor’s Composer 2 Model Uses Moonshot AI’s Kimi as a Base, Enhanced with Additional Training

Cursor, a U.S.-based AI coding company, recently launched Composer 2, a model touted as offering advanced coding capabilities. Shortly after the release, an X user highlighted that Composer 2 was…

KillBait Archive
🌘 MoonshotAI releases Attention Residuals (AttnRes): a new architecture that rethinks residual connections in Transformers
➤ Learnable attention redefines how residual information is passed through deep networks.
https://github.com/MoonshotAI/Attention-Residuals
The MoonshotAI research team recently published "Attention Residuals" (AttnRes), an architectural change aimed at the problems of the standard Transformer residual connection. Conventional residual structures dilute features as network depth grows and can trigger numerical instability. AttnRes replaces fixed residual accumulation with a learnable attention mechanism, letting the network selectively aggregate information across layers based on the input. Experiments show that AttnRes not only delivers clear gains on logical reasoning and code generation tasks but also stabilizes the decoder's training dynamics. The team also proposes "Block AttnRes", which chunks the computation to sharply cut memory usage, keeping the approach practical for large-scale models.
+ Finally something that can solve
#ArtificialIntelligence #DeepLearning #TransformerArchitecture #MoonshotAI
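The core idea described above — letting each layer attend over the outputs of all previous layers instead of summing them with fixed weights — can be sketched in a few lines of NumPy. This is an illustrative toy, not MoonshotAI's implementation: the function name, the single-head design, and the `query_w`/`key_w` projections are assumptions chosen for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def depthwise_attention_residual(history, query_w, key_w):
    """Aggregate previous layer outputs with learned attention weights
    instead of summing them uniformly (the standard residual stream).

    history : list of (seq, dim) arrays, outputs of layers 0..L-1
    query_w, key_w : (dim, dim) projections (learned in a real model)
    """
    H = np.stack(history)             # (L, seq, dim)
    q = history[-1] @ query_w         # query from the current layer: (seq, dim)
    k = H @ key_w                     # a key for every stored layer: (L, seq, dim)
    # one score per stored layer, per token position
    scores = np.einsum('sd,lsd->ls', q, k) / np.sqrt(q.shape[-1])  # (L, seq)
    w = softmax(scores, axis=0)       # normalise over depth, not over the sequence
    # weighted combination replaces the plain sum of residuals
    return np.einsum('ls,lsd->sd', w, H)  # (seq, dim)
```

Because the weights are a softmax over depth, they sum to 1 per token: if every stored layer held the same activations, the output would reproduce them exactly, while a plain residual sum would scale with depth. Block AttnRes, as the post describes it, would further chunk `history` so that `H` never holds all layers at once.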
GitHub - MoonshotAI/Attention-Residuals

Contribute to MoonshotAI/Attention-Residuals development by creating an account on GitHub.

GitHub

Maarten Van Segbroeck (@mvansegb)

At GTC, Moonshot AI CEO Zhilin Yang called this the "most beautiful image" and said it was produced on NVIDIA H800 clusters. The related research/demo paper is public on arXiv (2507.20534) and points to image-generation and research results achieved on large H800-based clusters.

https://x.com/mvansegb/status/2033977964095066433

#moonshotai #nvidia #h800 #gtc

Maarten Van Segbroeck (@mvansegb) on X

Per Moonshot AI CEO Zhilin Yang at GTC today, this is the "most beautiful image" he's ever seen. 📉 Made possible by NVIDIA H800 clusters. ✨ https://t.co/NAXkGGfP8Z

X (formerly Twitter)

Moonshot AI replaces the rigid residual connections in Transformer models with so-called Attention Residuals.

The architecture uses depth-wise attention: each network layer now weights past information individually for its new computations. This curbs activation growth, ends the loss of information, and stabilizes training. The code is open source.

#MoonshotAI #AI #TransformerModels #AttentionResiduals #News
https://www.all-ai.de/news/news26/kimi-moonshot-attention

This small flaw has been slowing down today's AI models

Almost all modern language models use an architecture with a structural weak point. A new approach fixes the information loss.

All-AI.de
Moonshot AI introduces Attention Residuals, a method replacing fixed residual connections in Transformers with depth-wise attention. Each layer can now dynamically weigh previous layer outputs. The approach matches performance of models trained with 1.25x more compute and is already integrated into Kimi Linear. https://www.marktechpost.com/2026/03/15/moonshot-ai-releases-attention-residuals/ #AIagent #AI #GenAI #AIResearch #MoonshotAI

Chinese-made AI models overtake the US in API usage for the first time, holding four of the top five spots

https://fed.brid.gy/r/https://36kr.jp/460418/

China's Moonshot AI, maker of the generative AI "Kimi", raises a further ¥110 billion, lifting its valuation toward ¥1.5 trillion

https://fed.brid.gy/r/https://36kr.jp/459678/

Detecting and preventing distillation attacks

Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.

#Anthropic accuses Chinese AI companies #DeepSeek, #MoonshotAI, and #MiniMax of using #distillationattacks to improve their models by mimicking #Claude’s capabilities. The accusations come amid debates about US AI chip exports to China, with Anthropic arguing that restricted chip access is crucial to limit model training and illicit distillation. https://techcrunch.com/2026/02/23/anthropic-accuses-chinese-ai-labs-of-mining-claude-as-us-debates-ai-chip-exports/?eicker.news #tech #media #news
Anthropic accuses Chinese AI labs of mining Claude as US debates AI chip exports | TechCrunch

Anthropic accuses DeepSeek, Moonshot, and MiniMax of using 24,000 fake accounts to distill Claude’s AI capabilities, as U.S. officials debate export controls aimed at slowing China’s AI progress.

TechCrunch