Claude Sonnet 4.5, the New Champion Among AI Coding Models

Anthropic's Claude Sonnet 4.5 scored 70.6% on SWE-bench, overtaking GPT-5 to take first place. It can code autonomously for more than 30 hours and is delivering tangible results across a range of industries.

https://aisparkup.com/posts/5213

🚀 #Anthropic announces major updates to their #AI model lineup:

💻 Upgraded #Claude35Sonnet shows significant improvements:
• Achieves 49% on #SWEbench Verified coding benchmark
• Leads in software engineering capabilities
• Maintains same price and speed as predecessor
• Tested by US and UK #AI Safety Institutes

🔄 New #Claude35Haiku introduction:
• Matches #Claude3Opus performance at lower cost
• Scores 40.6% on #SWEbench Verified
• Optimized for user-facing products
• Available across multiple cloud platforms

🖱️ Pioneering #ComputerUse beta feature:
• Allows AI to navigate interfaces like humans
• Scores 22% on #OSWorld benchmark
• Currently in experimental phase
• Supported by new safety classifiers

⚡ Enterprise adoption:
• #GitLab reports 10% improvement in DevSecOps tasks
• #Replit leverages computer use for app evaluation
• #Cognition notes enhanced problem-solving capabilities

https://www.anthropic.com/news/3-5-models-and-computer-use

Introducing computer use, a new Claude 3.5 Sonnet, and Claude 3.5 Haiku

A refreshed, more powerful Claude 3.5 Sonnet, Claude 3.5 Haiku, and a new experimental AI capability: computer use.