Mastodawn

Bindu Reddy (@bindureddy)

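Shares news that 'INSANELY-BIG', a very large model from a frontier AI lab, may not be released, and claims that it posted very high scores on most benchmarks. The post specifically mentions its SWE-Bench Pro performance and high inference cost, illustrating the performance and cost issues facing next-generation large models.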

https://x.com/bindureddy/status/2042001592027877708

#frontiermodel #benchmark #swebench #llm #ai

Bindu Reddy (@bindureddy) on X

INSANELY-BIG is a very large model from a frontier AI lab It scores 99 on literally EVERY SINGLE EVAL except SWE Bench Pro On SWE-BENCH pro it score 99.99 They say they won’t release it - cause it costs $100 per output 1M tokens and yeah - it’s too scary to drop to the
