Mastodawn

Coding plan comparisons based on actual usage

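As of May 1, 2026, this post compares the price and performance of several AI coding plans using real usage data. It details token cost relative to subscription fee, along with processing speed, for models including Kimi 2.6, MiniMax 2.7, GLM 5.1, Codex (GPT-5.5), and Claude Pro (Opus 4.7), and stresses that coding plans are the most economical approach. It also compares each model's response time and TPS (per-second throughput), noting that Claude Pro feels fast and intuitive in terms of user experience. It warns, however, that some Chinese models may carry a risk of account suspension when used via API.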

https://sites.diy/blog/2026-05-01-coding-plan-comparisons/

#llm #codingplan #subscription #modelbenchmark #apiusage

Coding plan comparisons based on actual usage — sites.diy

Measuring AI coding plans vs API pricing. Codex is subsidized ~27×, most others ~8×, and Claude Pro still costs ~10× more per token than the rest.

sites.diy
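One plausible reading of a "subsidized ~27×" figure is the API-priced value of the tokens consumed through a plan divided by the plan's subscription fee. The sketch below assumes that definition; the function name and the dollar amounts are hypothetical and are not taken from the linked post.

```python
def subsidy_multiple(api_equivalent_cost_usd: float, subscription_price_usd: float) -> float:
    """Ratio of API-priced usage to the subscription fee (assumed definition)."""
    if subscription_price_usd <= 0:
        raise ValueError("subscription price must be positive")
    return api_equivalent_cost_usd / subscription_price_usd


# Hypothetical numbers for illustration only: a $20/month plan through which
# tokens worth $540 at API list prices were consumed.
print(subsidy_multiple(540.0, 20.0))  # 27.0
```

Under this assumed definition, a higher multiple means the plan delivers more API-priced usage per dollar of subscription.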

sayzard Mar 15

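CanIRun.ai is a web tool that uses the browser's WebGPU to estimate which AI models can run on your PC or laptop. It reports per-model memory requirements, token speed, and context length along with an S-to-F grade, so you can quickly judge whether major models such as Qwen, Llama, Gemma, Mistral, and GPT‑OSS can run locally. The results are estimates, and users have asked for better accuracy around MoE, quantization, and mobile detection.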

https://news.hada.io/topic?id=27483

#canirun.ai #localllm #webgpu #modelbenchmark #qwen

CanIRun.ai: Can my computer run AI models?

A web-based tool for checking which AI models your local machine can actually run. The browser's We...

GeekNews

sayzard Feb 24

Ravi Sharma (@ravishar313)

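A tweet sharing an experiment that tested models from @AnthropicAI, @OpenAI, @Google, @Zai_org, @MiniMax_AI, and @Kimi_Moonshot on whether they can produce a publication-level visualization of a protein-ligand binding site. Anthropic's models performed best and Google's Gemini models performed worst. It is an interesting comparison that points to possible AI-driven research applications in life sciences and drug design.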

https://x.com/ravishar313/status/2025945722324160549

#ai #modelbenchmark #protein #anthropic #gemini

Ravi Sharma (@ravishar313) on X

I tested models from @AnthropicAI @OpenAI @Google @Zai_org @MiniMax_AI and @Kimi_Moonshot on whether they can create a publication-level view of a protein-ligand binding site and the results were surprising. TL;DR: Anthropic models did the best and Gemini models did the worst.

X (formerly Twitter)

sayzard Feb 24

Ravi Sharma (@ravishar313)

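A tweet sharing an experiment that compared models from @AnthropicAI, @OpenAI, @Google, @Zai_org, @MiniMax_AI, and @Kimi_Moonshot on their ability to generate a visualization of a protein-ligand binding site. Anthropic's models performed best and Google's Gemini models performed worst.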

https://x.com/ravishar313/status/2025945722324160549

#ai #modelbenchmark #anthropic #openai #gemini

Ravi Sharma (@ravishar313) on X

X (formerly Twitter)

sayzard Feb 23

Ravi Sharma (@ravishar313)

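Models from Anthropic, OpenAI, Google, Zai, MiniMax, Kimi_Moonshot, and others were evaluated on their ability to generate a publication-level visualization of a protein-ligand binding site. Anthropic's models performed best and Google's Gemini models performed worst.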

https://x.com/ravishar313/status/2025945722324160549

#modelbenchmark #anthropic #openai #gemini #airesearch

Ravi Sharma (@ravishar313) on X

X (formerly Twitter)

sayzard Feb 16

AshutoshShrivastava (@ai_for_success)

Kilo Code has retaken the #1 spot on the OpenRouter daily leaderboard, overtaking OpenClaw. According to the tweet, 313B tokens were processed yesterday, 222B of them routed through GLM-5, which is drawing attention to GLM-5's performance and market share.

https://x.com/ai_for_success/status/2023358351892246569

#openrouter #glm5 #modelbenchmark #routing

AshutoshShrivastava (@ai_for_success) on X

Kilo Code has taken back the #1 spot on the OpenRouter daily leaderboard, overtaking OpenClaw. 313B tokens processed yesterday, with 222B routed through GLM-5 alone. wt* How good is GLM-5, really?

X (formerly Twitter)
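For scale, the two token counts quoted in the tweet imply that GLM-5 carried roughly 71% of that day's traffic. A minimal sketch of the arithmetic, using only the 313B and 222B figures from the tweet (the function and variable names are illustrative):

```python
def share(part: float, total: float) -> float:
    """Fraction of the total accounted for by one part."""
    if total <= 0:
        raise ValueError("total must be positive")
    return part / total


total_tokens_b = 313  # billions of tokens processed, per the tweet
glm5_tokens_b = 222   # billions of tokens routed through GLM-5, per the tweet

glm5_share = share(glm5_tokens_b, total_tokens_b)
print(f"GLM-5 share: {glm5_share:.1%}")  # GLM-5 share: 70.9%
```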

sayzard Feb 13

LLM Stats (@LlmStats)

Step-3.5-Flash (StepFun's model) took first place on LiveCodeBench V6 with a score of 0.864, ahead of Kimi K2.5 (0.85), GLM-4.7 (0.849), and GPT OSS 120B (0.819). LiveCodeBench V6 is a benchmark that evaluates models on recent problems from real competitive programming platforms.

https://x.com/LlmStats/status/2022377816302510189

#livecodebench #codeeval #llm #modelbenchmark

LLM Stats (@LlmStats) on X

Step-3.5-Flash (@StepFun_ai) tops LiveCodeBench V6 with 0.864 #1 out of all models, ahead of Kimi K2.5 (0.85), GLM-4.7 (0.849), and GPT OSS 120B (0.819). LiveCodeBench V6 tests models on fresh, real-world coding problems from competitive programming platforms. Step-3.5-Flash

X (formerly Twitter)

sayzard Feb 5

Iaiso (@laiso)

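A report (laiso/ts-bench PR) that the combination of Qwen code and qwen3-coder-next produced results on par with GLM 4.7. Among top-tier models it has the lowest self-hosting requirements, and pricing at external providers was observed at roughly half that of k2.5, so it is drawing attention for hosting cost and resource efficiency relative to performance.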

https://x.com/laiso/status/2019344615544090941

#qwen #qwen3 #glm #selfhosting #modelbenchmark

Iaiso (@laiso) on X

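The combination of Qwen code and qwen3-coder-next now produces results on par with GLM 4.7 https://t.co/DVcyoEg3AZ This model has the lowest self-hosting requirements among the current top tier, and even at external providers it was going for about half the price of k2.5.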

X (formerly Twitter)