🤯 I'm definitely getting value from the $60/month combined subs! I asked Codex to generate a script that scans my home folder and produces a usage report. It turns out I have used 496M tokens on Codex, 357M on Gemini, and 166M on Claude. That's over 1 billion tokens since last October, which averages roughly 56M/week or 8M/day. About 65% of that is cached input tokens, though. This covers only agentic workflows; it doesn't include usage on the web or mobile apps. I haven't tried OpenClaw yet. #AI #LLM
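For the curious, the arithmetic above checks out. A quick sketch (this is not the actual report script; the per-model counts are from the post, and the week count is back-solved from the 56M/week figure):

```python
# Token totals from the post, in millions of tokens
tokens_m = {"Codex": 496, "Gemini": 357, "Claude": 166}

total_m = sum(tokens_m.values())  # 1019M, i.e. just over 1 billion
weeks = total_m / 56              # back-solving: ~18 weeks since October
per_day_m = 56 / 7                # 56M/week works out to 8M/day

print(f"total: {total_m}M tokens (~{total_m / 1000:.2f}B)")
print(f"~{weeks:.0f} weeks at 56M/week, {per_day_m:.0f}M/day")
```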
@chikim I'm curious to see if that's better than just paying for Claude Max. I suppose it all depends on how much monthly use you get out of them combined.
@ppatel Yeah maybe. Claude Max costs $100, and from what I hear, they're still stingy even with that tier. I frequently hit the usage limits on the Claude Pro plan, which is why my usage appears lower. I tend to use it for more complex tasks when other models struggle, although there are cases where other models solve problems Claude cannot. I like having access to the top three frontier models and the different features each platform offers as well.
@chikim I'm using Claude right now. I need to experiment with others more. I'm not fond of Codex's desktop environment. I'll have to think about this more. What's your verdict on the best model for a local chat experience with ollama?
@ppatel Re Ollama, it depends on your use case, but under 40B parameters, qwen3.5:35b, gemma3:27b, mistral-small3.2:24b, and gpt-oss:20b are among the best options. Open-weight models, especially Qwen3.5, have improved significantly, but they still do not come close to the top frontier models IMO.
@chikim I'm aiming more at passing things through for private transcription, or a notes app that I just saw. Also general chat on topics that frontier models disallow. I usually let Claude play with changes unless I have a specific case where I'm going to ask for feedback or I want it to ask me for feedback on something I'm doing.
@ppatel Yeah, for general use like that, those are still the top choices. For speed, gpt-oss-20b is fastest. If you want the best quality, probably qwen3.5-27B. It's slow because it's a dense model, and it's not optimized on Ollama.
@chikim Besides, I can download more than one model.