🤯 I'm definitely getting value from the $60/month combined subs! I asked Codex to generate a script that scans my home folder and produces a usage report. It turns out I have used 496M tokens on Codex, 357M on Gemini, and 166M on Claude. That's over 1 billion tokens since last October, which averages roughly 56M/week or 8M/day. About 65% of that is cached input tokens, though. This only covers agentic workflows; it doesn't include usage on the web or mobile apps. I haven't tried OpenClaw yet. #AI #LLM
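A minimal sketch of the kind of script Codex might generate, assuming each CLI keeps JSONL session logs somewhere under the home folder with a `usage` object per record; the actual log paths, file formats, and field names vary by tool and version, so treat every name here as an assumption:

```python
import json
from pathlib import Path

# Hypothetical field names; real CLIs may use different keys (assumption).
TOKEN_KEYS = ("input_tokens", "output_tokens",
              "cache_read_input_tokens", "cache_creation_input_tokens")

def sum_tokens(root: Path) -> dict:
    """Walk `root` for *.jsonl session logs and total any token-count fields."""
    totals = {key: 0 for key in TOKEN_KEYS}
    for path in root.rglob("*.jsonl"):
        for line in path.read_text(errors="ignore").splitlines():
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip partial or non-JSON lines
            usage = record.get("usage", {})
            for key in TOKEN_KEYS:
                if isinstance(usage.get(key), int):
                    totals[key] += usage[key]
    return totals

if __name__ == "__main__":
    # Point this at wherever your agent CLIs store session logs.
    totals = sum_tokens(Path.home())
    for key, value in totals.items():
        print(f"{key}: {value:,}")
```

Running one pass per tool's log directory (instead of the whole home folder) would give the per-tool breakdown quoted above.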
@chikim I'm curious to see if that's better than just paying for Claude Max. I suppose it all depends on how much monthly use you get out of them combined.
@ppatel Yeah maybe. Claude Max costs $100, and from what I hear, they're still stingy even with that tier. I frequently hit the usage limits on the Claude Pro plan, which is why my usage appears lower. I tend to use it for more complex tasks when other models struggle, although there are cases where other models solve problems Claude cannot. I like having access to the top three frontier models and the different features each platform offers as well.
@chikim I'm using Claude right now. I need to experiment with the others more. I'm not fond of Codex's desktop environment. I'll have to think about this more. What's your verdict on the best model for a local chat experience with Ollama?
@ppatel Yeah, I live on the edge and just use the CLI for all three in yolo mode (AKA dangerously bypass all), so they never ask me to approve anything. Using three different desktop GUIs is annoying, especially since accessibility is not great. The CLIs have their own TUI quirks, though. lol
@ppatel Re Ollama, it depends on your use case, but under 40B parameters, qwen3.5:35b, gemma3:27b, mistral-small3.2:24b, and gpt-oss:20b are among the best options. Open-weight models, especially Qwen3.5, have improved significantly, but they still do not come close to the top frontier models IMO.
@chikim I'm mostly aiming to pass things through for private transcription, or a notes app I just saw. Also general chat on topics that frontier models disallow. I usually let Claude play with changes unless I have a specific case where I'm going to ask for feedback or I want it to ask me for feedback on something I'm doing.
@ppatel Yeah, for general use like that, those are still the top choices. For speed, gpt-oss-20b is the fastest. If you want the best quality, probably qwen3.5-27B. It's slow because it's a dense model and not well optimized in Ollama.
@chikim At this point, speed isn't as important as quality. I don't particularly care how long something takes when I give it a task.
@ppatel Yeah exactly. If you haven't tried any of those, try them first. Qwen3.5 models are the newest, and they are also multimodal.
@chikim Besides, I can download more than one model.