I just used GPT-5.3-Codex-Spark for a simple implementation and it ran out of usage in under 5 minutes.

*This* is what will ultimately cause the current business model to fail outside of big tech and enterprise.

You can't claim the correct usage of an LLM is to pop it in a verification loop, while also charging a per-token access/usage fee that means it's unusable for that purpose.

Perhaps in a few years we'll see local LLMs that work well on consumer hardware with the same capabilities?

Joe Groff (1M Context) Mar 9

@tonyarnold on the flip side, with more verification guardrails you might be able to get competitive results out of a smaller local model. if you're lucky enough to have a MacBook Pro or Mac mini with 48 GB of RAM, that can fit qwen3.5-35b-a3b. its initial results have not impressed me, but if you make it try again until some tests pass you can eventually get something correctish out of it
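For anyone wondering what "make it try again until some tests pass" might look like in practice, here is a minimal sketch. Everything in it is an assumption for illustration: `generate` stands in for whatever call reaches your local model, `verify` stands in for your test run, and `make_stub_model` and `check_add` are hypothetical helpers so the loop can be exercised without a model at all.

```python
from typing import Callable, Iterable, Optional


def retry_until_verified(
    generate: Callable[[str], str],
    verify: Callable[[str], Optional[str]],
    task: str,
    max_attempts: int = 5,
) -> Optional[str]:
    """Return the first candidate that passes verify, or None if attempts run out."""
    prompt = task
    for _ in range(max_attempts):
        candidate = generate(prompt)
        failure = verify(candidate)  # None means every check passed
        if failure is None:
            return candidate
        # Feed the failure back so the next attempt is not a blind retry.
        prompt = (
            f"{task}\n\nPrevious attempt:\n{candidate}\n\n"
            f"It failed with:\n{failure}"
        )
    return None


def make_stub_model(answers: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a model: replays canned answers in order."""
    remaining = iter(answers)
    return lambda prompt: next(remaining)


def check_add(source: str) -> Optional[str]:
    """Hypothetical verifier: runs the candidate and tests its add() function."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # only safe for trusted or sandboxed code
        assert namespace["add"](2, 3) == 5
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


if __name__ == "__main__":
    model = make_stub_model([
        "def add(a, b):\n    return a - b",  # first attempt is wrong
        "def add(a, b):\n    return a + b",  # second attempt passes
    ])
    print(retry_until_verified(model, check_add, "Write add(a, b)."))
```

The `max_attempts` cap is the part that matters for cost: with a local model each retry costs time, and with a metered API each retry costs tokens, so the loop needs a budget either way.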

Tony Arnold Mar 9

@joe I'd love to see local models become the "standard" for using this technology, but I have had similar results on an M3 Max MBP with 64 GB of RAM: it's too slow to really be useful.

I hear a lot of folks are having good results with that model, though - I just can't replicate them at any kind of speed.

Colin Cornaby

@tonyarnold @joe I was using Qwen 3.5 35b last week to do some OpenGL and Metal. Asking it to do larger tasks that involved shader architecture led to some... interesting results. But it at least put out some shader code that made for a halfway decent starting point once it was cleaned up.

I sort of wish we could go back to "these things are weird little assistants" instead of "these things are complete software engineers."

Tony Arnold Mar 9

@colincornaby @joe saying they're something when they are not does not make them that thing.

I know the US has fallen into a bit of a post-truth era, but these things *are* assistants for any kind of professional use.

Colin Cornaby Mar 9

@tonyarnold @joe I think the hard part is that software development is so diverse.

If you work on login pages or todo apps, these things do look a bit more like software developers in a box.

I know web front end got way more complicated over the last few decades. But I still think it was a mistake to pull everything under the software engineering umbrella.