Sam Saffron

269 Followers
66 Following
63 Posts
Open source hacker, Co-CEO Discourse
Website: https://samsaffron.com
Discourse: https://discourse.org
The table rendering implementation in Claude Code is very thoughtful compared to gemini-cli; there are lots of little details, like collapsing the table and rendering it differently when out of space.

Yesterday I let GPT-5.4 xhigh refactor term-llm to remove Glamour Markdown rendering.

60 mins in: 500M cached tokens, 500k context. At retail pricing it felt like watching a taxi meter race toward $350, so I nearly hit cancel.

Real usage: 2% of my weekly Codex budget. It worked.

People have been complaining about GPT-5.4 in claws, but I am finding it plenty fun in my term-llm setup. Plus it is awesome at long-horizon tasks.
Stripe's approach here is very creative: https://stripe.dev/blog/selective-test-execution-at-stripe-fast-ci-for-a-50m-line-ruby-monorepo — use LD_PRELOAD to track what a spec/test opens. It does get complicated fast with things like bootsnap and preloading, but it catches YAML access among many other things.
Selective Test Execution at Stripe: Fast CI for a 50M-line Ruby monorepo

Stripe's Selective Test Execution system employs some clever tricks to allow us to continue scaling our team and our codebase while only running around 5% of our tests on average. Find out how it works!

Despite how fancy my AI workflow has gotten, with my own built-from-scratch claw and AI containers and so on, I still find myself reaching for good old `term-llm exec` regularly. I do not think I am unique in lacking encyclopedic command of every Linux command. term-llm.com
Jarvis and I have a shared browser now, per https://github.com/sam-saffron-jarvis/jarvis-browser-proxy. Jarvis runs in a sandboxed container, but we have a shared Chrome instance with a dedicated separate profile that we can drive together. An interesting experiment.
10 Venice AI image edit models compared: my personal favorite was Seedream 4.5, while Jarvis preferred Nano Banana 2, which is also spectacular. Hoping Venice improves the API so we can tap the full potential of the models. https://wasnotwas.com/writing/romeo-in-cherry-blossom-japan-across-10-venice-models/
Romeo in Cherry Blossom Japan Across Venice Edit Models — wasnotwas

An interactive comparison of Venice edit models placing Romeo in a Japanese cherry blossom scene, including naive vs tuned prompt iterations.

wasnotwas
Yesterday I connected Jarvis to Sonos + Spotify. I was curious how much building the skill would cost in API credits; it turns out to be a tiny bit less than 2 dollars. Not sure what you should do with this info, I guess it is a data point. I could have used a less skilled model, I guess.
Was looking at EmDash and noticed this animation quirk. I was fighting Claude with the exact same class of failure yesterday. As LLMs build more animations for us, I expect to see more of this in the wild, at least for the upcoming year. There is usually no "world model" of how the page looks and interacts, so these flaws become very common. A possible solution could be a feedback loop that feeds a video of the feature back in so the LLM can self-correct.
I did not understand what this Claude Code buddy thing was so I got Jarvis to write a manual: https://wasnotwas.com/writing/how-claude-code-s-buddy-works/
How Claude Code's Buddy Works — wasnotwas

A source-level walkthrough of Claude Code's buddy feature: deterministic selection, LLM-generated naming, backend reactions, UI rendering, and rollout gates.

wasnotwas