Mastodawn

When they update the diceroller, diceroll-reliant processes go sideways.

https://github.com/anthropics/claude-code/issues/42796

[MODEL] Claude Code is unusable for complex engineering tasks with the Feb updates · Issue #42796 · anthropics/claude-code

GitHub

I don't know how you'd even begin to apply regression testing to probabilistic GenAI systems, or maintain codebases whose lifespans are intended to far outlast the next major GenAI model update. Seems like an underappreciated risk.

Taggart 19h ago

@DaveMWilburn I think the trick is to minimize the nondeterminism, so that the mission-critical parts are traditional automation and the LLM is handling connective tissue. Still not safe, but it's imo a necessary mitigation for pipelines like this.
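A minimal sketch of that split, assuming a made-up invoicing pipeline: the amount is computed by ordinary code, and the model only supplies wording. `Invoice`, `total_cents`, `render_notice`, and the `draft_greeting` callable (standing in for whatever LLM call you use) are all hypothetical names, not anything from the thread or a real library.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Invoice:
    customer: str
    subtotal_cents: int
    tax_rate_percent: int


def total_cents(inv: Invoice) -> int:
    # Mission-critical arithmetic stays in plain code: integer math, no model.
    return inv.subtotal_cents + inv.subtotal_cents * inv.tax_rate_percent // 100


def render_notice(inv: Invoice, draft_greeting: Callable[[str], str]) -> str:
    # draft_greeting is a hypothetical stand-in for an LLM call.
    # It only contributes connective wording, never the number.
    try:
        greeting = draft_greeting(inv.customer)
    except Exception:
        # Model or service unavailable: fall back to a fixed template.
        greeting = f"Hello {inv.customer},"
    amount = total_cents(inv)
    return f"{greeting}\nAmount due: ${amount // 100}.{amount % 100:02d}"
```

If the model call fails or the service disappears, the notice still goes out with the correct amount; only the greeting degrades to the template.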

Taggart 19h ago

@DaveMWilburn That said, whenever I ask "Hey what happens in the event of (model collapse | service outage | service shutting down)?" everyone starts slowly backing away from me

Ian Campbell 🏴

@mttaggart @DaveMWilburn TLDR, the current hotness is deterministic hooks to validate the nondeterministic output.

JAGS put out a control plane that I need to make time to play with: https://github.com/juanandresgs/claude-ctrl

GitHub - juanandresgs/claude-ctrl: The Systems Thinker's Deterministic Claude Code Control Plane

GitHub
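As a rough illustration of a deterministic hook validating nondeterministic output, here is a small sketch. It is not how claude-ctrl or Claude Code hooks work; the `generate` callable (a stand-in for a model call), the `ALLOWED_ACTIONS` set, and the `action`/`ticket_id` fields are assumptions made up for the example.

```python
import json
from typing import Callable

# Hypothetical allow-list; anything outside it is rejected.
ALLOWED_ACTIONS = {"open_ticket", "close_ticket", "escalate"}


def validate(raw: str) -> dict:
    """Deterministic check: the same input always gets the same verdict."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    action = data.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"action not allowed: {action!r}")
    ticket_id = data.get("ticket_id")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(ticket_id, int) or isinstance(ticket_id, bool):
        raise ValueError("ticket_id must be an integer")
    return data


def generate_validated(
    generate: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Call the model, keep only output that passes validate(), else fail loudly."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return validate(generate(prompt))
        except ValueError as err:
            last_error = err  # reject and retry
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error
```

The point of the design is that nothing downstream ever sees unvalidated model output: it either passes the fixed checks or the pipeline stops with an error.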