maybe i'm just not good enough of a programmer to use coding agents, i guess? i definitely don't trust my ability to know whether or not some code will do what i want it to do just by looking at it
@aparrish I don't even look at the code the agents write, or at least not much. It works better for things that you can build good test suites for or where you care more about the output of the program than the way the program works. See also @simon's book on agentic programming.
Agentic Engineering Patterns - Simon Willison's Weblog

Simon Willison’s Weblog
@nelson i don't trust my tests to be correct either, only that they reflect my best understanding. and i'm not sure what it could mean to care more about the output of a program than how the program works...? isn't the output of a program *determined by* how the program works? i feel like whenever i've believed there was a difference between those two things, i ended up being wrong (sometimes subtly, sometimes not)

@aparrish @nelson I don't think it's enough to accept code is just a black box ratcheted by tests.

If you look at the state of Claude code... It's really bad. Like worst case devolve to bogo sort bad... like store your credentials in plain text files because it can't guarantee it won't lose your credentials mid process bad.

edit: Ratcheting by tests doesn't tell you about non-deterministic total failure in rare circumstances and it doesn't tell you about security.

@theeclecticdyslexic @nelson yeah, every instinct i have from 20+ years in software dev says "if the output looks right, and the code passes the tests, but you don't actually understand it, and you push to prod/incorporate it into your workflow anyway, you are bound to spend 10x the time fixing it than you would have spent understanding it in the first place" but maybe others don't have that instinct?
@aparrish @nelson well, if the LLM knows what the tests are, and you don't read the code it writes... You simply can't know it didn't write dedicated code paths for your tests.