I'm fundamentally a tool builder, and LLM coding agents work one million times better if you give them good tools, and I wrote a thing about this

https://john.regehr.org/writing/zero_dof_programming.html


@regehr If we build tools that actually give us zero degrees of freedom, surely there are more efficient and reliable ways to use them than LLMs?

Given that, as you note, zero DOF is only aspirational, I would love to see more work along the lines of the Termite project for synthesizing device drivers. Version 1 took the provided constraints on the behavior of both the device and the OS, did a bunch of computation, and tried to spit out C source without human intervention. Termite 2 took the same inputs, but gave developers an IDE that would auto-complete large chunks when there were no valid alternatives, then prompt the programmer for the few decisions that were left. I think there are lessons I'd like to see more people learn there.
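The Termite 2 interaction model described above (auto-complete when the constraints leave exactly one valid choice, otherwise ask the programmer) can be sketched roughly like this. This is an illustrative toy, not Termite's actual algorithm: `candidates`, `is_valid`, and `ask_user` are hypothetical stand-ins for the real game-solving synthesis machinery.

```python
# Sketch of the Termite-2-style completion loop: when the spec admits
# exactly one valid continuation, fill it in automatically; when several
# remain, fall back to asking the programmer. The constraint solver is
# faked here with a simple validity predicate.

def complete(candidates, is_valid, ask_user):
    """Return the unique valid candidate, or delegate the choice."""
    valid = [c for c in candidates if is_valid(c)]
    if len(valid) == 1:
        return valid[0]     # zero degrees of freedom: auto-complete
    if not valid:
        raise ValueError("specification is unsatisfiable here")
    return ask_user(valid)  # residual freedom: a human decides

# Toy usage: pretend the spec only allows acknowledging the IRQ first.
steps = ["ack_irq()", "reset_device()", "enable_dma()"]
choice = complete(steps, lambda s: s == "ack_irq()",
                  lambda opts: opts[0])
print(choice)  # ack_irq()
```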

@jamey "If we build tools that actually give us zero degrees of freedom, surely there are more efficient and reliable ways to use them than LLMs?"

The distinction is between recognizing a good solution and creating a good solution. The former is much easier!

@regehr Sure, I understand that distinction. Like verifying a solution to an NP-hard problem versus generating one, though I know you can tell me all about how these program verification tasks are themselves often NP-hard. I just think it would be a shame to drop the existing research on program synthesis in favor of something that generates vaguely guided random text in a long feedback loop. I mean, if we really reach zero DOF, I think an existing coverage-guided fuzzer ought to give better results, faster, than an LLM.
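The recognize-versus-create asymmetry is easy to make concrete with sorting: the checker is a few cheap lines, while a generator that only leans on the checker as its oracle is hopeless. A toy sketch (names are illustrative):

```python
import random

def is_sorted_permutation(xs, ys):
    """Recognizing a good solution: a cheap O(n log n) check."""
    return sorted(xs) == sorted(ys) and all(
        a <= b for a, b in zip(ys, ys[1:]))

def bogo_sort(xs, rng=random.Random(0)):
    """Creating a solution the dumb way: propose random candidates and
    let the checker accept or reject. Correct but exponentially slow,
    which is exactly the generate/verify gap being discussed."""
    ys = list(xs)
    while not is_sorted_permutation(xs, ys):
        rng.shuffle(ys)
    return ys

print(bogo_sort([3, 1, 2]))  # [1, 2, 3]
```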

@jamey I think we all hope to avoid relying on these big corporate plagiarism machines that are out of our control, so I want you to be right!

@regehr I was trying to avoid phrasing it that way, but yes, very much that 😂

@jamey @regehr I share your concern here, but I don't think fuzzers and LLM-assisted search can be made equivalent. At the margin, many existing codebases are already heavily fuzzed, with the surfaced issues fixed. What remains are "weird" bugs that violate intended invariants in ways that don't produce a fuzzer-visible crash. Use an LLM to add instrumentation that makes those invariants checkable, let the fuzzer fuzz, and the combination finds new bugs.
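A toy version of that combination, with the fuzzer replaced by a seeded random-input loop: the buggy method silently corrupts state without crashing, so a plain fuzzer gets no signal; the added assertion (the kind of instrumentation an LLM might be asked to write) turns the invariant violation into a visible failure. All names here are illustrative.

```python
import random

class Account:
    def __init__(self, balance=0):
        self.balance = balance

    def withdraw(self, amount):
        # Bug: no check that funds are sufficient. Nothing crashes,
        # so a coverage-guided fuzzer has no failure to report.
        self.balance -= amount
        # Instrumentation: state the intended invariant explicitly, so
        # a violation becomes a fuzzer-visible crash (a failed assert).
        assert self.balance >= 0, "invariant violated: negative balance"

# Stand-in for the fuzzer: random inputs until the assert fires.
rng = random.Random(0)
acct = Account(balance=100)
try:
    for _ in range(1000):
        acct.withdraw(rng.randint(0, 50))
except AssertionError as e:
    print("bug found:", e)
```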

@jamey @regehr I suspect it's possible to construct ethically trained models capable of doing this work, though no such thing exists today.

@mirth @jamey I'm hoping for this