Mastodawn

I'm fundamentally a tool builder, and LLM coding agents work one million times better if you give them good tools, and I wrote a thing about this

https://john.regehr.org/writing/zero_dof_programming.html

Jamey Sharp Mar 26

@regehr If we build tools that actually give us zero degrees of freedom, surely there are more efficient and reliable ways to use them than LLMs?

Given that, as you note, zero DOF is only aspirational, I would love to see more work along the lines of the Termite project for synthesizing device drivers. Version 1 took the provided constraints on the behavior of both the device and the OS, did a bunch of computation, and tried to spit out C source without human intervention. Termite 2 took the same inputs, but gave developers an IDE that would auto-complete large chunks when there were no valid alternatives, then prompt the programmer for the few decisions that were left. I think there are lessons I'd like to see more people learn there.

John Regehr Mar 26

@jamey "If we build tools that actually give us zero degrees of freedom, surely there are more efficient and reliable ways to use them than LLMs?"

the distinction is between recognizing a good solution and creating a good solution. the former is much easier!

Jamey Sharp Mar 26

@regehr sure, I understand that distinction. like verifying an NP-Hard solution versus generating one, though I know you can tell me all about how these program verification tasks are themselves often NP-Hard. I just think it would be a shame to drop the existing research on program synthesis in favor of something that generates vaguely-guided random text in a long feedback loop. I mean if we really reach zero DOF, I think an existing coverage-guided fuzzer ought to give better results faster than an LLM
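
To make the recognize-versus-create gap concrete, here is a minimal sketch using subset sum, a standard NP-complete problem. It is only an illustration chosen for this note, not something from the post or the paper; the function names `verify` and `generate` are hypothetical.

```python
from itertools import combinations

def verify(nums, target, candidate):
    """Recognize a solution: check that candidate is drawn from nums
    (respecting multiplicity) and sums to target. Cheap."""
    pool = list(nums)
    for x in candidate:
        if x not in pool:
            return False
        pool.remove(x)
    return sum(candidate) == target

def generate(nums, target):
    """Create a solution by brute force: tries subsets in order of size,
    so the work grows exponentially with len(nums)."""
    for r in range(len(nums) + 1):
        for combo in combinations(nums, r):
            if sum(combo) == target:
                return list(combo)
    return None

nums = [3, 34, 4, 12, 5, 2]
solution = generate(nums, 9)
print(solution, verify(nums, 9, solution))
```

The checker does a small amount of work per candidate, while the brute-force search may visit every subset. Any smarter generator (a synthesizer, a fuzzer, an LLM) can be plugged in front of the same `verify`.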

John Regehr Mar 26

@jamey I think we all hope to avoid relying on these big corporate plagiarism machines that are out of our control, so I want you to be right!

Jamey Sharp Mar 26

@regehr I was trying to avoid phrasing it that way, but yes, very much that 😂

mirth Mar 26

@jamey @regehr I share your concern in this respect, but I don't think fuzzers and LLM-assisted search can be made equivalent. At the margin, many existing codebases are already heavily fuzzed and the surfaced issues fixed. What remains are "weird" issues that violate intended invariants in ways that don't generate a fuzzer-visible crash. Use an LLM to add instrumentation so the fuzzer has something to detect, and the combination finds new bugs.
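
A rough sketch of that idea, under loud assumptions: the `Account` class, its bug, and the `check_invariant` helper are all hypothetical, the invariant check is written by hand as a stand-in for what an LLM might add, and the fuzzer is plain random testing, not a coverage-guided one.

```python
import random

class Account:
    """Hypothetical target. Intended invariant: balance never goes negative."""
    def __init__(self):
        self.balance = 0

    def deposit(self, n):
        self.balance += n

    def withdraw(self, n):
        # Bug: checks that the balance is non-negative, not that it covers n.
        if self.balance >= 0:
            self.balance -= n

def check_invariant(acct):
    """Instrumentation: turn a silent invariant violation into a failure."""
    assert acct.balance >= 0, f"invariant violated: balance={acct.balance}"

def fuzz(trials, seed, instrumented):
    """Run random operation sequences; return the first failing sequence,
    or None if nothing visibly failed."""
    rng = random.Random(seed)
    for _ in range(trials):
        acct = Account()
        ops = []
        for _ in range(rng.randint(1, 8)):
            op = rng.choice(["deposit", "withdraw"])
            n = rng.randint(0, 10)
            ops.append((op, n))
            getattr(acct, op)(n)
            if instrumented:
                try:
                    check_invariant(acct)
                except AssertionError:
                    return ops
    return None

print("without instrumentation:", fuzz(1000, 0, instrumented=False))
print("with instrumentation:   ", fuzz(1000, 0, instrumented=True))
```

Without the check, the buggy `withdraw` never raises, so the fuzzer has nothing to observe. With the check in place, the same random inputs surface a failing operation sequence.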

John Regehr

@mirth @jamey so here's our paper (the one referenced in the post) using randomized synthesis. this is as good as we can do, so far. we might be able to do better with more work, but I don't know. but regardless, the difference between the LLM and randomized synthesis is night and day, it's not even close, and I strongly doubt we can close this gap.

https://users.cs.utah.edu/~regehr/papers/popl26.pdf