Just read a bit of a thread about using AI for code generation, and it rang true for me: when programming, you aren’t just implementing a spec (ok, sure, some people are); you’re testing the theory that the spec describes and, through that process, identifying falsehoods, corner cases, or omissions. If you leave it all to the LLM, none of that happens - so what gets implemented is, at best, exactly the spec. Then turn testing over to an LLM as well, and there are even fewer chances to test and challenge the spec. Eventually, with specs written by LLMs too, what you actually get is potentially completely unrelated to the problem you want to solve, but there’s no way to know until it’s much too late.
@nonspecialist Yeah, the approaches I see most people use are to repeat the prompt until it works, or to get more and more and more specific. Unless the AI nails it on the first try because it’s a common flow, you end up spending more time than it would have taken to just write the code.
@smarthall the issue as I see it is that the “it works” you get with LLMs is likely less correct or complete than what you’d get without; and for limited use in well-constrained problem domains, that might be fine - until it very much isn’t.

@nonspecialist have you looked at the "compiler" or the "browser" that AI agents supposedly wrote "from scratch"? The compiler can't optimize, assemble, or link; the browser doesn't even compile. They can't refactor the compiler any further without it breaking in weird ways, and the browser cost trillions of tokens and uses libraries for its core functionality. Pretty good demonstrations of exactly what you're saying.

Though I would say "make the damn thing compile" is a pretty well-constrained problem....

@smarthall yeah, I saw that - it’s a daft thing to try to do from scratch with a statistical engine; that’s tens of thousands of hours of actual Thinking and Deciding, if not more. Not surprised by the result at all.