this is an interesting article about LLM-generated code (an SQLite rewrite in Rust) and the difference between "it works" and "it's good". also interesting database stuff :)

https://blog.katanaquant.com/p/your-llm-doesnt-write-correct-code

Your LLM Doesn't Write Correct Code. It Writes Plausible Code.

One of the simplest tests you can run on a database:

Vagabond Research

@sushee This sentence is probably the most important in the article: "My conclusion is that LLMs work best when the user defines their acceptance criteria before the first line of code is generated."

I put together an IRC server in CL using a TDD methodology. For a project this size, I found this is what works best.

https://git.sr.ht/~hajovonta/cl-irc-server

@hajovonta
@sushee

I haven't tried a strict TDD approach yet, but it's an idea I've had ever since I started thinking about the best way to use an LLM as a tool beyond just vibecoding.

Cool to see an actual attempt at it.

@eccles
I now have a handful of projects in the same space built with TDD: cl-jsonpath, fast-csv, and a private one which is even bigger. This is the only method that works. Tests define the API even before the implementation exists, and they provide continuous feedback and a roadmap. The author must be vigilant when designing the tests and during implementation, because the LLM sometimes tends to take shortcuts, but otherwise the tests guide development fairly smoothly.
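To make "tests define the API before the implementation" concrete, here is a rough sketch in Python rather than CL. It is not taken from cl-jsonpath or fast-csv; the function name `split_fields` and its behaviour are assumptions made up for illustration. The idea is that the test functions are written first and act as the acceptance criteria, and the implementation (by hand or by an LLM) only counts as done once they pass.

```python
# Hypothetical example: `split_fields` is an invented name, not part of any
# project mentioned in this thread.

def split_fields(line: str) -> list[str]:
    """Split one CSV line on commas, honouring double-quoted fields."""
    fields: list[str] = []
    current: list[str] = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == '"':
                # A doubled quote inside a quoted field is a literal quote.
                if i + 1 < len(line) and line[i + 1] == '"':
                    current.append('"')
                    i += 1
                else:
                    in_quotes = False
            else:
                current.append(ch)
        elif ch == '"':
            in_quotes = True
        elif ch == ",":
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields


# Acceptance criteria, written before the implementation above.
def test_plain_fields():
    assert split_fields("a,b,c") == ["a", "b", "c"]


def test_quoted_comma_stays_inside_field():
    assert split_fields('a,"b,c",d') == ["a", "b,c", "d"]


def test_doubled_quote_is_literal():
    assert split_fields('"say ""hi"""') == ['say "hi"']


def test_empty_fields_are_preserved():
    assert split_fields("a,,b") == ["a", "", "b"]
```

The vigilance part still applies: a test suite this small would also pass for an implementation that special-cases these exact inputs, so the tests need to be reviewed as carefully as the code.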

@sushee

@hajovonta @sushee Requirements and Acceptance Criteria should always be defined before writing a single line of code, so this conclusion is totally worthless.

@SignorMacchina
I would say requirements and acceptance criteria that are not formalized are totally worthless.

@sushee

@hajovonta @sushee What's been really infuriating about the uptake of agentic coding is the (re)discovery of software engineering principles like proper design documentation, specification, and acceptance testing. We've known the importance of all these things since the 60s and 70s but typically don't spend time on them because coding is (was?) more enjoyable, writing about the code was perceived as less valuable than implementing the code, and having a formal structured process was suffocating, Legacy, and not 'agile' enough.

Nobody would write out this critical information for human use but devs are suddenly overjoyed to write it all down now that they have expensive obsequious incompetent plagiarizing coding robots.

It's like every episode of The Simpsons with Homer being an idiot and doing the right thing for all the wrong reasons.

There's so much anti-humanity bundled up in the commercial LLM space, it's infuriating and depressing.

@arclight
Yeah, actually writing the code is the boring stuff, mostly mechanical and error-prone. I did it for decades. Having a tool do this part is great, because we can focus on what is more enjoyable: coming up with ideas, planning, setting up scope, providing oversight, controlling the process from a higher level, verifying results.

It is not everyone's cup of tea, I get that. But it's not as if it takes away the possibility of writing code by hand.

@sushee

@hajovonta @sushee I agree that's probably the closest thing to a conclusion/main takeaway of the article, although it also became obvious to me about half an hour into my first time using an LLM to work on code, so I don't consider it particularly groundbreaking.