this is an interesting article about LLM-generated code (an SQLite rewrite in Rust) and the difference between "it works" and "it's good". also interesting database stuff :)
https://blog.katanaquant.com/p/your-llm-doesnt-write-correct-code
@sushee Database code is probably one of the areas where you need the most expertise. You may need to understand minuscule differences between multiple operating systems in order to reach adequate performance. Don’t get me started on timezones or string comparisons.
We had several threads on my Discord about how seemingly trivial tasks become herculean efforts in a database context.
This seems to be the LLM-driven "rewrite SQLite in Rust" project the author is critiquing. The closed issue linked below asks for comments on the performance analysis ;-)
https://github.com/Dicklesworthstone/frankensqlite/issues/18#issue-4037436475
Just for reference: the author’s post on Twitter: https://x.com/KatanaLarp/status/2029928471632224486
@sushee This sentence is probably the most important in the article: "My conclusion is that LLMs work best when the user defines their acceptance criteria before the first line of code is generated."
I put together an IRC server for CL with TDD methodology. For a project this size, I found this is what works best.
I haven't tried a strict TDD approach yet, but it's an idea I've had ever since I started thinking about the best way to use an LLM as a tool beyond just vibecoding.
Cool to see an actual attempt at it
@eccles
I now have a handful of projects in the same space made with TDD: cl-jsonpath, fast-csv, and a private one which is even bigger. This is the only method that works. Tests define the API even before the implementation, and provide continuous feedback and a roadmap. The author must be vigilant when designing the tests and during implementation, because the LLM sometimes tends to take shortcuts, but otherwise it mostly guides development smoothly.
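To make "tests define the API before the implementation" concrete, here is a minimal sketch in Python. It is not taken from any of the projects above: `split_csv_line` and its behaviour are hypothetical, chosen only for illustration. The tests come first and act as the acceptance criteria; the implementation below them is just one thing that happens to satisfy them.

```python
# Hypothetical example: the function name and its expected behaviour are
# assumptions made for illustration, not code from any project in this thread.

# Step 1: acceptance tests, written before any implementation exists.
# They pin down the API (one line in, list of fields out) and the edge cases.
def test_plain_fields():
    assert split_csv_line("a,b,c") == ["a", "b", "c"]

def test_empty_fields():
    assert split_csv_line("a,,b") == ["a", "", "b"]
    assert split_csv_line("") == [""]

def test_quoted_comma():
    assert split_csv_line('"x,y",z') == ["x,y", "z"]

def test_escaped_quote():
    assert split_csv_line('"say ""hi""",end') == ['say "hi"', "end"]


# Step 2: an implementation (hand-written or generated) that must pass them.
def split_csv_line(line: str) -> list[str]:
    """Split one CSV line into fields, honouring double-quoted fields."""
    fields, current, in_quotes = [], [], False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < len(line) and line[i + 1] == '"':
                    current.append('"')  # doubled quote inside a quoted field
                    i += 1
                else:
                    in_quotes = False    # closing quote
            else:
                current.append(ch)
        elif ch == '"':
            in_quotes = True             # opening quote
        elif ch == ",":
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields


if __name__ == "__main__":
    for test in (test_plain_fields, test_empty_fields,
                 test_quoted_comma, test_escaped_quote):
        test()
    print("all acceptance tests passed")
```

The vigilance part shows up here too: a generated implementation that special-cases the exact test inputs would pass all four tests, so the tests still need a human reading the code behind them.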
@SignorMacchina
I would say requirements and acceptance criteria that are not formalized are totally worthless.
@hajovonta @sushee What's been really infuriating about the uptake of agentic coding is the (re)discovery of software engineering principles like proper design documentation, specification, and acceptance testing. We've known the importance of all these things since the 60s and 70s but typically don't spend time on them because coding is (was?) more enjoyable, writing about the code was perceived as less valuable than implementing the code, and having a formal structured process was seen as suffocating, legacy, and not 'agile' enough.
Nobody would write out this critical information for human use but devs are suddenly overjoyed to write it all down now that they have expensive obsequious incompetent plagiarizing coding robots.
It's like every episode of The Simpsons with Homer being an idiot and doing the right thing for all the wrong reasons.
There's so much anti-humanity bundled up in the commercial LLM space, it's infuriating and depressing.
@arclight
Yeah, actually writing the code is the boring stuff, mostly mechanical and error-prone. I did it for decades. Having a tool doing this part is great, because we can focus on what is more enjoyable: coming up with ideas, planning, setting up scope, providing oversight, controlling the process from a higher level, verifying results.
It is not everyone's cup of tea, I get that. But it doesn't take away the possibility of writing code by hand.

Dedicate your time and money to open source. One of the nice benefits? Jealous, bad-faith losers can try to benchmark your unfinished code (that you never once claimed is done or ready for review) and then try to claim you’re a charlatan. This guy should be shunned and ignored.
@sushee still far too optimistic
> An experienced database engineer using an LLM to scaffold a B-tree would have caught the is_ipk bug in code review because they know what a query plan should emit
good luck if all you've been doing for a couple years is deskilling yourself.
And even if that were true, the horrible externalities of LLMs are still there
@sabik @sushee yeah, I don't get it either, particularly the SQLite example.
"You can let the slop machine generate a half-broken version that looks like the real thing but it's actually slow and incorrect"
On the one hand, battle-tested, load-bearing software that shipped on a shitload of platforms (it's basically in every smartphone and PC operating system these days) and has been there for decades.
On the other hand...you can cosplay as a database dev?
What are we doing here even?