The hidden beauty of vibe coding

"It passed all the unit tests, the shape of the code looks right," he said. It's 3.7x more lines of code that performs 2,000 times worse than the actual SQLite. Two thousand times worse for a database is a non-viable product. It's a dumpster fire. Throw it away. All that money you spent on it is worthless."

https://www.theregister.com/2026/03/17/ai_businesses_faking_it_reckoning_coming_codestrap/

AI still doesn't work very well, businesses are faking it, and a reckoning is coming

interview: Codestrap founders say we need to dial down the hype and sort through the mess

The Register
@gerrymcgovern
"AI still doesn't work very well, businesses are faking it, and a reckoning is coming"

@gerrymcgovern
Vibe coding is fine as long as you use it for personal projects or for proofs of concept.

As soon as you start using it for "real" applications, you've lost.

@PatrickTingen @gerrymcgovern "vibe coding" is a euphemism for plagiarism. even for personal projects or proofs of concept, plagiarism is unethical and impermissible. not to mention there's substantial cost and harm for every single prompt.

@poprox @PatrickTingen @gerrymcgovern

Given that a model trained on GPL code would have to be GPL'ed itself, that would be fair.
No fun to say, but I think copyright/IP issues have a long way to go until the whole thing is settled, and needless to say, it has to be fair.

@poprox @PatrickTingen @gerrymcgovern The problem is that all coders take from other people, copy code, and vibe coding does this better, in theory.

There is a lot of harm, and there are - IME - better ways of doing this. Or at least, there were, until they were all enshittified.

@PatrickTingen
And a million people vibe coding personal projects also help the earth to burn. There, we all lost because of them. 🤬
@gerrymcgovern The stats for the vibe-coded SQLite rewrite in Rust being literally thousands of times slower than SQLite are simply wild. Such as needing almost 2 seconds to do 100 single-ID lookups. I'm pretty sure I could improve on that operation just by slurping a CSV with unique row IDs into memory and doing a binary search on said row IDs. Would I want to? No, but I'm also not *trying* to build an actual RDB engine either.
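The naive approach @dpnash describes can be sketched in a few lines. This is a hypothetical illustration (assuming a CSV with a unique, sortable integer `id` column; the file name and schema are made up), not code from any of the projects discussed:

```python
import bisect
import csv

def load_ids(path):
    """Slurp the CSV's row IDs into memory, sorted once up front."""
    with open(path, newline="") as f:
        return sorted(int(row["id"]) for row in csv.DictReader(f))

def id_exists(ids, target):
    """Single-ID lookup via binary search: O(log n) comparisons per probe."""
    i = bisect.bisect_left(ids, target)
    return i < len(ids) and ids[i] == target
```

Even at millions of rows, 100 lookups against an in-memory sorted list like this finish in well under a millisecond, which is the point of the comparison with a ~2-second figure.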

@dpnash @gerrymcgovern

Do you have a link you would like to share about this 1000 times slower implementation of a SQLite instance?

An AI Wrote 576,000 Lines to Replace SQLite.

AI coding is the fastest-growing developer tools category in history. It may also be the first where the best controlled evidence suggests…

Medium

@curious_carrot @dpnash @gerrymcgovern yes. (and you have it too.)

oh, and it's 20,000 times slower, actually.

@gerrymcgovern Tired of explaining that more lines of code does not equal more productivity. I don't think there will be a reckoning, they will expect us to get used to cloud outages instead and just accept 'the robot ate my homework' as an excuse.

@gerrymcgovern This is an unusual article. It mixes truth and misconceptions in awkward ways.

For example:

Smiley pointed to a recent attempt to rewrite SQLite in Rust using AI

This isn't what happened. It was a C Compiler that was rewritten. A different tester then rebuilt SQLite using both the AI and the official one. The AI one did worse.

But it did worse for very specific reasons. The AI version was only tested for correctness. It was only given unit tests as a parameter for success. It failed on real world performance tests, because it was never actually given that as a requirement.
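That correctness-only failure mode is easy to reproduce in miniature. A hypothetical sketch (not code from the actual rewrite): a linear-scan lookup passes any unit test on its results, so a test suite that never measures time will happily accept it.

```python
def lookup_scan(rows, target_id):
    """Return the value for target_id, or None.
    Correct for any input -- but O(n) per lookup, where an
    indexed engine would be O(log n) or better."""
    for row_id, value in rows:
        if row_id == target_id:
            return value
    return None

# A correctness-only "unit test": a full scan and a properly
# indexed lookup both pass it, because performance is never checked.
rows = [(i, f"row-{i}") for i in range(100_000)]
assert lookup_scan(rows, 99_999) == "row-99999"
assert lookup_scan(rows, -1) is None
```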

"Lines of code, number of [pull requests], these are liabilities. These are not measures of engineering excellence." ... Measures of engineering excellence, said Smiley, include metrics like deployment frequency, lead time to production, change failure rate, mean time to restore, and incident severity.

So these are famously known as the DORA metrics. And they don't measure engineering excellence, ... /1

@gerrymcgovern ... they measure the capabilities of the engineering platform along with the expertise of the people using that platform.

There are lots of companies with excellent engineers and crummy DORA scores because they don't have the institutional support to improve those metrics. Nor does the score mean the business is successful. You can have great DORA metrics and still lack for paying customers.

"The other challenge here is that the incentives are misaligned,"

But then he proceeds to list a bunch of examples for competing incentives. His examples of "misaligned" are really examples of "I would like to deliver less and get paid more"... /2

@gerrymcgovern ...

If there's an incentives problem here, it's that companies have been paying for a lot of BS rituals and they're discovering that the BS generating machine is undermining part of the ritual. Companies have also been getting away with under-specifying success in order to pad results as "good". But Gen AIs will "fill in" the under-specificity with made up data. Or they will fail to deliver anything into the gap that some human was hoping would be filled.

But none of this is "misaligned". It's intentional ambiguity designed to protect business units. The AI is just exposing the BS for what it is.

OP is kind of talking about that BS problem. But he's taking weird micro angles to view subsets of the problem without calling out the greater problem. He's not wrong, but he's also not really right either. 🤷🏻‍♂️ //

@gatesvp @gerrymcgovern

"This isn't what happened. It was a C Compiler that was rewritten. A different tester then rebuilt SQLite using both the AI and the official one. The AI one did worse."

There seems to be some confusion on your part. I suspect you're thinking of Claude's C Compiler, which made a hash of building SQLite (although it's impressive it managed that at all).

If you follow the links from this article, they're referring to an analysis of FrankenSQLite (https://frankensqlite.com/), which is billed as a "clean room reimplementation", from the same guy who vibe coded an 80k LOC disk cleanup daemon to replace a one-line cron job. 🙄

And it runs. It's just awful.

https://blog.katanaquant.com/p/your-llm-doesnt-write-correct-code

FrankenSQLite — The Monster Database Engine for Rust

A clean-room Rust reimplementation of SQLite with MVCC concurrency, RaptorQ self-healing, and zero unsafe code. 26-crate workspace delivering the monster database engine.

FrankenSQLite

@richh @gerrymcgovern

The article quote is this:

It's 3.7x more lines of code that performs 2,000 times worse

Which is very close to this experiment here. They get over 3x the compiled size, and slowdown numbers as bad as 150k times worse.

I had no idea about FrankenSqlite.

Based on the tenor of the original article though, it sounds like he was talking about the CCC experiment because that was a comparable attempt. This Franken thing actually tries to improve upon SQLite.

OP's article links to a Medium article which doesn't link to either the CCC thing or the Franken thing. The Medium article seems to reference the Franken thing, but doesn't link it directly either. OP's article seems like it could be either one 🤷🏻‍♂️

Thanks for the extra data, I think it's notable that the arguments I make stay the same either way. 😃

Even for FrankenSQLite, it's clear that they had a limited scope of performance tests.

GitHub - harshavmb/compare-claude-compiler: Comparison of GCC vs CCC

Comparison of GCC vs CCC. Contribute to harshavmb/compare-claude-compiler development by creating an account on GitHub.

GitHub

@gatesvp @gerrymcgovern
"OP's article Links to a medium article which doesn't link to either the CCC thing or the Franken thing."

Please follow the links. Like, really read the articles in detail.

OP's article from The Register (https://www.theregister.com/2026/03/17/ai_businesses_faking_it_reckoning_coming_codestrap/)

links to >

Medium (https://medium.com/write-a-catalyst/an-ai-wrote-576-000-lines-to-replace-sqlite-7ea538826d72)

links to >

Katanaquant (https://blog.katanaquant.com/p/your-llm-doesnt-write-correct-code)

links to >

FrankenSQLite.

It's all there.

I don't discount that Smiley might have mixed up their stats in the Reg interview with the 3.7x LOC quote, since that is quite similar to the CCC figures.

Both are relevant to the article's tenor though - AI can output code that will run, but lacks domain knowledge and will do exactly what you ask it to, even if it's ridiculous ("Hey, write me a cleanup tool", instead of "what's the best way to do this cleanup - 1 line cron job").

AI still doesn't work very well, businesses are faking it, and a reckoning is coming

interview: Codestrap founders say we need to dial down the hype and sort through the mess

The Register

@richh @gerrymcgovern

It's all there.

3 links deep.

Honestly, there's a whole separate discussion to be had about how lazy the OP article actually is.

  • They quoted descriptions of DORA metrics without actually linking to those.
  • They generated this whole line of confusion by referencing an article that referenced an article that referenced a specific technology... that they could have just linked to directly
  • They have a whole section on consultants that's just quotes from one person with absolutely zero external links.

It's quite possible that you and I have now spent more time analyzing the article than the author spent actually writing it. 😃

Which probably means we need to start writing our own and better articles. 😃

@gatesvp @gerrymcgovern TBH we’re lucky to get links at all. So many news articles will open with “a report published today has found that…” but will the article itself link to the report? Not a chance - they don’t want you understanding the nuance or reading past their summarisation. So props to them here for literally providing some sources, even if they (or the interviewee) are muddling up different projects. We’re all learning something new from it!

@gerrymcgovern You love to see it:
"Insurers, he said, are already lobbying state-level insurance regulators to win a carve-out in business insurance liability policies so they are not obligated to cover AI-related workflows. "That kills the whole system," Deeks said."

Someone will be left holding the bag when slop kills someone, bankrupts a business, or causes serious damage. Insurers will make sure it isn't them.

@gerrymcgovern message from management: I didn’t read your test report, but our AI assistant told us we should fire all but one software engineer and push to prod, whatever that means. Let us know how it goes, we’ll be at the money burning party our AI assistant scheduled for us.

Your LLM Doesn't Write Correct Code. It Writes Plausible Code.

One of the simplest tests you can run on a database:

Vagabond Research

@gerrymcgovern KLOC as a measurement of productivity has come back into fashion yet again, for the *umpteenth* time. We never learn. Welcome to the 1970s.

@gerrymcgovern there's so much good stuff in there

> "Lines of code, number of [pull requests], these are liabilities. These are not measures of engineering excellence."

@autonomousapps @gerrymcgovern "But there's no playbook to pull from" - complete nonsense. All the existing good software practices still apply.

@matthewskelton @autonomousapps @gerrymcgovern

"But if I apply proper practice, I practically need to reject 100% of the generated code."

Yes. Yes indeed.

@matthewskelton @gerrymcgovern but were we actually following those practices? (in general)
@autonomousapps @gerrymcgovern some of us, yes. Most organisations? No...

@matthewskelton @autonomousapps @gerrymcgovern
That would require taking full responsibility for the code, moving away from today's agentic coding back to using LLMs as an assistant only.

And to make that term less vague, I'm picturing just consulting the chat interface on an as-needed basis, allowing maybe only the most mundane self-contained function commits; that's it.

@orchun @matthewskelton @gerrymcgovern the institutional pressures are seemingly insurmountable in many cases. I myself worked somewhere where they counted token usage and treated that as an important input in perf review. Managers frequently said "you should use ai more." This is not uncommon. Many tens of thousands of jobs have been destroyed with "ai" as the putative cause.

Silicon Valley may have to burn to the ground before we can change directions on this

@autonomousapps we're being threatened with a good time again? 🤣

@orchun @matthewskelton @gerrymcgovern

@bweller @autonomousapps @matthewskelton @gerrymcgovern
Really the core problem hasn't changed, exact same issue as the pre-LLM era: managers who don't know much about software development feel free to dictate how it should be done.

Agile was an attempt to take that control back, and they co-opted it.

Devs weren't the target audience.

@gerrymcgovern here’s the thing: a 2000x performance degradation might not matter when you still have to find your first customer. Finding that first customer 10x faster might.
@gerrymcgovern ooh, I wonder whether it’s insurance (or the lack thereof) that’s going to puncture the bubble…

@gerrymcgovern

In addition to the often abysmal quality, there's also a big misunderstanding about what the role of the software engineer is in a project. It's not to produce code, but to vouch that the software created is fit for purpose. "Tests pass" can only ever be a tiny part of that.

Letting people use generated code you did not comb through and verify, shows that you have no respect for those people.

More: https://fosstodon.org/@ttiurani/116186757320229737

Timo Tiuraniemi (@[email protected])

Thread on #VibeCoding In my 25 years of professional software engineering, I've written big chunks of code in about ten programming languages, and smaller bits with five more. Still, if I was tasked to build software with OCaml – a language I only know from Wikipedia – I'd need to take a long hiatus to learn OCaml before I could let anyone use a single line of code I've written in it. Why? Because software engineering is so much more than generating lines of code. 1/6

Fosstodon
@gerrymcgovern Screw AI. The planet doesn't want it. I don't want it. You don't want it. Only shitty BigTech wants it, and the AI bubble is about to pop. 🫧 🪡