AI-generated code contains more bugs and errors than human output

https://reddthat.com/post/56712364

I’ve been coding for a while. I made an honest, eager attempt at building a real, functioning thing with all of the code written by AI: a Breakout clone using SDL2, with music.

The game should look good, play well, have cool effects, and be balanced. It should have an attractor screen, scoring, a win state and a lose state.

I also required the code to be maintainable, meaning I should be able to look at every single line and understand it well enough to defend its existence.

I did make it work. And honestly Claude did better than expected. The game ran well and was fun.

But: The process was shit.

I spent 2 days and several hundred dollars to babysit the AI, to get something I could have done in 1 day including learning SDL2.

Everything that turned out well, turned out well because I brought years of skill to the table. I could see when Claude was coding itself into a corner and tell it to break the code up into modules, collate globals, remove duplication, pull out abstractions, etc. I had to detect all of that and instruct it on how to fix it. Until I did, it kept adding and re-adding bugs, because it had made so much shittily structured code that it was confusing itself.
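To make the "collate globals" step concrete, here is a minimal sketch of the idea. It is not code from the actual project: it is written in Python rather than against SDL2, and every name in it (`GameState`, `brick_destroyed`, `ball_lost`, `outcome`) is hypothetical. The point is only that score, lives and brick count live in one object and get changed in one place each, instead of being loose globals touched from everywhere.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    """Hypothetical container for values that would otherwise be loose globals."""
    score: int = 0
    lives: int = 3
    bricks_left: int = 0


def brick_destroyed(state: GameState, points: int = 10) -> None:
    """The single place where breaking a brick updates score and brick count."""
    state.score += points
    state.bricks_left -= 1


def ball_lost(state: GameState) -> None:
    """The single place where losing the ball costs a life."""
    state.lives -= 1


def outcome(state: GameState) -> str:
    """Derive 'win', 'lose' or 'playing' from the shared state."""
    if state.bricks_left <= 0:
        return "win"
    if state.lives <= 0:
        return "lose"
    return "playing"
```

With something like this, a duplicated "did we win yet?" check scattered across the codebase collapses into one call to `outcome(state)`, which is the kind of cleanup I kept having to ask for.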

TL;DR: An LLM can write maintainable code if given full, constant attention by a skilled coder, at 40% of the coder’s speed.

It depends on the subject area and your workflow. I am not an AI fanboy by any stretch of the imagination, but I have found the chatbot interface to be a better substitute for the “search for how to do X with library/language Y” loop. Even though it’s wrong a lot, it gives me a better starting place faster than reading through years-old SO posts. Being able to talk to your search interface is great.

The agentic stuff is also really good when the subject is something that has been done a million times over. Most web UI areas are so well trodden that JS devs have already invented a thousand frameworks to do it. I’m not a UI dev, so being able to give the agent a prompt like, “make a configuration UI with a sidebar that uses the graphql API specified here” is quite nice.

AI is trash at anything it hasn’t been trained on in my experience though. Do anything niche or domain-specific, and it feels like flipping a coin with a bash script. It just throws shit at the wall and runs tests until the tests pass (or it sneakily changes the tests because the error stacktrace repeatedly indicates the same test line as the problem).

Yeah, what you say makes sense to me. Having it make a “wrong start” in something new is useful, as it gives you a lot of the typical structure, introduces the terminology, maybe gives you something sorta moving that you can see working before messing with it, etc.