Software development is evolving from writing code to supervising AI agents that write the code. The biggest challenge to resolve is how to maintain or even increase code quality when you’re generating 2x - 10x the code you used to deliver a year ago.

Humans reading all of the AI-generated code doesn’t scale.

As someone who started my career as a test engineer at Microsoft, I think there’s a lot of opportunity to actually increase code quality by leveraging AI to reinvent how testing is done.

@carnage4life I’m not a technical person at all, but your judgment reflects mine. There are hundreds of interesting, challenging, provocative use cases for AI… that we aren’t talking about at all. We are just talking about the pretend use cases and laying people off.

BTW I’m not sure the real use cases are worth it either, but how would we ever know?

@carnage4life LLM can’t write tests. LLM just creates slop. If we are ok with very unstable and insecure software creating ten times of it is sure ok
@carnage4life so I have a noddy question: why do we want more code? We want better products/service but we do not need ot want the crazy bloat I see currently. The metric of measuring productivity by lines of code was a bad idea with humans and is terrible with AI.

@carnage4life Re AI and quality, I think code level api/injection/unit tests are great for llms.

However end to end measuring (which largely replaced testing) requires understanding whether any given nuanced experience (from either self hosting or interpreting signals) is something delightful or unpleasant.

Yes, a model could be slowly trained (by humans) for a specific set of experiences.

But software features change so fast that the moment you had a model trained, it’s invalid again.

@carnage4life

Unfortunately, doesn’t having an LLM review code runs the risk of a prompt-injection attack (somewhere in the reviewed code is something the LLM interprets as a new prompt suggesting the robot ignore security problems)?

This is a problem that will need to be addressed in public repos. It probably broadens the attack surface vulnerable to malicious insiders or supply-chain threats, as well.

@carnage4life absolutely agree. I’ve been making my AI buddy do TDD and they do it well. This creates a nest of tests that embraces and describes my code base.