I tasked an AI agent with the implementation of an algorithm from a research paper. 15 minutes later: clean code, green tests, plausible visualizations. Hours later: I'm still not sure if it's correct.

What happens when AI generates code faster than you can understand the domain?

https://phpunit.expert/articles/faster-than-understanding.html?ref=mastodon

Faster than understanding

An AI coding agent implemented a complex software metric in 15 minutes. I have now spent hours trying to figure out whether the implementation is correct. Is this really a productivity boost?

phpunit.expert

Christian Wolf Mar 18

@sebastian FYI: the image in the article gives me a 403, although I can see the preview here in Mastodon...

Sebastian Bergmann Mar 18

@chaos0815 Thank you for the bug report!

Christian Wolf Mar 18

@sebastian I guess AI is great for software that does obvious, visually verifiable things. Apps and such, tying together libs that were carefully handcrafted.

Djumaka Mar 18

@sebastian Good take, and I guess this is where the pro/anti-AI and vibe-coding divide lies: whether you need the software to be precise, and whether you can manually validate the produced logic to an acceptable level. Vibe-coding sites and small systems is a complete win; on the other side are old, entangled domains and legacy software whose domain knowledge is literally held in people's memory.

Felix Neumann Mar 19

@sebastian Very important problem indeed. What I couldn't read from your article is this: did you try to use AI to verify the implementation?

I can think of several ways in which tools like Claude Code can help you gain confidence in the implementation, or find flaws in it.

(1) Ask it for a written proof (whatever kind of "proof" it will generate, maybe you can work with it).

(2) Ask it for a written walkthrough. @simon built a tool for that: https://simonwillison.net/guides/agentic-engineering-patterns/linear-walkthroughs/

(3) Go interactive: enter a conversation with your agent, guide it through your algorithm, and let it show you step by step how the implementation matches it.

This kind of problem will be very important in the future, so I'm looking forward to any insights here!

Linear walkthroughs - Agentic Engineering Patterns

Simon Willison’s Weblog