Lately, I see a lot of posts saying we need a robust suite of automated tests to ensure AI agents produce quality results.

Keep in mind that you cannot test quality into software. Automated tests are necessary, but they are not sufficient on their own.

I wouldn't ship a feature without first running an exploratory testing session. So many failure modes can't be specified explicitly up front; only exploratory testing surfaces them.

And I want a dry run on difficult data migrations, too.
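The dry-run idea can be sketched simply: run the migration logic, report what would change, and only mutate data when explicitly told to. Here is a minimal illustration of that pattern; all names (`migrate_users`, the record shape) are hypothetical, not from any real migration tool.

```python
def migrate_users(records, dry_run=True):
    """Normalize email addresses; in dry-run mode, only report planned changes."""
    planned = []
    for rec in records:
        new_email = rec["email"].strip().lower()
        if new_email != rec["email"]:
            # Record what we would change, whether or not we apply it.
            planned.append((rec["id"], rec["email"], new_email))
            if not dry_run:
                rec["email"] = new_email  # mutate only outside dry-run mode
    return planned

records = [
    {"id": 1, "email": " Alice@Example.com "},
    {"id": 2, "email": "bob@example.com"},
]

# Dry run: nothing is mutated, but we see exactly what would change.
changes = migrate_users(records, dry_run=True)
```

The point is that the same code path produces the report and the real migration, so the dry run actually exercises what will ship.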