https://www.linkedin.com/feed/update/urn:li:activity:7441018458999418880/

AI Limit Testing: Challenging Models to Their Limits | Peter Lawrey on LinkedIn
TL;DR I don't just expect AI to fail regularly; if it doesn't, I make the problem harder.

These days, software development is more about problem framing and intent definition: humans still need to decide which problems are worth solving, what “good” means, which constraints matter, and which trade-offs are acceptable.

However, I don't trust guardrails that let an AI check itself. Instead, much of my focus is on creating linters that provide the guardrails and are themselves tested and validated; otherwise, the rules you specify in markdown files are far too easy to ignore or game.

Put another way, if an AI reliably follows your instructions, you are probably not pushing it to its limit and seeing what it can really do. Tasks it handles reliably are the sort that will be fully automated in the next year or so. What I look for are tasks or problems with a 20%-80% failure rate that require many checks to provide any confidence in the result. These are the sort of tasks where a human will still be needed for the foreseeable future.
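
To make the "linters that are themselves tested and validated" idea concrete, here is a minimal sketch in Python. The rule (a TODO comment must carry an issue reference such as `TODO(#42)`), the names `lint` and `validate_linter`, and the sample lists are all hypothetical, chosen only for illustration; this is not the tooling the post refers to.

```python
import re

# Hypothetical rule: a TODO must carry an issue reference, e.g. "TODO(#42)".
TODO_WITHOUT_ISSUE = re.compile(r"\bTODO\b(?!\(#\d+\))")


def lint(source: str) -> list[tuple[int, str]]:
    """Return (line_number, message) for every line that breaks the rule."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        if TODO_WITHOUT_ISSUE.search(line):
            findings.append((number, "TODO without issue reference"))
    return findings


# The linter is validated against samples with known answers, so a rule
# that silently stops firing (or fires on clean code) is caught.
MUST_FLAG = ["# TODO fix this later", "x = 1  # TODO: tidy up"]
MUST_PASS = ["# TODO(#42) fix this later", "x = 1"]


def validate_linter() -> bool:
    """True only if every known-bad sample is flagged and every known-good one passes."""
    flags_bad = all(lint(sample) for sample in MUST_FLAG)
    passes_good = all(not lint(sample) for sample in MUST_PASS)
    return flags_bad and passes_good
```

The point of the design is that the check lives in code with its own pass/fail evidence, rather than in a markdown instruction the model can ignore; the model's output is judged by `lint`, and `lint` is judged by `validate_linter`.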
