Testing AI systems? The old rules no longer apply.

Non-deterministic systems cannot be assessed using deterministic methods. “Pass/Fail” is too narrow.

At #SwissTestingDay, Mike Mannion reflected on key themes:

➡️ probabilities over binary results
➡️ system behaviour over isolated outputs
➡️ testing as a strategic discipline

He also points to approaches like #PUnit for evaluating such systems.
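To make "probabilities over binary results" concrete, here is a minimal sketch in plain Java. It does not use PUnit's API, which these posts do not show. The class `PassRateCheck`, its methods, and the simulated 90% trial are all hypothetical names chosen for illustration. The idea is to run the same check many times and judge the observed pass rate against a threshold, instead of failing on a single bad run.

```java
import java.util.Random;
import java.util.function.BooleanSupplier;

public class PassRateCheck {

    /** Runs the trial `samples` times and returns the observed fraction of passes. */
    public static double passRate(BooleanSupplier trial, int samples) {
        if (samples <= 0) {
            throw new IllegalArgumentException("samples must be positive");
        }
        int passes = 0;
        for (int i = 0; i < samples; i++) {
            if (trial.getAsBoolean()) {
                passes++;
            }
        }
        return (double) passes / samples;
    }

    /** True when the observed pass rate reaches the required minimum. */
    public static boolean meetsThreshold(BooleanSupplier trial, int samples, double minPassRate) {
        return passRate(trial, samples) >= minPassRate;
    }

    public static void main(String[] args) {
        // Fixed seed so the demo is repeatable.
        Random rng = new Random(42);
        // Assumption: a stand-in for a non-deterministic call (e.g. an LLM)
        // that produces an acceptable answer roughly 90% of the time.
        BooleanSupplier simulatedTrial = () -> rng.nextDouble() < 0.9;

        double rate = passRate(simulatedTrial, 200);
        System.out.printf("observed pass rate: %.3f%n", rate);
        System.out.println("meets 0.80 threshold: " + (rate >= 0.80));
    }
}
```

In a real suite the trial would call the system under test and validate its response. The threshold and sample count are choices the team has to justify, not values this sketch can supply.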

Conference insights: https://dev.karakun.com/2026/04/02/Swiss-Testing-Day.html

#AITesting #SoftwareEngineering #QualityEngineering

Very happy to follow Mike's talk at @jugch on using #PUnit to mitigate flaky tests (especially in the context of LLMs) #java #community #ai #testing #resilience #llm

If you cannot test it reliably, you cannot scale it responsibly. 💥

Non-deterministic behaviour challenges classical #SoftwareTesting assumptions.

The same input does not always produce the same output, especially in #LLM-based systems.
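When outputs vary from run to run, an observed pass rate is only an estimate. One common statistical tool for this (an illustration, not necessarily what PUnit does) is the Wilson score interval, which gives a conservative lower bound on the true pass rate from a finite number of runs. The class and method names below are hypothetical, and the z value of 1.96 assumes a two-sided 95% confidence level.

```java
public class WilsonBound {

    /**
     * Lower end of the Wilson score interval for a binomial proportion.
     *
     * @param passes number of successful runs
     * @param n      total number of runs (must be positive)
     * @param z      standard normal quantile, e.g. 1.96 for 95% two-sided
     */
    public static double lowerBound(int passes, int n, double z) {
        if (n <= 0 || passes < 0 || passes > n) {
            throw new IllegalArgumentException("require 0 <= passes <= n and n > 0");
        }
        double pHat = (double) passes / n;
        double z2 = z * z;
        double denominator = 1.0 + z2 / n;
        double centre = pHat + z2 / (2.0 * n);
        double margin = z * Math.sqrt(pHat * (1.0 - pHat) / n + z2 / (4.0 * n * n));
        return (centre - margin) / denominator;
    }

    public static void main(String[] args) {
        // Example: 90 acceptable answers out of 100 runs.
        double lower = lowerBound(90, 100, 1.96);
        System.out.printf("lower bound on true pass rate: %.4f%n", lower);
    }
}
```

A test could then require the lower bound, not the raw rate, to clear the threshold, so that a lucky short run does not count as evidence of reliability.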

Tomorrow in Basel, Mike Mannion speaks at @jugch about using #PUnit to extend #JUnit with statistical testing approaches.

📍 03 March | 18:15 | Markthalle Basel
https://www.jug.ch/html/events/2026/testing-with-punit.html

See you there.

#SoftwareTesting #EngineeringExcellence

WekaWeka by P Unit (the official Music Video!) #WekaWeka #Wagengehaotena