Mastodawn

Testing AI systems? The old rules no longer apply.

Non-deterministic systems cannot be assessed using deterministic methods. “Pass/Fail” is too narrow.

At #SwissTestingDay, Mike Mannion reflected on key themes:

➡️ probabilities over binary results
➡️ system behaviour over isolated outputs
➡️ testing as a strategic discipline

He also points to approaches like #PUnit for evaluating such systems.

Very happy to follow Mike on his talk at @jugch on #PUnit to mitigate those flaky tests (esp. in the context of LLMs) #java #community #ai #testing #resilience #llm

If you cannot test it reliably, you cannot scale it responsibly. 💥

Non-deterministic behaviour challenges classical #SoftwareTesting assumptions.

Same input does not always mean same output — especially in #LLM-based systems.

Tomorrow in Basel, Mike Mannion speaks at @jugch about using #PUnit to extend #JUnit with statistical testing approaches.

See you there.

We're in #Kenya this afternoon with #hiphop group #PUnit.

YouTube