BullshitBench, created by Peter Gostev, measures whether AI models challenge nonsensical prompts instead of confidently answering them.
https://petergpt.github.io/bullshit-benchmark/viewer/index.v2.html