Mastodawn

Arda Kılıçdağı

BullshitBench, created by Peter Gostev, measures whether AI models challenge nonsensical prompts instead of confidently answering them.

https://petergpt.github.io/bullshit-benchmark/viewer/index.v2.html

https://github.com/petergpt/bullshit-benchmark

#llm #bench #ai #bullshitbench #benchmark
