That ChatGPT passes exams is much more a reflection on exams than information about ChatGPT.

#WittgensteinsRuler

GPT-4 and professional benchmarks: the wrong answer to the wrong question

OpenAI may have tested on the training data. Besides, human benchmarks are meaningless for bots.

AI Snake Oil
@adamchainz
"Try this one weird trick to pass every exam"
@nntaleb
@nntaleb A lot of exams really just test memorized responses. That is not exactly the wrong way to vet knowledge, but it is certainly not the most reliable. It is just very easy to mimic.

@nntaleb One problem is that the testing process is almost never blind.

Evaluation tends to give the benefit of the doubt that would not be granted to a human student.

@nntaleb I scored well on the SAT and GRE and passed the CPA without working in the industry; the whole process was just finding the tricks. Tests are so formalized that it’s easy to limit your studying to what will be on the test. For the CPA, it was memorizing some acronyms and working a few identical problems over and over. For the SAT essay question, it was memorizing a few “historical illustrations” that you can work into any topic.