EU study warns over the shortcomings of AI benchmarking. Paper by EU researchers highlights problems with how AI models are currently measured and urges regulators to signal which benchmarks are trustworthy.
"Measuring AI capabilities and risks is a challenge, and benchmarks have been found to promise too much, be easily gamed, and measure the wrong thing"
https://www.euractiv.com/section/tech/news/eu-study-warns-over-the-shortcomings-of-ai-benchmarking/?utm_source=mastodon&utm_medium=dlvr.it
#AI #benchmarking #benchmarks