AI benchmarks are broken and the industry keeps using them anyway, study finds
Benchmarks are supposed to measure AI model performance objectively. But according to an analysis by Epoch AI, results depend heavily on how the test is run. The research organization identifies numerous variables that are rarely disclosed but significantly affect outcomes.