What Makes a Good AI Benchmark?

This brief from the Stanford Institute for Human-Centered Artificial Intelligence presents a novel framework for assessing the quality of AI benchmarks and scores 24 benchmarks against it. The framework is intended to give developers a practical, rigorous way to assess and improve benchmark quality.