
As AI systems move into areas like transport, healthcare, finance, and policing, regulators want proof that these systems are safe. The simplest way to provide it is to set clear metrics: crashes per million miles, error rates per thousand decisions, false arrests prevented. Numbers are neat, trackable, and hold companies accountable.
But here’s the catch. Once a number becomes the target, systems learn to hit it in ways that don’t always mean real safety. This is Goodhart’s law — “when a measure becomes a target, it ceases to be a good measure.” A self-driving car might avoid reporting certain incidents, or a diagnostic AI might over-treat just to keep its error rate low.
If regulators wait to act until the harms are clearer, they fall into the Collingridge dilemma: by the time we understand the risks well enough to design better rules, the technology is already entrenched and harder to shape. Act too early, and we freeze progress with crude or irrelevant rules.
The conundrum: Do we anchor AI safety in hard numbers that can be gamed but at least force accountability, or in flexible principles that capture real intent but are so vague they may stall progress and get politicized? And if both paths have failure baked in, is the deeper trap that any attempt to govern AI will either ossify too soon or drift into loopholes too late?