AI's Dark Side Is Only a Nudge Away

23.9.2025

The Quanta Podcast

In order to trust machines with important jobs, we need a high level of confidence that they share our values and goals. Recent work shows that this “alignment” can be brittle, superficial, even unstable. In one study, a few training adjustments led a popular chatbot to recommend murder. On this episode, contributing writer Stephen Ornes tells host Samir Patel about what this research reveals.

Audio coda from The National Archives and Records Administration.