
“Proposal for making credible commitments to AIs.” by Cleo Nardo
30/06/2025
Acknowledgments: The core scheme here was suggested by Prof. Gabriel Weil.
There has been growing interest in the deal-making agenda: humans make deals with AIs (misaligned but lacking decisive strategic advantage) in which the AIs promise to be safe and useful for some fixed term (e.g. 2026-2028) and we promise to compensate them in the future, conditional on verifying (i) that the AIs were compliant, and (ii) that the AIs would spend the resources in an acceptable way.[1]
I think the deal-making agenda breaks down into two main subproblems:
- How can we make credible commitments to AIs?
- Would credible commitments motivate an AI to be safe and useful?
Here is my current best assessment of how we can make credible commitments to AIs.
[...]
The original text contained 2 footnotes which were omitted from this narration.
---
First published: June 27th, 2025
Source: https://www.lesswrong.com/posts/vxfEtbCwmZKu9hiNr/proposal-for-making-credible-commitments-to-ais
---
Narrated by TYPE III AUDIO.