
"Anthropic’s “Hot Mess” paper overstates its case (and the blog post is worse)" by RobertM
February 4, 2026
Author's note: this is somewhat more rushed than ideal, but I think getting this out sooner is pretty important. Ideally, it would be a bit less snarky.
Anthropic[1] recently published a new piece of research: The Hot Mess of AI: How Does Misalignment Scale with Model Intelligence and Task Complexity? (arXiv, Twitter thread).
I have some complaints about both the paper and the accompanying blog post.
Outline:
(00:39) tl;dr
(01:42) Paper
(06:25) Blog
The original text contained 3 footnotes which were omitted from this narration.
---
First published:
February 4th, 2026
Source:
https://www.lesswrong.com/posts/ceEgAEXcL7cC2Ddiy/anthropic-s-hot-mess-paper-overstates-its-case-and-the-blog
---
Narrated by TYPE III AUDIO.
---
tl;dr
- The paper's abstract says that "in several settings, larger, more capable models are more incoherent than smaller models", but in most settings they are more coherent. This emphasis is even more exaggerated in the blog post and Twitter thread. I think this is pretty misleading.
- The paper's technical definition of "incoherence" is uninteresting[2], and the framing of the paper, blog post, and Twitter thread equivocates between it and the more normal English-language meaning of the term, which is extremely misleading.
- Section 5 of the paper (and, to a larger extent, the blog post and Twitter thread) attempts to draw conclusions about future alignment difficulties that are unjustified by the experimental results, and would be unjustified even if those results pointed in the other direction.
- The blog post is substantially LLM-written. I think this [...]