"Anthropic’s “Hot Mess” paper overstates its case (and the blog post is worse)" by RobertM

February 4, 2026

LessWrong (Curated & Popular)

Author's note: this is somewhat more rushed than ideal, but I think getting this out sooner is pretty important. Ideally, it would be a bit less snarky.

Anthropic[1] recently published a new piece of research: The Hot Mess of AI: How Does Misalignment Scale with Model Intelligence and Task Complexity? (arXiv, Twitter thread).

I have some complaints about both the paper and the accompanying blog post.

tl;dr

The paper's abstract says that "in several settings, larger, more capable models are more incoherent than smaller models", but in most settings they are more coherent. This emphasis is even more exaggerated in the blog post and Twitter thread. I think this is pretty misleading.
The paper's technical definition of "incoherence" is uninteresting[2], and the framing of the paper, blog post, and Twitter thread equivocates between it and the more ordinary English-language meaning of the term, which is extremely misleading.
Section 5 of the paper (and, to a larger extent, the blog post and Twitter thread) attempts to draw conclusions about future alignment difficulties that are unjustified by the experimental results, and would be unjustified even if the experimental results pointed in the other direction.
The blog post is substantially LLM-written. I think this [...]

---

Outline:

(00:39) tl;dr

(01:42) Paper

(06:25) Blog

The original text contained 3 footnotes which were omitted from this narration.

---

First published:
February 4th, 2026

Source:
https://www.lesswrong.com/posts/ceEgAEXcL7cC2Ddiy/anthropic-s-hot-mess-paper-overstates-its-case-and-the-blog

---

Narrated by TYPE III AUDIO.
