LessWrong (Curated & Popular) podcast

"Anthropic’s “Hot Mess” paper overstates its case (and the blog post is worse)" by RobertM

Author's note: this is somewhat more rushed than ideal, but I think getting this out sooner is pretty important. Ideally, it would be a bit less snarky.

Anthropic[1] recently published a new piece of research: The Hot Mess of AI: How Does Misalignment Scale with Model Intelligence and Task Complexity? (arXiv, Twitter thread).

I have some complaints about both the paper and the accompanying blog post.

tl;dr

  • The paper's abstract says that "in several settings, larger, more capable models are more incoherent than smaller models", but in most settings they are more coherent. This emphasis is even more exaggerated in the blog post and Twitter thread. I think this is pretty misleading.
  • The paper's technical definition of "incoherence" is uninteresting[2], and the framing of the paper, blog post, and Twitter thread equivocates between it and the ordinary English-language meaning of the term, which is extremely misleading.
  • Section 5 of the paper (and, to a larger extent, the blog post and Twitter thread) attempts to draw conclusions about future alignment difficulties that are unjustified by the experimental results, and would be unjustified even if those results pointed in the other direction.
  • The blog post is substantially LLM-written. I think this [...]
---

Outline:

(00:39) tl;dr

(01:42) Paper

(06:25) Blog

The original text contained 3 footnotes which were omitted from this narration.

---

First published:
February 4th, 2026

Source:
https://www.lesswrong.com/posts/ceEgAEXcL7cC2Ddiy/anthropic-s-hot-mess-paper-overstates-its-case-and-the-blog

---



Narrated by TYPE III AUDIO.

---

