
Off-the-Shelf Large Language Models Are Unreliable Judges – Jonathan Choi (USC / WashU)
With the rapid rise of artificial intelligence, large language models (LLMs) are increasingly being considered for tasks once thought to be uniquely human—including legal interpretation. The idea of “AI judges” suggests appealing possibilities: consistent, fast, and ostensibly unbiased answers to legal questions. But how reliable are these models? Can their judgments truly be trusted? And do they withstand careful empirical scrutiny?
In this episode of the CLE Vlog Series, Prof. Jonathan Choi (University of Southern California & Washington University in St. Louis) joins Alessandro Tacconelli (ETH Zurich) to discuss his paper, “Off-the-Shelf Large Language Models Are Unreliable Judges.” Prof. Choi presents findings from a series of empirical experiments designed to test how well LLMs perform as legal interpreters. His results reveal that model judgments are highly sensitive to prompt phrasing, output processing methods, and training choices. Moreover, post-training adjustments in today’s most widely used models can push LLMs’ assessments far from empirically grounded predictions of language use. These insights raise serious questions about the credibility of LLMs in legal interpretation and cast doubt on their ability to capture the “ordinary meaning” of legal texts.
Paper Reference:
Jonathan Choi – University of Southern California / Washington University in St. Louis
Off-the-Shelf Large Language Models Are Unreliable Judges
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5188865
Audio Credits for Trailer:
AllttA by AllttA
https://youtu.be/ZawLOcbQZ2w