Software Engineering Radio - the podcast for professional software developers

SE Radio 703: Sahaj Garg on Low Latency AI

14/01/2026

Duration: 54:47

In this episode, Sahaj Garg, CTO of wispr.ai, joins SE Radio host Robert Blumen to talk about the challenges of building low-latency AI applications. They discuss latency's effect on consumer behavior as well as interactive applications. The conversation explores how to measure latency and how scale impacts it. Then Sahaj and Robert shift to themes around AI, including whether "AI" means LLMs or something broader, as they look at latency requirements and challenges around subtypes of AI applications. The final part of the episode explores techniques for managing latency in AI: speed vs. accuracy trade-offs; speed vs. cost; latency vs. cost; choosing the right model; reducing quantization; distillation; and guessing and validating.

Brought to you by IEEE Computer Society and IEEE Software magazine.

More episodes from "Software Engineering Radio - the podcast for professional software developers"

SE Radio 711: Scott Hanselman on AI-Assisted Development Tools

SE Radio 710: Marc Brooker on Spec-Driven AI Dev

SE Radio 709: Bryan Cantrill on the Data Center Control Plane

SE Radio 708: Jens Gustedt on C in 2026

SE Radio 707: Subhajit Paul on ERP Automation and AI

SE Radio 706: Yechezkel "Chez" Rabinovich on Observability Tool Migration Techniques

SE Radio 705: Murat Erder and Eoin Woods on Continuous Architecture

SE Radio 704: Sriram Panyam on System Design Interviews

SE Radio 703: Sahaj Garg on Low Latency AI

SE Radio 702: Derick Schaefer on Modern CLIs