The MAD Podcast with Matt Turck

The End of GPU Scaling? Compute & The Agent Era — Tim Dettmers (Ai2) & Dan Fu (Together AI)


Will AGI happen soon, or are we running into a wall?


In this episode, I’m joined by Tim Dettmers (Assistant Professor at CMU; Research Scientist at the Allen Institute for AI) and Dan Fu (Assistant Professor at UC San Diego; VP of Kernels at Together AI) to unpack two opposing frameworks from their essays: “Why AGI Will Not Happen” versus “Yes, AGI Can Happen.” Tim argues progress is constrained by physical realities like memory movement and the von Neumann bottleneck; Dan argues we’re still leaving massive performance on the table through utilization, kernels, and systems, and that today’s models are lagging indicators of the newest hardware and clusters.


Then we get practical: agents and the “software singularity.” Dan says agents have already crossed a threshold even for “final boss” work like writing GPU kernels. Tim’s message is blunt: use agents or be left behind. Both emphasize that the leverage comes from how you use them—Dan compares it to managing interns: clear context, task decomposition, and domain judgment, not blind trust.


We close with what to watch in 2026: hardware diversification, the shift toward efficient, specialized small models, and architecture evolution beyond classic Transformers—including state-space approaches already showing up in real systems.


Sources:

Why AGI Will Not Happen - https://timdettmers.com/2025/12/10/why-agi-will-not-happen/

Use Agents or Be Left Behind? A Personal Guide to Automating Your Own Work - https://timdettmers.com/2026/01/13/use-agents-or-be-left-behind/

Yes, AGI Can Happen – A Computational Perspective - https://danfu.org/notes/agi/


The Allen Institute for Artificial Intelligence

Website - https://allenai.org

X/Twitter - https://x.com/allen_ai


Together AI

Website - https://www.together.ai

X/Twitter - https://x.com/togethercompute


Tim Dettmers

Blog - https://timdettmers.com

LinkedIn - https://www.linkedin.com/in/timdettmers/

X/Twitter - https://x.com/Tim_Dettmers


Dan Fu

Blog - https://danfu.org

LinkedIn - https://www.linkedin.com/in/danfu09/

X/Twitter - https://x.com/realDanFu


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

Blog - https://mattturck.com

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck


(00:00) – Intro

(01:06) – Two essays, two frameworks on AGI

(01:34) – Tim’s background: quantization, QLoRA, efficient deep learning

(02:25) – Dan’s background: FlashAttention, kernels, alternative architectures

(03:38) – Defining AGI: what does it mean in practice?

(08:20) – Tim’s case: computation is physical, diminishing returns, memory movement

(11:29) – “GPUs won’t improve meaningfully”: the core claim and why

(16:16) – Dan’s response: utilization headroom (MFU) + “models are lagging indicators”

(22:50) – Pre-training vs post-training (and why product feedback matters)

(25:30) – Convergence: usefulness + diffusion (where impact actually comes from)

(29:50) – Multi-hardware future: NVIDIA, AMD, TPUs, Cerebras, inference chips

(32:16) – Agents: did the “switch flip” yet?

(33:19) – Dan: agents crossed the threshold (kernels as the “final boss”)

(34:51) – Tim: “use agents or be left behind” + beyond coding

(36:58) – “90% of code and text should be written by agents” (how to do it responsibly)

(39:11) – Practical automation for non-coders: what to build and how to start

(43:52) – Dan: managing agents like junior teammates (tools, guardrails, leverage)

(48:14) – Education and training: learning in an agent world

(52:44) – What Tim is building next (open-source coding agent; private repo specialization)

(54:44) – What Dan is building next (inference efficiency, cost, performance)

(55:58) – Mega-kernels + Together Atlas (speculative decoding + adaptive speedups)

(58:19) – Predictions for 2026: small models, open-source, hardware, modalities

(1:02:02) – Beyond transformers: state-space and architecture diversity

(1:03:34) – Wrap
