The MAD Podcast with Matt Turck

The End of GPU Scaling? Compute & The Agent Era — Tim Dettmers (Ai2) & Dan Fu (Together AI)


Will AGI happen soon, or are we running into a wall?


In this episode, I’m joined by Tim Dettmers (Assistant Professor at CMU; Research Scientist at the Allen Institute for AI) and Dan Fu (Assistant Professor at UC San Diego; VP of Kernels at Together AI) to unpack two opposing frameworks from their essays: “Why AGI Will Not Happen” versus “Yes, AGI Can Happen.” Tim argues progress is constrained by physical realities like memory movement and the von Neumann bottleneck; Dan argues we’re still leaving massive performance on the table through utilization, kernels, and systems, and that today’s models are lagging indicators of the newest hardware and clusters.


Then we get practical: agents and the “software singularity.” Dan says agents have already crossed a threshold even for “final boss” work like writing GPU kernels. Tim’s message is blunt: use agents or be left behind. Both emphasize that the leverage comes from how you use them—Dan compares it to managing interns: clear context, task decomposition, and domain judgment, not blind trust.


We close with what to watch in 2026: hardware diversification, the shift toward efficient, specialized small models, and architecture evolution beyond classic Transformers—including state-space approaches already showing up in real systems.


Sources:

Why AGI Will Not Happen - https://timdettmers.com/2025/12/10/why-agi-will-not-happen/

Use Agents or Be Left Behind? A Personal Guide to Automating Your Own Work - https://timdettmers.com/2026/01/13/use-agents-or-be-left-behind/

Yes, AGI Can Happen – A Computational Perspective - https://danfu.org/notes/agi/


The Allen Institute for Artificial Intelligence

Website - https://allenai.org

X/Twitter - https://x.com/allen_ai


Together AI

Website - https://www.together.ai

X/Twitter - https://x.com/togethercompute


Tim Dettmers

Blog - https://timdettmers.com

LinkedIn - https://www.linkedin.com/in/timdettmers/

X/Twitter - https://x.com/Tim_Dettmers


Dan Fu

Blog - https://danfu.org

LinkedIn - https://www.linkedin.com/in/danfu09/

X/Twitter - https://x.com/realDanFu


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

Blog - https://mattturck.com

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck


(00:00) – Intro

(01:06) – Two essays, two frameworks on AGI

(01:34) – Tim’s background: quantization, QLoRA, efficient deep learning

(02:25) – Dan’s background: FlashAttention, kernels, alternative architectures

(03:38) – Defining AGI: what does it mean in practice?

(08:20) – Tim’s case: computation is physical, diminishing returns, memory movement

(11:29) – “GPUs won’t improve meaningfully”: the core claim and why

(16:16) – Dan’s response: utilization headroom (MFU) + “models are lagging indicators”

(22:50) – Pre-training vs post-training (and why product feedback matters)

(25:30) – Convergence: usefulness + diffusion (where impact actually comes from)

(29:50) – Multi-hardware future: NVIDIA, AMD, TPUs, Cerebras, inference chips

(32:16) – Agents: did the “switch flip” yet?

(33:19) – Dan: agents crossed the threshold (kernels as the “final boss”)

(34:51) – Tim: “use agents or be left behind” + beyond coding

(36:58) – “90% of code and text should be written by agents” (how to do it responsibly)

(39:11) – Practical automation for non-coders: what to build and how to start

(43:52) – Dan: managing agents like junior teammates (tools, guardrails, leverage)

(48:14) – Education and training: learning in an agent world

(52:44) – What Tim is building next (open-source coding agent; private repo specialization)

(54:44) – What Dan is building next (inference efficiency, cost, performance)

(55:58) – Mega-kernels + Together Atlas (speculative decoding + adaptive speedups)

(58:19) – Predictions for 2026: small models, open-source, hardware, modalities

(1:02:02) – Beyond transformers: state-space and architecture diversity

(1:03:34) – Wrap
