Last Week in AI podcast

#216 - Grok 4, Project Rainier, Kimi K2

14.07.2025

Duration: 1:42:10

Our 216th episode with a summary and discussion of last week's big AI news!
Recorded on 07/11/2025

Hosted by Andrey Kurenkov and Jeremie Harris.
Feel free to email us your questions and feedback at [email protected] and/or [email protected]

Read our text newsletter and comment on the podcast at https://lastweekin.ai/.

In this episode:

xAI launches Grok 4 with breakthrough performance across benchmarks, becoming the first true frontier model outside established labs, alongside a $300/month subscription tier
Grok's alignment challenges emerge with antisemitic responses, highlighting the difficulty of steering models toward "truth-seeking" without harmful biases
Perplexity launches an AI-powered browser and OpenAI is reportedly about to release one, both aimed at competing with Google Chrome and signaling a major shift in how users interact with AI systems
METR study finds that AI tools slow down experienced developers by about 20% on complex tasks, contradicting expectations and anecdotal reports of productivity gains

Timestamps + Links:

(00:00:10) Intro / Banter
(00:01:02) News Preview

Tools & Apps

(00:01:59) Elon Musk's xAI launches Grok 4 alongside a $300 monthly subscription | TechCrunch
(00:15:28) Elon Musk’s AI chatbot is suddenly posting antisemitic tropes
(00:29:52) Perplexity launches Comet, an AI-powered web browser | TechCrunch
(00:32:54) OpenAI is reportedly releasing an AI browser in the coming weeks | TechCrunch
(00:33:27) Replit Launches New Feature for its Agent, CEO Calls it ‘Deep Research for Coding’
(00:34:40) Cursor launches a web app to manage AI coding agents
(00:36:07) Cursor apologizes for unclear pricing changes that upset users | TechCrunch

Applications & Business

Projects & Open Source

Research & Advancements

(01:02:14) Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
(01:07:58) Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
(01:13:03) Mitigating Goal Misgeneralization with Minimax Regret
(01:17:01) Correlated Errors in Large Language Models
(01:20:31) What skills does SWE-bench Verified evaluate?

Policy & Safety

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.
