Some thoughts on the Sutton interview

Dwarkesh Podcast

Oct 04

Shownote

I have a much better understanding of Sutton’s perspective now. I wanted to reflect on it a bit.

(00:00:00) - The steelman
(00:02:42) - TLDR of my current thoughts
(00:03:22) - Imitation learning is continuous with and complementary to RL
(00:08:26) - ...

Highlights

This discussion explores the evolving relationship between imitation learning and reinforcement learning in the development of advanced AI systems, emphasizing how current large language models fit into broader trajectories toward artificial general intelligence.

02:43

Imitation learning is continuous with and complementary to RL

03:22

Imitation learning can be seen as short-horizon reinforcement learning.

08:26

LLMs trained with outcome-based rewards learn very little per episode compared to biological learners.

10:33

If LLMs achieve AGI first, successor systems will likely be based on Richard Sutton's vision.

Chapters

The steelman