scripod.com

Some thoughts on the Sutton interview

Show notes

I have a much better understanding of Sutton’s perspective now. I wanted to reflect on it a bit.
(00:00:00) - The steelman
(00:02:42) - TLDR of my current thoughts
(00:03:22) - Imitation learning is continuous with and complementary to RL
(00:08:26) - ...

Highlights

This discussion explores the evolving relationship between imitation learning and reinforcement learning in the development of advanced AI systems, emphasizing how current large language models fit into broader trajectories toward artificial general intelligence.
02:43
Imitation learning is continuous with and complementary to RL
03:22
Imitation learning can be seen as short-horizon reinforcement learning.
08:26
LLMs trained with outcome-based rewards learn very little per episode compared to biological learners.
10:33
If LLMs achieve AGI first, successor systems will likely be based on Richard Sutton's vision.

Chapters

The steelman
00:00
TLDR of my current thoughts
02:42
Imitation learning is continuous with and complementary to RL
03:22
Continual learning
08:26
Concluding thoughts
10:31

Transcript

Dwarkesh Patel: Boy, do you guys have a lot of thoughts about this interview! I've been thinking about it myself, and I think I have a much better understanding now of Sutton's perspective than I did during the interview itself. So I wanted to reflect on h...