Ex-OpenAI Researcher On Why He Left, His Honest AGI Timeline, & The Limits of Scaling RL

[HPP] Jerry Tworek · January 29, 2026 · 1h 3min

Scaling AI Paradigms

🚀 Scaling pre-training and reinforcement learning (RL) offers real and predictable benefits, producing models that excel at the specific things they are trained on, such as next-token prediction or particular skills.
⚠️ A primary limitation of current scaling methods is the models' struggle with generalization beyond what they have been explicitly trained for, highlighting a need for new approaches that yield more results with less data.

Challenges in Reinforcement Learning

🎯 RL is highly effective when there's clear and immediate feedback on performance, such as in coding or math competitions, where success or failure is easily quantifiable.
📚 It becomes significantly harder to apply RL to subjective or long-term tasks like writing a good book or starting a successful company, where feedback is ambiguous, delayed, or difficult to measure consistently.

Rethinking AGI and Continual Learning

🧠 The speaker lengthened his AGI timeline after working with scaled RL, having seen that current models become "hopeless" when they fail and cannot update their internal knowledge based on their errors.
🌱 Continual learning is identified as a necessary element for AGI, enabling models to work through difficulties autonomously and get unstuck, similar to human intelligence.
🛠️ Achieving continual learning requires fundamental robustness in the training process, as current deep learning models are fragile and prone to

What’s Discussed

Reinforcement Learning (RL) · Pre-training · Generalization (in AI models) · Artificial General Intelligence (AGI) · Continual Learning · Deep Learning Models · OpenAI · ChatGPT · GPT-4 · Reasoning Models (O1, O3) · Codex · AI Coding · Robotics · Work Automation · AI Research
