Robot Learning: Data Scaling and Policy Improvement
[HPP] Pieter Abbeel · October 17, 2025 · 54 min
The Challenge of Robot Deployment
- 🤖 Traditional robot learning pipelines focus on data collection and policy training, but often treat deployment merely as an evaluation phase.
- ⚠️ Current deployment methods, relying on manual supervision by PhD students, are not scalable for industrial applications.
- 💡 The speaker emphasizes that deployment is a crucial data generation process, where every robot trajectory, successful or not, provides valuable information for policy improvement.
Amada: Human-in-the-Loop Policy Improvement
- 👨‍💻 The Amada framework introduces a human-in-the-loop system to enable scalable real-world deployment and adaptation.
- 🚨 It features Float, an autonomous online failure detection module that uses an optimal transport (OT) cost to identify errors and provide early warnings.
- 🔄 An adaptive rewinding mechanism resets the robot to the state before a failure, allowing operators to provide corrective human interventions and collect high-quality data.
- 📈 Experiments with a multi-robot factory setup show Amada consistently improves policy performance and generalization, while significantly reducing human intervention rates over time.
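The detection and rewinding steps above can be illustrated with a minimal sketch. The summary only states that Float relies on an OT cost, so everything else here is an assumption made for illustration: the sliding window, the time-aligned comparison against expert demonstrations, the hand-written Sinkhorn solver, both thresholds, and the toy 2-D states. The function names (`sinkhorn_cost`, `window_costs`, `first_failure`, `rewind_target`) are hypothetical and are not taken from Amada.

```python
import numpy as np

def sinkhorn_cost(x, y, reg=0.1, n_iters=100):
    """Entropic OT cost between two uniformly weighted point clouds."""
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)  # squared distances
    a = np.full(len(x), 1.0 / len(x))
    b = np.full(len(y), 1.0 / len(y))
    # Scale the regularizer by the largest cost so exp() cannot underflow to zero.
    K = np.exp(-C / (reg * (C.max() + 1e-12)))
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    plan = u[:, None] * K * v[None, :]
    return float((plan * C).sum())

def window_costs(rollout, expert_demos, window=5):
    """OT cost of each rollout window against the time-aligned expert states."""
    T, d = rollout.shape
    costs = np.zeros(T)  # the first window-1 steps have no full window yet
    for t in range(window - 1, T):
        lo, hi = t - window + 1, t + 1
        reference = expert_demos[:, lo:hi].reshape(-1, d)
        costs[t] = sinkhorn_cost(rollout[lo:hi], reference)
    return costs

def first_failure(costs, threshold):
    """First timestep whose cost exceeds the threshold, or None."""
    over = np.nonzero(costs > threshold)[0]
    return int(over[0]) if over.size else None

def rewind_target(costs, failure_t, safe_threshold):
    """Latest timestep before the flag whose cost was still below safe_threshold."""
    safe = np.nonzero(costs[:failure_t] < safe_threshold)[0]
    return int(safe[-1]) if safe.size else 0

# Toy data (assumed): experts move along a straight line in a 2-D state space.
rng = np.random.default_rng(0)
T, d = 50, 2
line = np.stack([np.linspace(0.0, 1.0, T), np.zeros(T)], axis=1)
expert_demos = line + rng.normal(0.0, 0.002, size=(5, T, d))
good = line + rng.normal(0.0, 0.002, size=(T, d))
bad = good.copy()
bad[30:, 1] += 0.1 * np.arange(1, T - 30 + 1)  # drifts off course from step 30

THRESHOLD, SAFE_THRESHOLD = 0.05, 0.02  # illustrative values, not tuned
costs_good = window_costs(good, expert_demos)
costs_bad = window_costs(bad, expert_demos)
failure_t = first_failure(costs_bad, THRESHOLD)
rewind_t = rewind_target(costs_bad, failure_t, SAFE_THRESHOLD)
print("nominal rollout flagged:", first_failure(costs_good, THRESHOLD))
print("failure flagged at step", failure_t, "-> rewind to step", rewind_t)
```

In this sketch the rewind point is simply the last step whose cost was still low, which is where an operator would take over to provide a correction. A real system would more plausibly compute the cost on learned observation embeddings rather than raw states, and would calibrate the thresholds on held-out successful rollouts.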
SOUL: Autonomous Policy Self-Improvement
- 🚀 To achieve greater automation, the SOUL framework focuses on robot policy self-improvement through efficient exploration in the real world.
- 📉 Traditional diffusion policies often suffer from mode collapse, generating repetitive failures, and standard action-level exploration can lead to jerky, unsafe motions.
- 🧭 SOUL proposes manifold exploration, constraining exploration to the task manifold to generate diverse yet smooth and valid actions.
- 🧠 An information bottleneck creates a well-shaped latent space, ensuring exploration focuses on task-relevant factors and improves sample efficiency.
- ✅ The system achieves higher success rates and smoother motions than previous methods, and it supports human-guided exploration without teleoperation.
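The contrast between action-level exploration and exploration on a task manifold can be shown with a small numerical sketch. This is not SOUL's model. As an assumption for illustration, a 2-D PCA subspace fitted to toy smooth action chunks stands in for the learned latent space, with the limited latent dimension acting as a crude analogue of the information bottleneck. All names (`encode`, `decode`, `jerkiness`) and noise scales are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
H = 32  # action-chunk horizon (assumed)
t = np.linspace(0.0, 1.0, H)

# Toy "demonstrations": smooth 1-D action chunks from a two-parameter family.
amps = rng.uniform(0.5, 1.5, size=200)
phases = rng.uniform(0.0, np.pi, size=200)
demos = amps[:, None] * np.sin(np.pi * t[None, :] + phases[:, None])

# Linear latent space via PCA, a stand-in for a learned, bottlenecked encoder.
mean = demos.mean(axis=0)
_, _, vt = np.linalg.svd(demos - mean, full_matrices=False)
basis = vt[:2]  # (2, H): keep only the two task-relevant directions

def encode(actions):
    return (actions - mean) @ basis.T

def decode(latents):
    return mean + latents @ basis

def jerkiness(actions):
    """Mean squared second difference along the time axis."""
    return float(np.mean(np.diff(actions, n=2, axis=-1) ** 2))

base = demos[0]

# Action-level exploration: independent noise on every timestep.
action_samples = base + rng.normal(0.0, 0.2, size=(16, H))

# Manifold exploration: perturb the latent code, then decode.
latent_samples = decode(encode(base) + rng.normal(0.0, 1.0, size=(16, 2)))

print("jerkiness, action noise:", jerkiness(action_samples))
print("jerkiness, latent noise:", jerkiness(latent_samples))
print("diversity, latent noise:", float(latent_samples.std(axis=0).mean()))
```

Because every decoded sample is a combination of smooth basis chunks, the latent perturbations yield varied but smooth trajectories, while per-timestep noise produces large second differences. A learned nonlinear encoder would replace the PCA step in practice.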
Topics
Robot learning, Data scaling, Policy improvement, Imitation learning, Policy deployment, Human-in-the-loop systems, Autonomous failure detection, Optimal transport, Multi-robot systems, Policy generalization, Online exploration, Manifold exploration, Diffusion policy, Latent space, Sample efficiency