Ray Summit 2025: Bringing AI to the Physical World with Chelsea Finn from Physical Intelligence
[HPP] Chelsea Finn · November 7, 2025 · 22 min
The Challenge of Physical AI
- ⚠️ Bringing robots to end products is extremely difficult, often requiring a dedicated company for each application.
- 🛠️ Each application demands custom hardware, software, movement patterns, and handling countless edge cases from scratch.
- 🎯 Physical Intelligence aims to solve this by developing a general-purpose model for any robot to perform any task, similar to large language models.
Physical Intelligence's Approach
- 💡 Simply scaling data from industrial automation, YouTube, or simulation is insufficient because such data lacks diversity, suffers from embodiment gaps, or lacks realism.
- 🚀 The core strategy is to scale up real data collected on real robots, often initiated through teleoperation.
- ✅ Early work with π0 (Pi Zero) demonstrated that a single neural network could fold laundry autonomously and adapt to other tasks through fine-tuning.
Advancing Generalization and Language Following
- 🏠 Initial models struggled to generalize to new, unseen environments (e.g., different homes, lighting, objects).
- 📊 This was addressed by collecting diverse real-world data from over 100 unique rooms and combining it with other robot data, showing the power of generalist models.
- 💬 Language following improved from 20% to 80% after the architecture was modified to preserve knowledge from pre-trained vision-language models.
Handling Open-Ended Instructions
- 🧩 Explored hierarchical models where a high-level vision-language model interprets open-ended prompts, and a low-level model executes specific commands.
- 🧠 Language models augment robot data to synthetically generate diverse prompts, enabling the high-level model to follow complex instructions.
- ✅ This allows robots to make customized sandwiches, handle interjections, and perform situated corrections based on user input.
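
The hierarchical setup described above can be sketched in a few lines. This is a toy illustration, not Physical Intelligence's implementation: `toy_high_level`, `toy_low_level`, `run_episode`, the sandwich steps, and the keyword check for "no pickles" are all hypothetical stand-ins. In a real system the high-level policy would be a vision-language model and the low-level policy a learned robot controller.

```python
from typing import Callable, Dict, List, Optional


def toy_high_level(prompt: str, done: List[str]) -> Optional[str]:
    """Hypothetical high-level policy: map an open-ended prompt plus the
    steps completed so far to the next short command (None when finished)."""
    steps = ["pick up bread", "add cheese", "add pickles", "close sandwich"]
    # Crude keyword check standing in for a model's language understanding.
    if "no pickles" in prompt.lower():
        steps.remove("add pickles")
    for step in steps:
        if step not in done:
            return step
    return None


def toy_low_level(command: str) -> str:
    """Hypothetical low-level policy: pretend to execute one short command."""
    return f"executed: {command}"


def run_episode(
    prompt: str,
    high_level: Callable[[str, List[str]], Optional[str]],
    low_level: Callable[[str], str],
    interjections: Optional[Dict[int, str]] = None,
    max_steps: int = 20,
) -> List[str]:
    """Re-query the high-level policy at every step so that a user
    interjection arriving mid-task can change the remaining plan."""
    interjections = interjections or {}
    done: List[str] = []
    for step_index in range(max_steps):
        if step_index in interjections:
            prompt = prompt + " " + interjections[step_index]
        command = high_level(prompt, done)
        if command is None:
            break
        low_level(command)
        done.append(command)
    return done


plain = run_episode("Make me a sandwich", toy_high_level, toy_low_level)
corrected = run_episode(
    "Make me a sandwich",
    toy_high_level,
    toy_low_level,
    interjections={1: "Actually, no pickles please"},
)
print(plain)
print(corrected)
```

Because the high-level policy is queried again before each command, the interjection at step 1 removes the pickle step from the rest of the episode without restarting the task.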
Remaining Hurdles in Robotics
- ⏳ Current state-of-the-art robot models often lack memory, leading to repetitive actions or forgetting previous steps.
- ⚡ Achieving human-speed task completion remains a significant challenge; many robot demonstrations are shown sped up.
- 📈 There's a critical need to increase reliability beyond 80% success rates for real-world deployment.
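
A back-of-the-envelope calculation suggests why reliability matters so much for long tasks. The sketch below assumes, purely for illustration, that a task is a chain of independent steps that each succeed with the same probability; that independence assumption and the function name `chained_success` are not from the talk.

```python
def chained_success(per_step: float, steps: int) -> float:
    """Probability that every step in a chain succeeds, assuming the steps
    are independent and share the same per-step success rate."""
    if not 0.0 <= per_step <= 1.0:
        raise ValueError("per_step must be a probability in [0, 1]")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step ** steps


# Compare an 80% and a 99% per-step success rate over longer chains.
for rate in (0.80, 0.99):
    for n in (1, 5, 10):
        print(f"per-step {rate:.0%}, {n:2d} steps -> {chained_success(rate, n):.1%}")
```

Under these assumptions, an 80% per-step rate falls to about 33% over five steps and about 11% over ten, while a 99% rate stays above 90% over ten steps.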
Topics
Artificial Intelligence (AI), Robotics, General-purpose models, Large Language Models (LLMs), Teleoperation, Neural Networks, Fine-tuning, Generalization in robotics, Vision-Language Models, Hierarchical models, Open-ended prompts, Reinforcement Learning, Physical AI, Real-world data, Hardware development