
Ray Summit 2025: Bringing AI to the Physical World with Chelsea Finn from Physical Intelligence

Chelsea Finn · November 7, 2025 · 22 min

The Challenge of Physical AI

  • ⚠️ Bringing robots to end products is extremely difficult, often requiring a dedicated company for each application.
  • 🛠️ Each application demands custom hardware, custom software, custom movement patterns, and handling of countless edge cases, all built from scratch.
  • 🎯 Physical Intelligence aims to solve this by developing a general-purpose model for any robot to perform any task, similar to large language models.

Physical Intelligence's Approach

  • 💡 Simply scaling data from industrial automation, YouTube, or simulation is insufficient because of limited diversity, embodiment gaps, or insufficient realism.
  • 🚀 The core strategy is to scale up real data collected on real robots, often initiated through teleoperation.
  • ✅ Early work with π0 (Pi Zero) demonstrated a single neural network autonomously folding laundry and adapting to other tasks through fine-tuning.

Advancing Generalization and Language Following

  • 🏠 Initial models struggled to generalize to new, unseen environments (e.g., different homes, lighting, objects).
  • 📊 This was addressed by collecting diverse real-world data from over 100 unique rooms and combining it with other robot data, showing the power of generalist models.
  • 💬 Language following improved from 20% to 80% after the architecture was modified to preserve knowledge from pre-trained vision-language models.

Handling Open-Ended Instructions

  • 🧩 Explored hierarchical models where a high-level vision-language model interprets open-ended prompts, and a low-level model executes specific commands.
  • 🧠 Language models augment robot data to synthetically generate diverse prompts, enabling the high-level model to follow complex instructions.
  • ✅ This allows robots to make customized sandwiches, handle interjections, and perform situated corrections based on user input.
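
The hierarchical split described above can be sketched in a few lines. This is a toy illustration only, not Physical Intelligence's implementation: the `RECIPES` table, the `plan`, `execute`, and `run` functions, and the keyword-based "no X" handling are all hypothetical stand-ins for what would be a vision-language model (high level) and a learned robot policy (low level).

```python
# Toy sketch of a two-level controller. All names here are hypothetical;
# in a real system both levels would be learned models, not lookup tables.

RECIPES = {
    "sandwich": ["pick up bread", "add cheese", "add lettuce", "close sandwich"],
}


def plan(prompt: str) -> list[str]:
    """High level: turn an open-ended prompt into short, concrete commands."""
    text = prompt.lower()
    steps = [s for key, recipe in RECIPES.items() if key in text for s in recipe]
    # Crude stand-in for understanding a constraint such as "no cheese".
    words = text.replace(",", " ").split()
    banned = {words[i + 1] for i, w in enumerate(words[:-1]) if w == "no"}
    return [s for s in steps if not any(b in s for b in banned)]


def execute(command: str) -> str:
    """Low level: stand-in for a policy that maps one command to robot actions."""
    return f"executed: {command}"


def run(prompt: str) -> list[str]:
    """Feed each high-level command to the low-level executor in order."""
    return [execute(command) for command in plan(prompt)]


if __name__ == "__main__":
    for line in run("Make me a sandwich, no cheese"):
        print(line)
```

The point of the split is that the low level only ever sees short, specific commands, so handling a new kind of request is mostly a matter of what the high level emits.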

Remaining Hurdles in Robotics

  • ⏳ Current state-of-the-art robot models often lack memory, leading to repetitive actions or forgetting previous steps.
  • ⚡ Achieving human-speed task completion is a significant challenge; many demonstration videos are shown sped up.
  • 📈 There's a critical need to increase reliability beyond 80% success rates for real-world deployment.
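
One way to see why an 80% success rate falls short is to look at how failures compound over a multi-step task. The sketch below assumes, purely for illustration, that each step succeeds independently with the same probability; the talk summary does not state that the 80% figure is per step, and the function names are hypothetical.

```python
# Illustrative arithmetic only: assumes independent steps with equal
# per-step success probability, which real robot tasks need not satisfy.


def task_success(per_step: float, steps: int) -> float:
    """Probability that all `steps` steps succeed in a row."""
    return per_step ** steps


def required_per_step(target: float, steps: int) -> float:
    """Per-step success needed for the whole task to reach `target`."""
    return target ** (1.0 / steps)


if __name__ == "__main__":
    print(f"5 steps at 80% each: {task_success(0.8, 5):.2f}")
    print(f"per-step rate for 99% over 5 steps: {required_per_step(0.99, 5):.4f}")
```

Under these assumptions, a five-step task at 80% per step finishes successfully only about a third of the time.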

Topics Discussed

Artificial Intelligence (AI), Robotics, General-purpose models, Large Language Models (LLMs), Teleoperation, Neural Networks, Fine-tuning, Generalization in robotics, Vision-Language Models, Hierarchical models, Open-ended prompts, Reinforcement Learning, Physical AI, Real-world data, Hardware development