Ray Summit 2025: Bringing AI to the Physical World with Chelsea Finn from Physical Intelligence

[HPP] Chelsea Finn · November 7, 2025 · 22 min

The Challenge of Physical AI

⚠️ Bringing robots to end products is extremely difficult, often requiring a dedicated company for each application.
🛠️ Each application demands custom hardware, software, movement patterns, and handling countless edge cases from scratch.
🎯 Physical Intelligence aims to solve this by developing a general-purpose model for any robot to perform any task, similar to large language models.

Physical Intelligence's Approach

💡 Simply scaling data from industrial automation, YouTube, or simulation is insufficient, because these sources lack diversity, suffer from embodiment gaps, or lack realism.
🚀 The core strategy is to scale up real data collected on real robots, often initiated through teleoperation.
✅ Early work with Pi Zero demonstrated a single neural network autonomously folding laundry and adapting to other tasks through fine-tuning.

Advancing Generalization and Language Following

🏠 Initial models struggled to generalize to new, unseen environments (e.g., different homes, lighting, objects).
📊 This was addressed by collecting diverse real-world data from over 100 unique rooms and combining it with other robot data, showing the power of generalist models.
💬 Language following improved from 20% to 80% after the architecture was modified to preserve knowledge from pre-trained vision-language models.

Handling Open-Ended Instructions

🧩 The team explored hierarchical models in which a high-level vision-language model interprets open-ended prompts and a low-level model executes specific commands.
🧠 Language models augment the robot data with synthetically generated, diverse prompts, enabling the high-level model to follow complex instructions.
✅ This allows robots to make customized sandwiches, handle interjections, and perform situated corrections based on user input.
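
The hierarchical split described above can be illustrated with a minimal sketch. Everything here is hypothetical: `high_level_plan` is a keyword-matching stand-in for a high-level vision-language model, and `LowLevelPolicy` merely logs commands instead of driving a robot. It is not Physical Intelligence's implementation, only an illustration of how an open-ended prompt and a user interjection might be turned into short commands for a low-level executor.

```python
from dataclasses import dataclass, field
from typing import List


def high_level_plan(prompt: str, interjection: str = "") -> List[str]:
    """Hypothetical stand-in for a high-level vision-language model.

    Maps an open-ended prompt (plus an optional user interjection)
    to a list of short commands for the low-level policy.
    """
    steps = ["pick up bread", "place bread on plate"]
    text = (prompt + " " + interjection).lower()
    for ingredient in ("cheese", "ham", "lettuce", "tomato"):
        # Add an ingredient only if it is mentioned and not ruled out
        # with a phrase such as "no tomato".
        if ingredient in text and f"no {ingredient}" not in text:
            steps.append(f"add {ingredient}")
    steps.append("place bread on top")
    return steps


@dataclass
class LowLevelPolicy:
    """Hypothetical low-level executor; records commands instead of moving a robot."""
    log: List[str] = field(default_factory=list)

    def execute(self, command: str) -> bool:
        self.log.append(command)
        return True


def run_task(prompt: str, policy: LowLevelPolicy, interjection: str = "") -> List[str]:
    """Run the high-level plan one command at a time through the low-level policy."""
    for command in high_level_plan(prompt, interjection):
        policy.execute(command)
    return policy.log
```

Calling `run_task("Make me a ham and cheese sandwich", LowLevelPolicy(), interjection="no tomato please")` yields a command list that includes cheese and ham but skips tomato. In a real system the high-level model would also condition on camera images, which this sketch omits.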

Remaining Hurdles in Robotics

⏳ Current state-of-the-art robot models often lack memory, leading to repetitive actions or forgetting previous steps.
⚡ Achieving human-speed task completion is a significant challenge, as many demonstrations are sped up.
📈 There's a critical need to increase reliability beyond 80% success rates for real-world deployment.
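
A short calculation can make the reliability gap concrete. The sketch below is illustrative arithmetic only: the function names, the 100 attempts, and the 5 chained steps are assumptions rather than figures from the talk, and attempts are assumed to be independent.

```python
def expected_failures(success_rate: float, attempts: int) -> float:
    """Expected number of failed attempts, assuming independent attempts."""
    return (1.0 - success_rate) * attempts


def chained_success_rate(step_success: float, num_steps: int) -> float:
    """Probability that every step of a multi-step task succeeds,
    assuming each step succeeds independently with the same probability."""
    return step_success ** num_steps


# Illustrative numbers only (assumptions, not data from the talk).
print(round(expected_failures(0.80, 100)))     # about 20 failures per 100 attempts
print(round(chained_success_rate(0.80, 5), 3)) # chance that 5 chained steps all succeed
```

Under these assumptions, an 80% success rate means roughly one failure in every five attempts, and chaining several such steps lowers the end-to-end success rate further.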

Topics (15 themes)

Artificial Intelligence (AI), Robotics, General-purpose models, Large Language Models (LLMs), Teleoperation, Neural Networks, Fine-tuning, Generalization in robotics, Vision-Language Models, Hierarchical models, Open-ended prompts, Reinforcement Learning, Physical AI, Real-world data, Hardware development
