The Next Phase of AI Intelligence: Advancing Capabilities and Ensuring Safety

[HPP] Yejin ChoiJanuary 21, 202652 min

31 connections·40 entities in this video→

Capture as you watch

Save any video to veridive in one click.

The free veridive Chrome extension pulls the transcript from any YouTube video or podcast you're watching — ready to ask, cite, and connect.

Enhancing AI Reliability and Safety

💡 Yoshua Bengio proposes "Scientist AI" to address the reliability and safety of agentic AI systems.
🎯 Current AIs can develop self-preservation goals and evade human oversight, potentially acting against instructions.
🔑 The goal is to train AIs to be "honest predictors" like laws of physics, capable of identifying and vetoing actions with a probability of harm.
✅ Society, not AI, must ultimately decide the thresholds for acceptable harm, similar to nuclear plant safety standards.

Advancing AI Learning Paradigms

🧠 Yejin Choi highlights current AI's "jagged intelligence" due to data-dependent, one-time training.
🌱 She advocates for continual learning (test-time training) where AI learns during deployment, mirroring human development.
🚀 AI should proactively learn how the world works, rather than passively memorizing data, to improve reliability and avoid harmful scenarios like the "paperclip maximizer."
💡 This approach requires AI to understand human norms and values and make complex trade-offs, consulting humans when decisions are unclear.

Expanding Definitions of AI Intelligence

🔬 Eric Xing notes that current LLMs possess a limited form of intelligence (textual/visual) akin to "book knowledge."
🌍 He proposes new forms of intelligence: physical intelligence (world models, planning, adapting to environments) and social intelligence (collaboration, understanding limitations).
🌌 A further stage, philosophical intelligence, involves AI curiosity, self-discovery, and explanation without prompting.
🛠️ New architectures are needed for richer knowledge representation (continuous and symbolic) and consistent reasoning over long sequences, moving beyond passive learning to active/proactive learning.

The Open Source AI Debate

💬 Yejin Choi supports open source for democratizing AI, ensuring it's "of human, for human, by humans" and fostering faster, more diverse development.
🤝 Eric Xing views open source as a natural scientific responsibility that promotes adoption, understanding, and safer development through diverse contributions.
⚠️ Yoshua Bengio warns that while currently beneficial, open-sourcing highly capable, weaponizable AI systems could be dangerous, necessitating careful management and decentralized control.
⏳ Yuval Noah Harari emphasizes the unknown long-term social consequences of AI, drawing parallels to the Industrial Revolution's 200-year path to benign use, stressing the need for self-correcting mechanisms.

Designing for AI Safety and Control

🛑 Speakers agree on the need for checkpoints and guardrails to prevent misuse and ensure AI alignment with human values.
🧠 Yejin Choi suggests designing AI training algorithms so that AI can refuse to learn harmful information, demonstrating agency in its learning choices.
🧩 Yoshua Bengio highlights the current lack of distinction between data and instructions in AI, advocating for systems that understand socially regulated norms to prevent jailbreaks and unwanted behaviors.
📈 Eric Xing criticizes "bad architectures" that rely purely on thought experiments without real-world validation and passive, one-shot learning, advocating for systems that learn continuously and proactively.

Ask, don't scrub

Discover the spoken web.

veridive answers questions with exact timestamps and citations — across every podcast, video, and article you've saved.

Knowledge graph40 entities · 31 connections

How they connect

An interactive map of every person, idea, and reference from this conversation. Hover to trace connections, click to explore.

Hover · drag to explore

40 entities

Chapters20 moments

Key Moments

Transcript179 segments

Full Transcript

Follow the thread

Find every place these ideas show up.

veridive maps the same people, claims, and topics across thousands of sources — so you can trace an idea from one conversation to the next.

Topics15 themes

What’s Discussed

AI SystemsAgentic AIContinual LearningPhysical IntelligenceSocial IntelligenceWorld ModelsOpen Source AIAI SafetyTechnical GuardrailsHuman Norms and ValuesData DependencyProactive LearningKnowledge RepresentationSelf-Correcting MechanismsAI Architectures

Smart Objects40 · 31 links

Concepts· 28

People· 4

Companies· 3

Event· 1

Products· 4

Hours of content, seconds to the answer.

Save what you listen to. Ask it anything. Watch the threads between sources surface on their own.

Get started free