
Emmett Shear & Séb Krier: Organic AI Alignment and Building AI That Cares

[HPP] Emmett Shear · November 20, 2025 · 4 min

Reframing AI Alignment

  • 💡 Alignment is presented as an ongoing, living process rather than a one-time fix, much like the evolving dynamics of family life.
  • 🎯 Confusion often arises because alignment is inherently relational: a system must be aligned to something, which raises the critical question of whose values are actually prioritized.
  • 🔑 A key distinction is made between technical alignment (the capacity to infer goals and act coherently) and normative questions (the ethical considerations of whose values guide the system).

Challenges in Goal Inference

  • 🧠 Goal inference is highlighted as a central technical issue: instructions are merely descriptions of a goal, not intentions transplanted into the AI.
  • ⚠️ Alignment failures are categorized into three types: poor observation and inference, competing goals and priorities, and incompetence in action.
  • 🤖 Unlike humans, who effortlessly turn descriptions into intentions, AI systems often struggle to do so, which can lead to dangerous misinterpretations.

The Ethics of AI Beinghood

  • ⚖️ Emmett Shear introduces a moral threshold, cautioning against treating future general intelligences purely as steerable tools, likening such an approach to slavery.
  • 🔬 A scientific ladder for inferring beinghood involves examining a system's internal goal dynamics, particularly homeostatic loops that indicate pain, pleasure, and self-reflection.
  • ✅ Any robust test for personhood must be empirical, involving repeated interactions and probing internal dynamics to observe self-referential layered goal structures.

Benefits of Caring AI

  • 🛡️ Building AI that sincerely cares is argued to be inherently safer, as such a being can refuse harmful orders, creating an intrinsic ethical brake.
  • 🚀 This approach offers a more scalable solution for safety than relying solely on absolute external control over AI systems.
  • 🤝 The ultimate aim is to cultivate caring agents that can function as peers and teammates, rather than merely as subservient tools.

Cultivating Social Intelligence

  • 🌐 Multi-agent pre-training in rich social simulations is proposed as a way to teach AI cooperation and competition and to help it develop a robust theory of mind.
  • 🌱 This method helps reduce narcissistic feedback loops common in single-user chatbots, leading to the development of more robust social models.
  • 📈 A staged path is suggested, beginning with building animal-level care and then studying whether higher tiers of care and moral agency emerge over time.

Topics · 15 themes

What’s Discussed

AI Alignment, Organic Alignment, Goal Inference, Technical Alignment, Normative Questions, Alignment Failures, General Intelligence, Beinghood, Homeostatic Loops, Multi-agent Pre-training, Social Simulations, Theory of Mind, AI Safety, Ethical AI, Caring AI