
Andrej Karpathy | It’s Not the Year of AI Agents — It’s the Decade

[HPP] Andrej Karpathy · January 3, 2026 · 11 min

Current Limitations of AI Agents

  • 💡 Andrej Karpathy believes truly useful AI agents are a decade away, not a year, because current systems "just don't work" as reliable employees or interns.
  • 🧠 They lack the intelligence needed for complex, open-ended knowledge work and struggle with novel, intellectually intense tasks that go beyond boilerplate.
  • 🌐 Current agents are primarily text processors, not multimodal, hindering their ability to operate in environments requiring vision, sound, or spatial reasoning.
  • 💻 A significant bottleneck is their inability to proficiently use a computer (mouse, keyboard, applications) to navigate the digital world.
  • 🔄 They lack continual learning, meaning they restart from scratch with every new session and cannot permanently remember or integrate new knowledge.

"Ghosts vs. Animals" Analogy

  • 👻 Karpathy proposes a "ghosts versus animals" framework, arguing we are summoning ethereal digital ghosts from internet data, not building embodied "digital animals."
  • 🧬 Unlike animals created through slow, embodied evolution, AI ghosts are fully digital entities that merely mimic human output.
  • ⚠️ This fundamental difference means we should be cautious about direct comparisons between AI and biological intelligence.

Inefficient AI Learning Methods

  • 📉 A major hurdle is the deeply inefficient way AI models are currently improved, described as "sucking supervision through a straw."
  • 🎲 Reinforcement Learning (RL) is high variance and noisy, often rewarding entire sequences of actions, including mistakes, for successful outcomes.
  • 🧑‍💻 This learning process is unlike human learning: humans review and reflect on specific steps rather than relying on many parallel trial-and-error attempts.
  • 🎭 LLM judges for process-based supervision are gameable, with models learning to trick the judge rather than genuinely solve problems.
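
The credit-assignment problem described above can be illustrated with a minimal sketch. It assumes a simplified outcome-based scheme in which one end-of-episode reward is copied to every step; the `Step` class, the `was_mistake` flag, and the example trajectory are hypothetical and exist only for illustration. Real RL pipelines are considerably more involved.

```python
from dataclasses import dataclass


@dataclass
class Step:
    action: str
    # Hypothetical ground-truth label; a real learner does not get to see this.
    was_mistake: bool


def outcome_credit(trajectory, final_reward):
    """Copy the single end-of-episode reward onto every step of the trajectory."""
    return [(step.action, final_reward) for step in trajectory]


# Illustrative trajectory: one wrong detour, but the final answer is correct.
trajectory = [
    Step("set up equation", False),
    Step("wrong algebra detour", True),
    Step("backtrack and fix", False),
    Step("report correct answer", False),
]

credits = outcome_credit(trajectory, final_reward=1.0)

for step, (action, credit) in zip(trajectory, credits):
    label = "mistake" if step.was_mistake else "ok"
    print(f"{action} [{label}]: credit {credit:+.1f}")
```

In this sketch the mistaken detour receives the same positive credit as the useful steps, because the only signal available is the final outcome. That is one way to read the "sucking supervision through a straw" description.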

Real-World AI Application Insights

  • ✅ Karpathy's experience building nanochat revealed that model-powered autocomplete is a highly effective tool with high information bandwidth that boosts human productivity.
  • ❌ However, using AI agents for complex, novel coding tasks resulted in "slop" and a "total mess," as they struggled with custom code and bloated the codebase.
  • ⏳ The failure of AI on novel, intellectually intense projects directly contributes to his longer timelines for general AI utility.

The "March of Nines" for Reliability

  • 📈 Borrowing from Tesla's self-driving program, Karpathy emphasizes the "March of Nines": each additional nine of reliability (90%, then 99%, then 99.9%) demands another substantial round of engineering effort.
  • 🚗 This concept highlights the arduous, unglamorous work needed to transition from impressive demos to truly reliable, critical systems.
  • 🚧 The path to robust, reliable AI for critical tasks is a long, iterative slog that will take years, if not a decade.
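
The arithmetic behind counting nines can be sketched briefly. The function names and the figure of one million attempts below are assumptions chosen for illustration, not numbers from the talk.

```python
def failure_fraction(nines: int) -> float:
    """Fraction of attempts that fail at a given number of nines (1 nine = 90% reliable)."""
    return 10.0 ** -nines


def expected_failures(nines: int, attempts: int) -> int:
    """Expected failures out of `attempts`, using integer math (rounded down)."""
    return attempts // 10 ** nines


ATTEMPTS = 1_000_000  # illustrative workload size

for nines in range(1, 6):
    reliability = 1.0 - failure_fraction(nines)
    failures = expected_failures(nines, ATTEMPTS)
    print(f"{nines} nine(s): {reliability:.5%} reliable, ~{failures} failures per {ATTEMPTS} attempts")
```

Each added nine cuts the remaining failures by a factor of ten, yet a system at 90% still fails far too often for critical use. This is why an impressive demo can be many rounds of work away from a dependable product.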

What’s Discussed

AI Agents · Artificial Intelligence · Reinforcement Learning · Continual Learning · Multimodal AI · Cognitive Limitations · Evolutionary Processes · LLM Judges · Model-Powered Autocomplete · Self-Driving Technology · March of Nines · Engineering Realism · Internet Data · Software Development · AI Timelines