ANDREJ KARPATHY 2025 LLM Review: RLVR, Jagged Intelligence, & The Vibe Coding Revolution

[HPP] Andrej Karpathy · December 21, 2025 · 35 min

The Rise of RLVR and Algorithmic Reasoning

  • 💡 Reinforcement Learning from Verifiable Rewards (RLVR) emerged as the new standard stage of LLM training, drawing rewards from objective, automatically checkable computational environments rather than from human preference ratings as in RLHF.
  • 🎯 This method allows for deep, intensive optimization in domains like math and code, leading to the spontaneous emergence of algorithmic reasoning strategies.
  • 🧠 Models learn to break complex problems into intermediate steps and to recover from errors, managing their intermediate results much like a working memory.
  • 📈 Compute allocation shifted from pre-training to these deep RL runs, enabling a new scaling law where test-time compute can increase intelligence on demand.
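
To make "verifiable reward" concrete, here is a minimal sketch of what such a reward signal could look like for math and code. It is an illustration only, not the setup used by any lab: the function names `math_reward` and `code_reward`, the tolerance, and the test-case format are all assumptions introduced here.

```python
import math


def math_reward(model_answer: str, reference: float, tol: float = 1e-6) -> float:
    """Hypothetical math reward: 1.0 if the answer parses to the reference value, else 0.0."""
    try:
        value = float(model_answer.strip())
    except ValueError:
        return 0.0  # unparseable answers earn nothing
    return 1.0 if math.isclose(value, reference, rel_tol=0.0, abs_tol=tol) else 0.0


def code_reward(source: str, func_name: str, cases: list) -> float:
    """Hypothetical code reward: fraction of (args, expected) test cases the code passes."""
    namespace: dict = {}
    try:
        # Assumption: a real system would run this in a sandbox, never in-process.
        exec(source, namespace)
        func = namespace[func_name]
    except Exception:
        return 0.0  # code that does not load or lacks the function earns nothing
    passed = 0
    for args, expected in cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing test case counts as a failure
    return passed / len(cases) if cases else 0.0


if __name__ == "__main__":
    candidate = "def add(a, b):\n    return a + b\n"
    print(math_reward(" 42 ", 42.0))
    print(code_reward(candidate, "add", [((1, 2), 3), ((0, 0), 0)]))
```

The point of the sketch is that the score comes from a program checking the output, not from a human or a learned preference model, which is what allows long optimization runs against it.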

Jagged Intelligence and Benchmark Challenges

  • 👻 LLMs are described as "summoned ghosts" with wildly uneven capabilities, excelling in verifiable domains but struggling in areas that are hard to verify, such as common sense.
  • ⚠️ This jagged performance stems from the hyper-specific optimization of RLVR, which lacks strong objective reward signals for non-verifiable domains.
  • 📊 The industry faced a benchmark crisis as labs "benchmaxed" by optimizing models specifically for verifiable tests, leading to a decoupling of benchmark scores from generalizable real-world capability.

The Thick LLM App Layer and Local Agents

  • 🛠️ A thick LLM app layer is essential for reliability, handling complex tasks like context engineering and orchestrating multiple LLM calls into directed acyclic graphs (DAGs).
  • 🚀 This layer verticalizes generalist models into specialists by integrating private data, sensors, actuators, and real-world feedback loops.
  • 💻 LLM agents that run on localhost, like Claude Code, get low-latency, high-fidelity access to a user's environment, which makes them a better architectural fit for the still brittle, jagged agents used in software development.
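
As a rough illustration of orchestrating multiple LLM calls into a DAG, the sketch below runs prompt steps in dependency order with Python's standard `graphlib`. The `call_llm` stub, the `run_dag` helper, and the three step names are hypothetical stand-ins introduced here; a real app layer would replace the stub with an actual model client and add retries, validation, and context management.

```python
from graphlib import TopologicalSorter


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; echoes the prompt so the sketch runs offline."""
    return f"<answer to: {prompt}>"


def run_dag(steps: dict, deps: dict, llm=call_llm) -> dict:
    """Run each step after its upstream steps, filling {name} placeholders with their outputs.

    steps: step name -> prompt template
    deps:  step name -> set of upstream step names (missing key means no dependencies)
    """
    graph = {name: set(deps.get(name, ())) for name in steps}
    results: dict = {}
    # static_order() yields every node after all of its predecessors;
    # it raises graphlib.CycleError if the graph is not acyclic.
    for name in TopologicalSorter(graph).static_order():
        context = {dep: results[dep] for dep in graph[name]}
        results[name] = llm(steps[name].format(**context))
    return results


if __name__ == "__main__":
    steps = {
        "extract": "List the key facts in the support ticket.",
        "draft": "Write a reply using these facts: {extract}",
        "review": "Check the reply against the facts. Facts: {extract} Reply: {draft}",
    }
    deps = {"draft": {"extract"}, "review": {"extract", "draft"}}
    for name, output in run_dag(steps, deps).items():
        print(name, "->", output)
```

Keeping the dependency graph explicit is one possible design choice: it makes the order of calls inspectable and lets independent steps be parallelized later without changing the step definitions.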

Vibe Coding and UI Evolution

  • Vibe coding emerged as a revolution, allowing users to create functional programs from high-level English instructions, making code ephemeral, malleable, and discardable.
  • 💰 This democratizes programming and enables professionals to build specialized, temporary tools with near-zero creation cost, shifting focus from boilerplate to high-level architecture.
  • 🖼️ The future of LLM interfaces, hinted at by Nano Banana, moves beyond text-based chat to a unified multimodal generative experience with deeply integrated text, image, and world knowledge.
  • 🌐 This aims to create intuitive, dynamic, spatially organized LLM GUIs that better align with human visual and spatial preferences.
Topics · 15 themes

What’s Discussed

Reinforcement Learning from Verifiable Rewards (RLVR) · LLM Training · Algorithmic Reasoning · Test-Time Compute · Jagged Intelligence · Benchmark Crisis · Benchmaxing · LLM App Layer · Context Engineering · Orchestration · Feedback Loops · LLM Agents · Local Host Paradigm · Vibe Coding · Multimodal Generative Experience