Jeremy Howard: Why Finetuning is Flawed and the Future of Continued Pre-training in AI

[HPP] Jeremy Howard · October 9, 2025 · 6 min

The Evolution of AI Training

  • 💡 Jeremy Howard, co-founder of Fast.ai, helped pioneer finetuning for language models with the ULMFiT technique.
  • 🚀 In 2018, ULMFiT showed that a pretrained language model could be finetuned into a strong task-specific model with far less labeled data and computing power, contributing to AI democratization.
  • ✅ ULMFiT involved general pre-training on large datasets, followed by domain-specific and task-specific finetuning for specialized applications.

The Flaw: Catastrophic Forgetting

  • ⚠️ Howard now argues that traditional finetuning suffers from catastrophic forgetting, a major unforeseen problem.
  • 🧠 This phenomenon causes AI models to forget previously learned skills when acquiring new ones, effectively trading old knowledge for new.
  • 📉 The example given is Code Llama, which became expert at coding but is described as having lost general knowledge of history and science after finetuning.
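
The trade described in the bullets above can be reproduced in miniature. The following is a toy sketch, not anything shown in the video: it assumes a two-weight linear model and two hypothetical tasks (`TASK_A`, `TASK_B`) that share a weight, and it compares staged training (task A, then only task B) with training on both tasks together. All names and numbers here are illustrative assumptions.

```python
# Toy illustration only: a two-weight linear model, not a language model.
# Task A wants w[0] == 1; task B wants w[0] + w[1] == 0.
# Both are satisfied together by w = [1, -1].
TASK_A = [((1.0, 0.0), 1.0)]
TASK_B = [((1.0, 1.0), 0.0)]


def predict(w, x):
    """Linear prediction w . x for a two-dimensional input."""
    return w[0] * x[0] + w[1] * x[1]


def loss(w, data):
    """Mean squared error of the model on a list of (x, y) pairs."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)


def train(w, data, steps=1000, lr=0.1):
    """Plain SGD on squared error, cycling through the examples in order."""
    w = list(w)
    for step in range(steps):
        x, y = data[step % len(data)]
        err = predict(w, x) - y
        w[0] -= lr * 2 * err * x[0]
        w[1] -= lr * 2 * err * x[1]
    return w


# Staged: learn task A first, then train only on task B.
w_staged = train(train([0.0, 0.0], TASK_A), TASK_B)

# Mixed: train on both tasks from the start.
w_mixed = train([0.0, 0.0], TASK_A + TASK_B, steps=2000)

print("staged:", loss(w_staged, TASK_A), loss(w_staged, TASK_B))
print("mixed: ", loss(w_mixed, TASK_A), loss(w_mixed, TASK_B))
```

In this toy setup the staged run ends near w = [0.5, -0.5]: task B is solved, but the task A loss climbs back to about 0.25. The mixed run converges to a solution that fits both tasks. A real language model is far more complex, so this only illustrates the mechanism, not its size in practice.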

Introducing Continued Pre-training

  • 🔑 Howard proposes continued pre-training as the way to overcome the limitations of finetuning.
  • 🔄 Unlike staged finetuning, this method views the entire training process as a single, continuous flow from start to finish.
  • 🎯 Key rules include combining all relevant data types (coding, general text, Q&A) from the outset and never discarding data to ensure knowledge retention.
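
The two rules above (mix all data types from the outset, never drop a source) could be sketched as a weighted sampler. This is a minimal sketch under stated assumptions, not Howard's implementation: the source names, example texts, weights, the `floor` parameter, and the idea of shifting the mixture gradually over training are all hypothetical choices made for illustration.

```python
import random
from collections import Counter

# Hypothetical data sources; real corpora would be far larger.
SOURCES = {
    "general_text": ["The Roman Empire fell in 476.", "Water boils at 100 C."],
    "code": ["def add(a, b): return a + b", "for i in range(3): print(i)"],
    "qa": ["Q: Capital of France? A: Paris.", "Q: 2 + 2? A: 4."],
}

# Assumed mixtures at the start and end of training.
START = {"general_text": 0.6, "code": 0.2, "qa": 0.2}
END = {"general_text": 0.1, "code": 0.7, "qa": 0.2}


def mixture_at(progress, start, end, floor=0.05):
    """Interpolate between two mixtures; progress runs from 0.0 to 1.0.

    The floor is applied before normalization so that no source's
    weight ever reaches zero (no data type is discarded).
    """
    raw = {
        name: max(floor, (1 - progress) * start[name] + progress * end[name])
        for name in start
    }
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def sample_batch(sources, weights, batch_size, rng):
    """Draw a batch of (source_name, example) pairs according to weights."""
    names = list(sources)
    probs = [weights[name] for name in names]
    batch = []
    for _ in range(batch_size):
        name = rng.choices(names, weights=probs, k=1)[0]
        batch.append((name, rng.choice(sources[name])))
    return batch


TOTAL_STEPS = 50
rng = random.Random(0)
counts = Counter()
for step in range(TOTAL_STEPS):
    weights = mixture_at(step / (TOTAL_STEPS - 1), START, END)
    for name, _text in sample_batch(SOURCES, weights, batch_size=8, rng=rng):
        counts[name] += 1

print(counts)
```

The emphasis moves toward code as training progresses, but general text and Q&A keep appearing in every phase, which is the property the rules above ask for. Whether a gradual shift or a fixed mixture works better is not stated in the source.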

Impact and Future Implications

  • 📈 Continued pre-training helps overcome the alignment tax, allowing models to specialize without losing their general knowledge.
  • 💡 This approach promises smarter and more stable AI models that retain broad capabilities while excelling in specific tasks.
  • 🌐 It aims to make high-quality AI accessible even for developers with limited computing resources, aligning with Howard's original goal of AI democratization.

Topics

Artificial Intelligence, Finetuning, Jeremy Howard, Fast.ai, ULMFiT, Catastrophic Forgetting, Continued Pre-training, Large Language Models (LLMs), AI Democratization, Alignment Tax, Data Quality, Computing Power