Jeremy Howard: Why Finetuning is Flawed and the Future of Continued Pre-training in AI
[HPP] Jeremy Howard · October 9, 2025 · 6 min
The Evolution of AI Training
- 💡 Jeremy Howard, co-founder of Fast.ai, initially pioneered finetuning with the groundbreaking ULMFiT technique.
- 🚀 In 2018, ULMFiT revolutionized AI by enabling powerful models with less data and computing power, contributing to AI democratization.
- ✅ ULMFiT involved general pre-training on large datasets, followed by domain-specific and task-specific finetuning for specialized applications.
The Flaw: Catastrophic Forgetting
- ⚠️ Howard now argues that traditional finetuning suffers from catastrophic forgetting, a major unforeseen problem.
- 🧠 This phenomenon causes AI models to forget previously learned skills when acquiring new ones, effectively trading old knowledge for new.
- 📉 Howard's example is Code Llama: finetuned for coding, it became expert at code but lost its general knowledge of history and science.
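The trade-off described above can be illustrated with a deliberately tiny model. The sketch below is not from the video: it assumes a two-weight linear model and two made-up single-example "tasks" (`task_a`, `task_b`), chosen so that one set of weights can satisfy both. Training on the tasks one after another overwrites the old skill, while training on both together keeps it.

```python
def predict(w, x):
    """Linear model: dot product of weights and inputs."""
    return sum(wi * xi for wi, xi in zip(w, x))

def mse(w, data):
    """Mean squared error of the model on a list of (x, y) pairs."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def train(w, data, steps=2000, lr=0.1):
    """Full-batch gradient descent on mean squared error; returns new weights."""
    w = list(w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            for i, xi in enumerate(x):
                grad[i] += 2 * err * xi / len(data)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Hypothetical toy tasks (assumptions for illustration only).
task_a = [((1.0, 1.0), 2.0)]  # the "old skill"
task_b = [((1.0, 0.0), 3.0)]  # the "new skill"

w0 = [0.0, 0.0]

# Staged training: learn A, then finetune on B alone.
w_after_a = train(w0, task_a)
w_sequential = train(w_after_a, task_b)

# Mixed training: see A and B together from the start.
w_mixed = train(w0, task_a + task_b)

loss_a_before = mse(w_after_a, task_a)         # close to 0: A is learned
loss_a_sequential = mse(w_sequential, task_a)  # close to 4: A is forgotten
loss_a_mixed = mse(w_mixed, task_a)            # close to 0: A is retained
loss_b_mixed = mse(w_mixed, task_b)            # close to 0: B is learned too

print(loss_a_before, loss_a_sequential, loss_a_mixed, loss_b_mixed)
```

A real language model is far more complex, so this only shows the mechanism: updates driven solely by the new task move shared weights away from a solution that served the old one.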
Introducing Continued Pre-training
- 🔑 Howard proposes continued pre-training as a revolutionary solution to overcome the limitations of finetuning.
- 🔄 Unlike staged finetuning, this method views the entire training process as a single, continuous flow from start to finish.
- 🎯 Key rules include combining all relevant data types (coding, general text, Q&A) from the outset and never discarding data to ensure knowledge retention.
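The rule of mixing all data types from the outset could look something like the sampler below. This is a minimal sketch under stated assumptions, not Howard's implementation: the function `mixed_stream`, the toy corpora, and the mixture weights are all hypothetical, and a real pipeline would stream tokenized documents rather than short strings.

```python
import random
from collections import Counter

def mixed_stream(sources, weights, n_samples, seed=0):
    """Yield (source_name, example) pairs, drawing from every source by weight.

    No source is ever dropped: each draw can come from any of them.
    """
    rng = random.Random(seed)
    names = list(sources)
    probs = [weights[name] for name in names]
    for _ in range(n_samples):
        name = rng.choices(names, weights=probs, k=1)[0]
        yield name, rng.choice(sources[name])

# Hypothetical toy corpora standing in for real datasets.
sources = {
    "code": ["def add(a, b): return a + b", "for i in range(3): print(i)"],
    "general_text": ["Paris is the capital of France.", "Ice is frozen water."],
    "qa": ["Q: What is 2 + 2? A: 4", "Q: What color is the sky? A: Blue"],
}

# Assumed mixture weights; the source does not give specific proportions.
weights = {"code": 0.5, "general_text": 0.3, "qa": 0.2}

batch = list(mixed_stream(sources, weights, n_samples=1000))
counts = Counter(name for name, _ in batch)
print(counts)
```

To specialize a model under this scheme, one would shift the weights toward the target domain over time while keeping every source's weight above zero, so that general text and Q&A never vanish from training.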
Impact and Future Implications
- 📈 Continued pre-training helps overcome the alignment tax (the general capability a model gives up when it is tuned toward a narrower goal), allowing models to specialize without losing their general knowledge.
- 💡 This approach promises smarter and more stable AI models that retain broad capabilities while excelling in specific tasks.
- 🌐 It aims to make high-quality AI accessible even for developers with limited computing resources, aligning with Howard's original goal of AI democratization.
Topics
Artificial Intelligence, Finetuning, Jeremy Howard, Fast.ai, ULMFiT, Catastrophic Forgetting, Continued Pre-training, Large Language Models (LLMs), AI Democratization, Alignment Tax, Data Quality, Computing Power