The 8 Researchers Who Invented the Transformer AI Architecture

[HPP] Ashish Vaswani · December 20, 2025 · 18 min

The AI Bottleneck and a Radical Idea

💡 Before 2017, AI faced a "tyranny of time" due to recurrent neural networks (RNNs), which processed language sequentially, word by word.
🧠 This sequential bottleneck was a major frustration at Google Brain, preventing efficient use of vast datasets.
💬 Eight researchers, including Ashish Vaswani and Jacob Uszkoreit, challenged this paradigm, proposing a neural network that could process entire sentences simultaneously.
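
The contrast above can be sketched in a few lines of Python. This is a toy illustration, not code from the paper: the scalar weights `w_in` and `w_rec` and the helper names are hypothetical. The recurrent loop cannot start step t until step t-1 has finished, while each row of pairwise scores reads only the raw inputs, so the rows could be computed in any order or all at once.

```python
import math

def rnn_encode(tokens, w_in=0.5, w_rec=0.9):
    """Toy scalar RNN (hypothetical weights): step t needs the state from step t-1."""
    h = 0.0
    states = []
    for x in tokens:  # inherently sequential: each iteration waits on the previous h
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states

def score_row(tokens, i):
    """Scores of token i against every token; reads only the inputs, no earlier results."""
    return [tokens[i] * xj for xj in tokens]

def pairwise_scores(tokens, order=None):
    """Build all rows in the given order; the result does not depend on that order."""
    order = list(range(len(tokens))) if order is None else order
    rows = {i: score_row(tokens, i) for i in order}
    return [rows[i] for i in range(len(tokens))]

tokens = [1.0, 2.0, 3.0]
states = rnn_encode(tokens)
forward = pairwise_scores(tokens)
backward = pairwise_scores(tokens, order=[2, 1, 0])
print(states)
print(forward == backward)
```

Because no row waits on another, hardware such as a GPU can evaluate them simultaneously, which is the property the researchers were after.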

Inventing the Transformer Architecture

🚀 The team, joined by "magician" Noam Shazeer, developed the Transformer architecture, discarding RNNs entirely.
📝 Their groundbreaking 2017 paper, "Attention Is All You Need," built the model entirely on self-attention, letting it relate any two words in a sentence directly, regardless of how far apart they are.
🛠️ Multi-head attention let the model track several kinds of relationships between words at once, while Shazeer's scaling factor (dividing attention scores by the square root of the key dimension) kept the mechanism stable as models grew.
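
For readers who want to see the mechanism, here is a minimal pure-Python sketch of scaled dot-product attention, softmax(QKᵀ / √d_k)·V, with a simplified multi-head wrapper. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the learned projection matrices that real models apply per head are omitted, so each head simply attends over its own slice of the input vectors.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Every query is scored against every key; token order plays no role here."""
    d_k = len(keys[0])
    scale = 1.0 / math.sqrt(d_k)  # the scaling factor: divide scores by sqrt(d_k)
    outputs, weights = [], []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(wj * v[i] for wj, v in zip(w, values))
                        for i in range(len(values[0]))])
    return outputs, weights

def multi_head_attention(x, num_heads):
    """Simplified multi-head self-attention (assumption: no learned projections)."""
    d_model = len(x[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads
    result = [[] for _ in x]
    for h in range(num_heads):
        head_in = [row[h * d_head:(h + 1) * d_head] for row in x]
        head_out, _ = scaled_dot_product_attention(head_in, head_in, head_in)
        for t, vec in enumerate(head_out):
            result[t].extend(vec)  # concatenate the heads back to d_model
    return result

# Three toy token vectors with d_model = 4 (made-up numbers).
tokens = [[1.0, 0.0, 1.0, 0.0],
          [0.0, 1.0, 0.0, 1.0],
          [1.0, 1.0, 0.0, 0.0]]
out, attn = scaled_dot_product_attention(tokens, tokens, tokens)
mh_out = multi_head_attention(tokens, num_heads=2)
print(attn[0])
print(mh_out)
```

Without the 1/√d_k factor, dot products grow with the vector dimension and push the softmax toward near one-hot weights with very small gradients, which is the instability the scaling factor addresses.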

Revolutionizing AI Training

⚡ The Transformer drastically reduced training times, allowing models to be trained in days instead of weeks.
📊 It achieved state-of-the-art results in machine translation and solved the critical problem of parallelizability.
📈 By removing the sequential constraint, it made much larger models practical to train across massive GPU clusters.

Industry Impact and the Google Exodus

🌐 Google open-sourced the Transformer code, and the architecture became the foundation for OpenAI's GPT series, which demonstrated scaling laws and few-shot learning.
💰 The architecture fueled the demand for Nvidia's GPUs, making them crucial commodities in the tech world.
🚪 Due to Google's hesitation to release generative AI, all eight authors eventually left the company, forming a "PayPal mafia" of AI startups.

Founding New AI Ventures

🌟 The researchers founded influential companies: Essential AI (Vaswani, Parmar) for enterprise automation, Cohere (Gomez) for business LLMs, and Inceptive (Uszkoreit) for biological software design.
🌍 Other ventures include Sakana AI (Jones) exploring successor architectures, Near Protocol/Near AI (Polosukhin) for decentralized AI, and Character.ai (Shazeer) for AI personas.
✅ Łukasz Kaiser joined OpenAI, and Noam Shazeer later returned to Google DeepMind, highlighting the Transformer's enduring legacy and its impact on the AI landscape.

What’s Discussed

Transformer architecture, Recurrent Neural Networks (RNNs), Attention mechanisms, Self-attention, Multi-head attention, Parallel processing, Machine translation, OpenAI, GPT models, Scaling laws, Generative AI, Google Brain, Nvidia GPUs, AI startups, Natural Language Processing (NLP)
