Modular AI Models: Adding Languages Like Lego Blocks with Pathway's Adrian Kosowski

Super Data Science: ML & AI Podcast with Jon Krohn · October 13, 2025 · 8 min · 423 views

Modular AI Architecture: BDH

  • 💡 The BDH architecture allows separate language models, such as an English model and a French model, to be concatenated into a single multilingual model with sparse activation.
  • 🧠 This contrasts with transformers, where connecting models is not straightforward, and gives BDH a simpler dimension along which to scale.
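
One way to picture the concatenation claim is the toy sketch below. It assumes each language model is reduced to a single ReLU layer with its own inputs and its own neurons, and that "concatenating" means stacking the two weight matrices block-diagonally. The weights and the names `W_EN`, `W_FR`, and `block_diag` are hypothetical; this is an illustration of the general idea, not Pathway's BDH implementation.

```python
# Toy illustration only: NOT the BDH implementation.
# Each "model" is one ReLU layer; rows are neurons, columns are input features.

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def block_diag(a, b):
    """Stack two weight matrices so each keeps its own inputs and neurons."""
    a_cols, b_cols = len(a[0]), len(b[0])
    top = [row + [0.0] * b_cols for row in a]
    bottom = [[0.0] * a_cols + row for row in b]
    return top + bottom

# Hypothetical weights for two separately built models.
W_EN = [[1.0, -0.5], [0.5, 1.0], [-1.0, 0.25]]  # 3 "English" neurons
W_FR = [[0.75, 0.5], [-0.25, 1.0]]              # 2 "French" neurons

# The joint model has 5 neurons and 4 input features.
W_JOINT = block_diag(W_EN, W_FR)

def activations(x_en, x_fr):
    """Run the joint layer on concatenated English and French features."""
    return relu(matvec(W_JOINT, x_en + x_fr))

english_only = activations([1.0, 0.0], [0.0, 0.0])
french_only = activations([0.0, 0.0], [0.0, 1.0])

print(english_only)
print(french_only)
```

In this sketch an English-only input leaves every French neuron at zero and vice versa, which is the sense in which the joint layer is sparsely activated. Neither block needs retraining when the other is added, because the off-diagonal weights are zero.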

Performance and Scale

  • 🚀 BDH models, like the 1-billion-parameter "baby dragon hatchling" model, perform comparably to or outperform existing models of similar scale, such as GPT-2.
  • ⚙️ A key advantage of the BDH architecture is its energy and compute efficiency.
  • 🧪 The focus on moderate scale (1B parameters) makes experimentation easier and faster, particularly for instruction following and basic language model capabilities.

Reasoning Models and Future Potential

  • 🎯 The most promising avenue for BDH is in developing reasoning models that involve multiple phases of refinement and accuracy checks.
  • πŸ“ˆ These models are adept at working with contextualized inputs and can process vast amounts of data, potentially billions of tokens.
  • πŸ’» A key use case is an AI coding assistant that can understand and operate within large codebases, internalizing existing code before generating new contributions.
  • πŸ“š The architecture can ingest and make sense of large datasets, such as private enterprise documentation, in a matter of minutes.

What's Discussed

Modular AI Models, BDH Architecture, Pathway AI, Multilingual Models, Sparse Activation, Transformer Limitations, Reasoning Models, Large Language Models, Compute Efficiency, Energy Efficiency, Contextualized Inputs, AI Coding Assistant, Scalability