Introducing EmbeddingGemma: On-Device Text Embeddings for Generative AI

Google for Developers · September 5, 2025 · 4 min · 116,739 views

EmbeddingGemma: State-of-the-Art On-Device Embeddings

  • 💡 EmbeddingGemma is a new 300-million-parameter text embedding model designed for mobile-first AI and generative AI experiences directly on user hardware.
  • 🧠 Embeddings are numerical representations of text, transforming data into vectors that generative models can use for downstream tasks.
  • ⚡ The model is small, fast, and efficient, capable of running with as little as 300 megabytes of RAM due to quantization-aware training, while preserving state-of-the-art quality.

Key Features and Capabilities

  • 🎯 EmbeddingGemma generates embeddings of 768 dimensions but supports customization down to 128 dimensions using Matryoshka Representation Learning (MRL).
  • πŸš€ It is based on the same technology as Gemini embedding models, offering high-quality semantic search, fast information retrieval, and customized classification/clustering.
  • πŸ† The model achieves the best score on the massive text embedding benchmark for models under 500 million parameters and is trained across 100+ languages.
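
The MRL point above can be sketched in a few lines. With a Matryoshka-trained model, the leading dimensions are trained to carry most of the signal, so a shorter embedding is commonly obtained by keeping the first components and rescaling to unit length. The function name `truncate_embedding` and the synthetic input vector are assumptions for illustration; whether a given EmbeddingGemma integration truncates exactly this way is not stated in the source.

```python
import math


def truncate_embedding(vector, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    if not 0 < dim <= len(vector):
        raise ValueError("dim must be between 1 and len(vector)")
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero prefix")
    return [x / norm for x in head]


# Synthetic stand-in for a 768-dimensional embedding (not real model output).
full = [math.sin(i + 1) for i in range(768)]

# Shrink to the smallest size mentioned above.
small = truncate_embedding(full, 128)
print(len(full), "->", len(small))
```

If the values are stored as 32-bit floats, going from 768 to 128 dimensions cuts each vector from 3,072 bytes to 512 bytes, which matters when an index has to fit on a phone.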

On-Device Performance and Privacy

  • 🔒 Engineered for on-device performance, EmbeddingGemma ensures efficient computations and minimal memory footprint, even on resource-constrained hardware.
  • 🛡️ It facilitates on-device embedding of local documents, ensuring sensitive user data never leaves the device.
  • 🌐 Offline functionality means search and retrieval features work regardless of internet connectivity.

Building Generative AI Experiences

  • 🧩 Together with generative models like Gemma 3n, EmbeddingGemma enables powerful mobile-first generative AI experiences and efficient Retrieval Augmented Generation (RAG) pipelines.
  • 💬 Applications can draw on context from the user's own data to give more personalized responses, for example recognizing from that context that the user needs a carpenter.
  • 🌐 An example demo shows a user querying previously opened articles or web pages using a browser extension that utilizes EmbeddingGemma for on-device embedding and retrieval.
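
The retrieval half of a RAG pipeline like the one in the demo can be sketched as follows. This is a rough illustration under stated assumptions, not the demo's implementation: `make_toy_embedder` is a hypothetical word-count embedder standing in for EmbeddingGemma so the example runs anywhere, and the page texts are invented. In a real app the embedder would be the on-device model and the resulting prompt would be passed to a generative model.

```python
import math
import re


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def make_toy_embedder(corpus):
    """Build a stand-in embedder (normalized word counts). NOT EmbeddingGemma."""
    vocab = {tok: i for i, tok in enumerate(sorted({t for d in corpus for t in tokenize(d)}))}

    def embed(text):
        vec = [0.0] * len(vocab)
        for tok in tokenize(text):
            if tok in vocab:
                vec[vocab[tok]] += 1.0
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec] if norm else vec

    return embed


def retrieve(query, documents, embed, top_k=2):
    """Rank documents by dot product of unit vectors (cosine similarity)."""
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, embed(doc))), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, documents, embed, top_k=2):
    """Assemble retrieved context plus the question into a prompt string."""
    hits = retrieve(query, documents, embed, top_k=top_k)
    context = "\n".join(f"- {doc}" for doc in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Invented examples of previously opened pages.
pages = [
    "Local carpenter directory with reviews for cabinet repair",
    "Recipe for sourdough bread with a long fermentation",
    "Train timetable changes for the autumn season",
]
embed = make_toy_embedder(pages)
prompt = build_prompt("who can repair my cabinet", pages, embed, top_k=1)
print(prompt)
```

A production pipeline would embed each page once when it is opened and store the vectors, rather than re-embedding every document per query as this sketch does for brevity.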

Customization and Accessibility

  • 🛠️ EmbeddingGemma is designed for customization, allowing users to fine-tune it for their specific domain or language.
  • 🤝 It works across popular platforms like Hugging Face and Kaggle, with example notebooks available in the Gemma Cookbook.
  • ✨ This next generation of on-device embedding models is open for everyone, offering a small, fast, and efficient solution for developers.