Flax NNX Model Optimization with Optax: Core Workflow (Part 1)

Google for DevelopersDecember 3, 20258 min172 views

18 connections·23 entities in this video→

Defining Flax NNX Models

💡 Flax NNX provides an object-oriented way to define neural network models, similar to PyTorch's nn.Module.
🔑 Parameters in NNX are defined as attributes of a module using flax.nx.param and are initialized with random number generator seeds.
🎯 The __call__ method defines the forward pass of the model.

Optax for Optimization

🚀 Optax is the primary optimization library in the JAX ecosystem, designed for composability.
🧩 Instead of monolithic optimizers, Optax uses smaller, focused gradient transformations that can be chained together.
🛠️ The flax.nx.optimizer class connects NNX models with Optax, requiring an optx.grad_transformation and a new wrt (with respect to) argument to specify which parameters to optimize.

Gradient Calculation and Updates

🧠 Flax NNX uses nx.value_and_grad to compute both the loss value and gradients, returning new values each time, eliminating the need for optimizer.zero_grad.
📈 The optimizer_state.update method applies the computed gradients to update model parameters and the optimizer's internal state (e.g., momentum).
🔄 This update process is analogous to PyTorch's optimizer.step.

JIT Compilation for Performance

⚡ A typical training step function is decorated with nx.jit for significant performance gains from JAX's JIT compiler.
🔁 The train_step function encapsulates the loss calculation, gradient computation, and parameter updates, taking optimizer state and data as input and returning updated state and loss.
📊 This functional paradigm of passing state in and getting updated state out is common in JAX.

Foundational Training Loop

⚙️ A simplified training loop iterates, calling the train_step function and updating the optimizer state for the next iteration.
✅ This first episode establishes the foundational knowledge for defining models, optimizing them with Optax, and running a JIT-compiled training loop in the JAX ecosystem.

Knowledge graph23 entities · 18 connections

How they connect

An interactive map of every person, idea, and reference from this conversation. Hover to trace connections, click to explore.

Hover · drag to explore

23 entities

Ask, don't scrub

Have a conversation with this video.

VERIDIVE answers questions with exact timestamps and citations — across every podcast, video, and article you've saved.

Chapters4 moments

Key Moments

Transcript32 segments

Full Transcript

Follow the thread

Find every place these ideas show up.

VERIDIVE maps the same people, claims, and topics across thousands of sources — so you can trace an idea from one conversation to the next.

Topics12 themes

What’s Discussed

Flax NNXOptaxJAXNeural Network OptimizationGradient DescentPyTorchMachine LearningDeep LearningJIT CompilationTraining LoopModel DefinitionGradient Transformations

Smart Objects23 · 18 links

Products· 13

Concepts· 7

Media· 1

Location· 1

Company· 1

Hours of content, seconds to the answer.

Save what you listen to. Ask it anything. Watch the threads between sources surface on their own.

Get started free