Building Agentic AI: Design Patterns, Evaluation, and Optimization with Sinan Ozdemir

Super Data Science: ML & AI Podcast with Jon Krohn · January 21, 2026 · 1h 4min · 6,905 views

Agentic AI vs. Workflows

  • 💡 An agent is defined as an LLM with access to tools, capable of deciding which tools to use and in what order.
  • ⚙️ A workflow, conversely, is a deterministic data and code path: the LLM's actions are predetermined, and it does not choose its next step.
  • ❓ To decide which one fits, analyze the existing process: many conditional branching points suggest an agentic approach.

LLM Parameter Counts and Context Windows

  • Small models (under 10 billion parameters) can run on a CPU and are suitable for simple retrieval tasks.
  • 🚀 Medium-sized models (10-100 billion parameters) enable multi-turn agentic tasks and can be enhanced with fine-tuning.
  • 🏢 Large models (100 billion+ parameters) are necessary for enterprise-wide, multilingual deployments and complex tasks.
  • 🪞 Larger context windows are crucial for agents performing long-horizon tasks, but the LLM must also be capable of reasoning over the entire context.
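
A rough way to see why parameter count drives hardware choice is to estimate the memory needed just to hold the weights. The sketch below assumes the usual storage widths (4 bytes for fp32, 2 for fp16, 1 for int8, half a byte for int4) and ignores activations, the KV cache, and runtime overhead, so real requirements are higher. The function names are hypothetical, and treating exactly 100 billion parameters as "large" is an assumption made here to resolve the boundary.

```python
# Bytes used to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Approximate memory for the weights alone, in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9

def size_tier(num_params: float) -> str:
    """Map a parameter count onto the three tiers listed above."""
    if num_params < 10e9:
        return "small"
    if num_params < 100e9:
        return "medium"
    return "large"

for params in (7e9, 70e9, 400e9):
    print(size_tier(params), f"{weight_memory_gb(params, 'fp16'):.0f} GB at fp16")
```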

Evaluating AI Performance

  • 🎯 Accuracy alone is insufficient; evaluation must consider task-specific metrics.
  • βš–οΈ Precision is vital when false positives are expensive, measuring how often a 'yes' prediction is correct.
  • πŸ“‰ Recall is critical when false negatives are expensive, measuring how many of the correct 'yes' instances were identified.
  • πŸ§ͺ Reproducible experiments are essential, with evaluation language integrated into case studies.
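
The three metrics above follow directly from counts of true positives, false positives, and false negatives. The sketch below uses made-up labels (not data from the episode) chosen so that accuracy looks acceptable while recall is poor; the function name `binary_metrics` is hypothetical.

```python
from typing import Dict, Sequence

def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for binary labels (1 = 'yes', 0 = 'no')."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Of the 'yes' predictions, how many were right?
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Of the real 'yes' cases, how many were found?
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Illustrative labels: three real positives, but the model flags only one.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(binary_metrics(y_true, y_pred))
```

On these labels the model is right 8 times out of 10 and never raises a false alarm, yet it misses two of the three real positives, which is the case where accuracy alone hides the problem.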

Hybrid Systems and Optimization

  • 🧩 Hybrid systems, combining predefined workflows with agentic behavior, are often the most powerful AI applications.
  • ⚠️ Without a predefined pathway, a sophisticated auditing system is needed to ensure tasks stay on track.
  • πŸ› οΈ Optimization techniques like quantization, distillation, and LoRA aim to reduce cost and increase speed, but practitioners should expect a performance hit and potential differences in output compared to larger, unoptimized models.
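
To make the trade-off behind quantization concrete, here is a minimal sketch of one simple scheme, symmetric "absmax" int8 quantization, in pure Python. It is an illustration under stated assumptions rather than the method discussed in the episode: production libraries typically quantize per channel or per block and use more elaborate formats, and the weight values below are invented.

```python
from typing import List, Tuple

def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # avoid a zero scale
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale

def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.003, 1.0]  # illustrative values only
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
errors = [abs(w - r) for w, r in zip(weights, restored)]
print(q, restored, max(errors))
```

Each weight now needs one byte instead of four, but the round trip is lossy: the small weight 0.003 rounds to zero. Accumulated across a model, rounding errors of this kind are one reason quantized outputs can differ from those of the original model.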

Surprising Findings in AI Research

  • 🀯 A surprising finding is the lack of consistent correlation between reasoning capabilities and LLM performance on certain benchmarks.
  • πŸ“ˆ Even when reasoning improves performance, the gains are often marginal (1-2%) and may not outweigh the increased cost.
  • 🧭 Speculative decoding can offer speed and memory benefits, but its effectiveness is task-dependent, allowing for prediction of which questions will benefit most.

Topics Discussed

Agentic AI, LLM, Workflows, Parameter Count, Context Window, AI Evaluation, Precision, Recall, Hybrid Systems, Quantization, Distillation, LoRA, Reasoning Models, Speculative Decoding, Fine-tuning