
Production AI Log Analysis: Hidden Signals with Scott Clark of Distributional

Jason Liu · February 13, 2026 · 39 min · 76 views

The Evolving Landscape of AI Systems

  • 🚀 AI systems have become significantly more non-deterministic, chaotic, and non-stationary compared to a decade ago, making traditional monitoring insufficient.
  • 💡 Understanding these systems requires moving beyond basic monitoring to AI analytics, which helps discover, understand, track, and fix hidden behavioral signals.
  • 🎯 The goal is to accelerate the AI data flywheel, enabling continuous improvement by observing real-world behavior and using it to enhance applications.

Bridging the Observability Gap

  • 🔍 Traditional observability tools offer either high-level monitoring (system status, basic drift) or deep dives into specific sessions (logging, tracing), leaving a gap in understanding broader agent behavior.
  • 🧩 Agent analytics treats the agent as the atomic unit, analyzing patterns and clusters of behavior across many sessions, similar to user analytics for traditional web apps.
  • 🛠️ This analytics layer helps complete the AI software development loop by informing better evaluations, reward functions for fine-tuning, and system prompt improvements.
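
As a rough illustration of treating the agent, rather than a single session, as the unit of analysis, the sketch below groups traces from many sessions by their tool-call pattern and reports how often each pattern occurs and how often it fails. The record fields (`session_id`, `tools`, `error`) and the sample data are hypothetical; they are not Distributional's schema.

```python
from collections import Counter

# Hypothetical trace records, one per session (illustrative data only).
traces = [
    {"session_id": "s1", "tools": ["search", "summarize"], "error": False},
    {"session_id": "s2", "tools": ["search", "summarize"], "error": False},
    {"session_id": "s3", "tools": ["search", "search", "search"], "error": True},
    {"session_id": "s4", "tools": ["search", "search", "search"], "error": True},
    {"session_id": "s5", "tools": ["summarize"], "error": False},
]


def behavior_profile(traces):
    """Group traces by tool-call pattern; report trace count and error rate."""
    counts = Counter()
    errors = Counter()
    for t in traces:
        pattern = " > ".join(t["tools"])
        counts[pattern] += 1
        errors[pattern] += int(t["error"])
    return {
        p: {"traces": n, "error_rate": errors[p] / n}
        for p, n in counts.items()
    }


profile = behavior_profile(traces)
for pattern, stats in profile.items():
    print(pattern, stats)
```

Reading any one of these sessions in a trace viewer would not show that the repeated-search pattern fails every time; the aggregate view does.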

Distributional's Approach to AI Analytics

  • 📊 Distributional's tool focuses on analyzing production AI logs, particularly OpenTelemetry traces, to identify patterns and signals within vast amounts of data.
  • 🧠 The system enriches trace data with behavioral signals, performs unsupervised learning and clustering to find behavioral pockets, and uses an LLM to explain these findings and suggest fixes.
  • ✅ The product is free and openly distributed, designed for on-premise deployment, allowing users to analyze their own AI agents without sending data externally.
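
The enrich, cluster, and explain steps described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not Distributional's implementation: the three signals per trace, the sample values, the choice of k-means, and the prompt wording are all hypothetical, and no LLM is actually called.

```python
import math

# Hypothetical per-trace signals: (tool_calls, output_tokens, latency_s).
traces = {
    "t1": (1, 120, 0.8),
    "t2": (1, 140, 0.9),
    "t3": (2, 130, 1.0),
    "t4": (9, 900, 7.5),
    "t5": (8, 950, 8.1),
    "t6": (10, 870, 7.9),
}


def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [tuple((x - m) / s for x, m, s in zip(r, means, stds)) for r in rows]


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def describe_clusters(traces, labels):
    """Summarise each cluster using the raw (unscaled) signals."""
    ids, rows = list(traces), list(traces.values())
    summary = {}
    for j in sorted(set(labels)):
        members = [r for r, lab in zip(rows, labels) if lab == j]
        summary[j] = {
            "trace_ids": [i for i, lab in zip(ids, labels) if lab == j],
            "mean_tool_calls": sum(m[0] for m in members) / len(members),
            "mean_latency_s": sum(m[2] for m in members) / len(members),
        }
    return summary


def explanation_prompt(summary):
    """Build text that could be handed to an LLM; no model is called here."""
    lines = ["Explain what distinguishes these clusters of agent traces and suggest fixes:"]
    for j, s in summary.items():
        lines.append(
            f"cluster {j}: {len(s['trace_ids'])} traces, "
            f"mean tool calls {s['mean_tool_calls']:.1f}, "
            f"mean latency {s['mean_latency_s']:.1f}s"
        )
    return "\n".join(lines)


labels = kmeans(standardize(list(traces.values())), k=2)
summary = describe_clusters(traces, labels)
prompt = explanation_prompt(summary)
print(prompt)
```

Standardizing before clustering keeps the token counts, which are numerically much larger, from dominating the distance calculation.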

The AI Data Flywheel in Practice

  • 📈 The core idea is to deploy, observe, and improve, with analytics providing the crucial insights needed for continuous enhancement.
  • 💡 Analytics is positioned as a necessary step beyond basic logging and monitoring, offering richness and guidance for system improvement.
  • ⚠️ It helps uncover unknown unknowns: issues and behaviors that were not anticipated during development or testing.

Integrating Analytics into AI Development

  • 🌐 Data can be ingested via OpenTelemetry, Parquet files, or SQL, with the goal of using the richest possible data, including user feedback and session-level events.
  • 🧩 By creating behavioral vectors for each trace, Distributional can identify subclusters of behavior correlated with issues like cost, latency, or quality.
  • 🚀 The ultimate aim is to make analytics more accessible, enabling teams to focus on building agents rather than sifting through raw data, thereby driving continuous improvement and innovation.
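
To make the idea of a per-trace behavioral vector concrete, here is a small sketch that flattens spans into one row per trace and joins in user feedback. The spans follow a simplified OTLP/JSON shape, where 64-bit integers are encoded as strings. The attribute key `gen_ai.usage.output_tokens` is taken from the OpenTelemetry GenAI semantic conventions, which are still evolving, so treat it as an assumption. The span names, the `feedback` mapping, and the chosen features are hypothetical and may differ from what Distributional computes.

```python
from collections import defaultdict

# Illustrative spans in a simplified OTLP/JSON shape (not real logs).
spans = [
    {"traceId": "a1", "name": "chat",
     "startTimeUnixNano": "1000000000", "endTimeUnixNano": "1400000000",
     "attributes": [
         {"key": "gen_ai.usage.output_tokens", "value": {"intValue": "120"}},
     ]},
    {"traceId": "a1", "name": "execute_tool",
     "startTimeUnixNano": "1400000000", "endTimeUnixNano": "1500000000",
     "attributes": []},
    {"traceId": "b2", "name": "chat",
     "startTimeUnixNano": "2000000000", "endTimeUnixNano": "4500000000",
     "attributes": [
         {"key": "gen_ai.usage.output_tokens", "value": {"intValue": "900"}},
     ]},
]

# Hypothetical user feedback keyed by trace id.
feedback = {"b2": "thumbs_down"}


def int_attr(span, key, default=0):
    """Read an integer attribute from an OTLP/JSON attribute list."""
    for a in span.get("attributes", []):
        if a["key"] == key and "intValue" in a["value"]:
            return int(a["value"]["intValue"])
    return default


def behavioral_vectors(spans, feedback):
    """Collapse spans into one feature row per trace."""
    rows = defaultdict(
        lambda: {"spans": 0, "tool_spans": 0, "output_tokens": 0, "busy_ms": 0.0}
    )
    for s in spans:
        row = rows[s["traceId"]]
        row["spans"] += 1
        row["tool_spans"] += int(s["name"].startswith("execute_tool"))
        row["output_tokens"] += int_attr(s, "gen_ai.usage.output_tokens")
        # Sum of span durations; nested spans would be double counted.
        duration_ns = int(s["endTimeUnixNano"]) - int(s["startTimeUnixNano"])
        row["busy_ms"] += duration_ns / 1e6
    for trace_id, row in rows.items():
        row["negative_feedback"] = int(feedback.get(trace_id) == "thumbs_down")
    return dict(rows)


vectors = behavioral_vectors(spans, feedback)
for trace_id, row in vectors.items():
    print(trace_id, row)
```

Rows like these are what a clustering step would consume; comparing token usage, duration, or feedback across the resulting groups is one way to look for behavior that correlates with cost, latency, or quality.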

Topics

AI Analytics, Production Logging, Observability, Generative AI, Agentic Systems, OpenTelemetry, Data Flywheel, LLM, Unsupervised Learning, Clustering, System Prompt Engineering, Fine-tuning, Reinforcement Learning, Distributional, SigOpt