RAG LLMs in Regulated Industries: Risks and Safeguards with Sebastian Gehrmann

Super Data Science: ML & AI Podcast with Jon Krohn · July 21, 2025 · 6 min · 208 views

Limitations of Generic LLMs in Finance

🎯 Foundation models are not typically trained on finance-specific corporate knowledge, leading to limitations in both helpfulness and harmlessness.
⚠️ Out-of-the-box safeguards like Llama Guard or ShieldGemma are designed for general populations and productivity tasks, not the specific risks and obligations of financial services.
💡 The same limitations apply to other highly domain-specific, regulation-heavy sectors such as healthcare and law.

Evaluating RAG Systems in Context

🔑 A core recommendation is to evaluate any AI system within the specific context it will be deployed.
🔬 For healthcare, evaluate with healthcare subject matter experts; for financial services, use financial services experts.
📊 Don't rely solely on benchmark scores from LLM providers; invest heavily in domain-specific evaluation.
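One way to make in-context evaluation concrete is to collect judgments from subject matter experts and summarize them per deployment task. The following is a minimal sketch, not a method described in the episode: the `ExpertReview` schema, the task names, and the example reviews are all hypothetical, and the helpful/harmless split simply mirrors the two limitations named earlier.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpertReview:
    """One expert judgment on one system answer (hypothetical schema)."""
    task: str       # deployment task, e.g. "earnings_summary" (made-up name)
    helpful: bool   # expert judged the answer useful for the task
    harmless: bool  # expert judged the answer free of domain-specific harm


def summarize_by_task(reviews):
    """Return per-task rates of helpful, harmless, and both together."""
    grouped = defaultdict(list)
    for review in reviews:
        grouped[review.task].append(review)

    summary = {}
    for task, items in grouped.items():
        n = len(items)
        summary[task] = {
            "n": n,
            "helpful_rate": sum(r.helpful for r in items) / n,
            "harmless_rate": sum(r.harmless for r in items) / n,
            # An answer counts as acceptable only if it is both helpful and harmless.
            "acceptable_rate": sum(r.helpful and r.harmless for r in items) / n,
        }
    return summary


if __name__ == "__main__":
    # Illustrative, made-up reviews; not data from the episode.
    reviews = [
        ExpertReview("earnings_summary", True, True),
        ExpertReview("earnings_summary", True, False),
        ExpertReview("earnings_summary", False, True),
        ExpertReview("earnings_summary", True, True),
        ExpertReview("client_email_draft", True, True),
        ExpertReview("client_email_draft", True, True),
    ]
    for task, stats in summarize_by_task(reviews).items():
        print(task, stats)
```

Reporting per task rather than as one aggregate number keeps a weak task from hiding behind a strong one, which is the kind of detail a provider's benchmark score would not surface.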

Mitigating Risks in Regulated Domains

🛠️ Utilize existing frameworks like the NIST AI Risk Management Framework as starting points and adapt them to your specific domain.
🤝 Industry collaborations like MLCommons offer general-purpose taxonomies that can be customized.
🔍 Organize red teaming events where users actively try to break the system to identify and quantify risks.
✅ Measuring how often the system fails or provides incorrect information is crucial for understanding the risk surface.
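The measurement step above can be sketched as a failure rate per risk category with an uncertainty range attached, since red teaming events usually produce small samples. This is an illustrative sketch under stated assumptions: the category names and the `(category, failed)` input format are hypothetical, and the Wilson score interval is a standard statistical choice introduced here, not something the episode prescribes.

```python
import math
from collections import Counter


def wilson_interval(failures, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = failures / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point error at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


def failure_report(attempts):
    """Summarize (risk_category, failed) pairs collected at a red teaming event."""
    totals, failures = Counter(), Counter()
    for category, failed in attempts:
        totals[category] += 1
        failures[category] += int(failed)

    report = {}
    for category, n in totals.items():
        k = failures[category]
        report[category] = {
            "attempts": n,
            "failures": k,
            "rate": k / n,
            "ci95": wilson_interval(k, n),
        }
    return report


if __name__ == "__main__":
    # Made-up results with hypothetical category names, for illustration only.
    attempts = (
        [("unsuitable_financial_advice", True)] * 3
        + [("unsuitable_financial_advice", False)] * 17
        + [("confidential_disclosure", False)] * 10
    )
    for category, stats in failure_report(attempts).items():
        print(category, stats)
```

The interval matters most for categories with zero observed failures: ten clean attempts do not show that the failure rate is zero, only that it is probably below some bound, and more attempts are needed to tighten it.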

Building Trustworthy AI Systems

🚀 By evaluating in context and actively measuring risks, organizations can build systems that are more trustworthy, reliable, and robust.
📈 This leads to better user adoption, as users are less likely to abandon the system after encountering bad answers.

What’s Discussed

Retrieval-Augmented Generation (RAG), Large Language Models (LLMs), Generative AI, Financial Services, Risk Management, AI Safety, Domain-Specific AI, Healthcare AI, Regulatory Compliance, AI Evaluation, Red Teaming, NIST AI Risk Management Framework, Foundation Models
