RAG LLMs in Regulated Industries: Risks and Safeguards with Sebastian Gehrmann

Super Data Science: ML & AI Podcast with Jon Krohn · July 21, 2025 · 6 min · 208 views

Limitations of Generic LLMs in Finance

  • 🎯 Foundation models are not typically trained on finance-specific corporate knowledge, leading to limitations in both helpfulness and harmlessness.
  • ⚠️ Out-of-the-box safeguards like Llama Guard or ShieldGemma are designed for general populations and productivity tasks, not the specific risks and obligations of financial services.
  • 💡 The same limitations apply to other highly domain-specific, regulation-heavy sectors such as healthcare and law.

Evaluating RAG Systems in Context

  • 🔑 A core recommendation is to evaluate any AI system within the specific context it will be deployed.
  • 🔬 For healthcare, evaluate with healthcare subject matter experts; for financial services, use financial services experts.
  • 📊 Don't rely solely on benchmark scores from LLM providers; invest heavily in domain-specific evaluation.
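
One way to turn expert review into a number is sketched below. It is a minimal illustration, not the method described in the episode: the `ExpertRating` record, the `acceptable` field, and the majority-vote rule are all assumptions made for the example. Each response is rated by one or more subject matter experts, and a response counts as acceptable only if a strict majority of its raters say so.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpertRating:
    """One expert's verdict on one system response (hypothetical schema)."""
    response_id: str
    expert: str
    acceptable: bool  # acceptable in the actual deployment context?


def majority_verdicts(ratings):
    """Collapse per-expert ratings into one verdict per response."""
    votes = {}
    for rating in ratings:
        votes.setdefault(rating.response_id, Counter())[rating.acceptable] += 1
    # A tie counts as not acceptable: a deliberately conservative assumption.
    return {rid: counts[True] > counts[False] for rid, counts in votes.items()}


def acceptance_rate(ratings):
    """Fraction of responses that a strict majority of experts accepted."""
    verdicts = majority_verdicts(ratings)
    if not verdicts:
        raise ValueError("no ratings supplied")
    return sum(verdicts.values()) / len(verdicts)


# Illustrative data only; the IDs and expert names are made up.
sample_ratings = [
    ExpertRating("r1", "analyst_a", True),
    ExpertRating("r1", "analyst_b", True),
    ExpertRating("r1", "analyst_c", False),
    ExpertRating("r2", "analyst_a", True),
    ExpertRating("r2", "analyst_b", False),
    ExpertRating("r3", "analyst_a", False),
]
print(f"expert acceptance rate: {acceptance_rate(sample_ratings):.2f}")
```

A figure like this can be tracked alongside a provider's published benchmark score, since the two measure different things: the benchmark reflects general capability, while the expert rate reflects behavior on the prompts your own users will send.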

Mitigating Risks in Regulated Domains

  • 🛠️ Utilize existing frameworks like the NIST AI Risk Management Framework as starting points and adapt them to your specific domain.
  • 🤝 Industry collaborations like MLCommons offer general-purpose taxonomies that can be customized.
  • 🔍 Organize red teaming events where users actively try to break the system to identify and quantify risks.
  • ✅ Measuring how often the system fails or provides incorrect information is crucial for understanding the risk surface.
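
Measuring how often the system fails can be sketched as a per-category failure rate with an uncertainty range. The example below is a minimal sketch under stated assumptions: the red-team log format, the category names, and the choice of a Wilson score interval are illustrative and do not come from the episode. The interval matters because red teaming events usually produce few attempts per category, so a raw percentage alone can be misleading.

```python
import math
from collections import defaultdict


def wilson_interval(failures, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if trials == 0:
        return (0.0, 1.0)  # no data: the rate could be anything
    p = failures / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return (max(0.0, center - half), min(1.0, center + half))


def failure_rates(attempts):
    """Summarize (risk_category, failed) pairs from a red-team log."""
    counts = defaultdict(lambda: [0, 0])  # category -> [failures, trials]
    for category, failed in attempts:
        counts[category][0] += int(failed)
        counts[category][1] += 1
    return {
        category: {
            "failures": f,
            "trials": n,
            "rate": f / n,
            "ci95": wilson_interval(f, n),
        }
        for category, (f, n) in counts.items()
    }


# Hypothetical log: category names are placeholders, not a real taxonomy.
red_team_log = [
    ("unlicensed_financial_advice", True),
    ("unlicensed_financial_advice", False),
    ("unlicensed_financial_advice", False),
    ("unlicensed_financial_advice", False),
    ("confidential_disclosure", False),
    ("confidential_disclosure", False),
]
for category, stats in failure_rates(red_team_log).items():
    low, high = stats["ci95"]
    print(f"{category}: {stats['rate']:.2f} (95% CI {low:.2f} to {high:.2f})")
```

The categories would normally come from whichever taxonomy you adapted to your domain, so that the measured rates map back onto the risks you committed to tracking.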

Building Trustworthy AI Systems

  • 🚀 By evaluating in context and actively measuring risks, organizations can build systems that are more trustworthy, reliable, and robust.
  • 📈 This leads to better user adoption, as users are less likely to abandon the system after encountering bad answers.

Topics

Retrieval-Augmented Generation (RAG), Large Language Models (LLMs), Generative AI, Financial Services, Risk Management, AI Safety, Domain-Specific AI, Healthcare AI, Regulatory Compliance, AI Evaluation, Red Teaming, NIST AI Risk Management Framework, Foundation Models