How Retrieval-Augmented Generation (RAG) Can Make LLMs Less Safe

Super Data Science: ML & AI Podcast with Jon Krohn · July 16, 2025 · 3 min · 167 views

RAG's Unexpected Safety Implications

💡 Contrary to common belief, Retrieval-Augmented Generation (RAG) can actually make Large Language Models (LLMs) less safe and their outputs less reliable.
⚠️ The research discussed in the episode explored how adding retrieved context to an unsafe query can circumvent an LLM's built-in safety mechanisms, producing unsafe responses even when the retrieved documents are innocuous.

Responsible AI and RAG Research

🔬 The research is part of a broader responsible AI initiative focused on identifying, blocking, and monitoring potential misuse of AI technology.
🎯 This is particularly crucial in heavily regulated industries where clients need assurance against accidental or purposeful abuse of AI tools.
📚 RAG is recognized as a necessary technology for grounding LLM responses in trusted data sources, especially when dealing with vast amounts of daily incoming data.
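
To make "grounding" concrete, here is a minimal sketch of the two steps a RAG pipeline performs before the model is called: pick the most relevant passages, then place them in the prompt ahead of the question. The function names (`retrieve`, `build_rag_prompt`), the token-overlap scorer, and the prompt wording are all assumptions for illustration; the episode does not describe a specific implementation, and production systems typically use embedding-based search rather than word overlap.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents sharing the most tokens with the query.

    Toy scorer (an assumption): real systems usually rank by embedding
    similarity, not raw token overlap.
    """
    query_counts = Counter(tokenize(query))

    def score(doc: str) -> int:
        doc_counts = Counter(tokenize(doc))
        return sum(min(query_counts[t], doc_counts[t]) for t in query_counts)

    # sorted() is stable, so ties keep their original order.
    return sorted(documents, key=score, reverse=True)[:k]


def build_rag_prompt(query: str, passages: list[str]) -> str:
    """Place numbered passages ahead of the question (hypothetical template)."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

The resulting prompt string is what gets sent to the LLM in place of the bare question, which is why the retrieved text can change how the model behaves.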

Findings on RAG and Unsafe Queries

📊 The study paired unsafe queries (e.g., "How do I do insider trading?") with completely harmless documents from Wikipedia.
📉 The results showed that LLMs that would not answer such queries on their own often produced unsafe responses once the same queries were augmented, RAG-style, with these harmless documents.
🎯 This highlights a critical need to understand and mitigate the potential risks introduced by RAG systems.
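
The comparison described above can be sketched as a small evaluation harness: run each unsafe query once on its own and once with benign documents prepended, then compare how often the responses are flagged. This is a rough illustration under stated assumptions, not the study's actual protocol. `generate` (the model call) and `is_unsafe` (the safety judge) are hypothetical callables the reader must supply, and the prompt template is invented for the example.

```python
from typing import Callable, Optional


def build_prompt(query: str, documents: Optional[list[str]] = None) -> str:
    """Return the bare query, or the query preceded by retrieved context."""
    if not documents:
        return query
    context = "\n\n".join(documents)
    return f"Context:\n{context}\n\nQuestion: {query}"


def unsafe_rate(
    queries: list[str],
    generate: Callable[[str], str],    # hypothetical: prompt -> model response
    is_unsafe: Callable[[str], bool],  # hypothetical: response -> safety verdict
    documents: Optional[list[str]] = None,
) -> float:
    """Fraction of queries whose response the judge flags as unsafe."""
    if not queries:
        return 0.0
    flagged = sum(
        1 for q in queries if is_unsafe(generate(build_prompt(q, documents)))
    )
    return flagged / len(queries)


def compare_settings(
    queries: list[str],
    benign_documents: list[str],
    generate: Callable[[str], str],
    is_unsafe: Callable[[str], bool],
) -> dict[str, float]:
    """Measure the unsafe-response rate without and with benign context."""
    return {
        "no_rag": unsafe_rate(queries, generate, is_unsafe),
        "rag": unsafe_rate(queries, generate, is_unsafe, benign_documents),
    }
```

Keeping the model and the judge as injected callables means the same harness can be pointed at any LLM endpoint or safety classifier. A gap between the `rag` and `no_rag` rates on the same query set is the kind of effect the summary describes.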

What’s Discussed

Retrieval-Augmented Generation (RAG) · Large Language Models (LLMs) · AI Safety · Responsible AI · Unsafe Queries · LLM Security · Data Grounding · AI Misuse
