Why Extra Reasoning in AI Can Be Ineffective and Costly

Super Data Science: ML & AI Podcast with Jon Krohn · January 27, 2026 · 5 min · 103 views

Counterintuitive Findings in AI Reasoning

💡 The book "Building Agentic AI" explores surprising results from benchmarking AI reasoning models, specifically using the Math QA dataset.
📉 Contrary to expectations, the benchmarks showed no obvious correlation between the level of reasoning and an LLM's performance on this dataset.
⚠️ Where other benchmarks do show a correlation, it is often a marginal 1-2% increase in accuracy that comes with a 2x increase in cost.
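A benchmark of this kind can be sketched as a loop that scores the same questions at several reasoning levels. The sketch below is only illustrative: `ask` is a hypothetical callable standing in for whatever model client is used, the effort labels are assumed names, and the toy dataset and stub model are invented for the demo and are not results from the episode.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical signature: ask(question, effort) -> the model's final answer as text.
AskFn = Callable[[str, str], str]


def accuracy_by_effort(
    ask: AskFn,
    dataset: List[Tuple[str, str]],
    efforts: Iterable[str],
) -> Dict[str, float]:
    """Return the fraction of exact-match answers for each reasoning effort level."""
    if not dataset:
        raise ValueError("dataset must contain at least one (question, answer) pair")
    results: Dict[str, float] = {}
    for effort in efforts:
        correct = sum(
            1
            for question, expected in dataset
            if ask(question, effort).strip() == expected
        )
        results[effort] = correct / len(dataset)
    return results


def stub_ask(question: str, effort: str) -> str:
    """Toy stand-in for a real model call; it ignores the effort level entirely."""
    answers = {"2 + 2": "4", "3 * 5": "15"}
    return answers.get(question, "")


# Invented toy data, used only to show the shape of the comparison.
toy_dataset = [("2 + 2", "4"), ("3 * 5", "15"), ("10 / 4", "2.5")]
scores = accuracy_by_effort(stub_ask, toy_dataset, ["low", "medium", "high"])
print(scores)
```

Because the stub ignores the effort level, every level gets the same score, which is the "no correlation" pattern in miniature. With a real model, the same table of scores is what would be compared against cost.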

The Cost-Benefit of AI Reasoning

💰 The added cost of enabling reasoning in AI models often outweighs the marginal performance gains.
🛠️ It can be more effective to invest time in prompt engineering and building smaller, more succinct reasoning traces.
⚡ More reasoning sometimes helps, but it does not guarantee better task performance.
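The trade-off can be made concrete with a little arithmetic. In the sketch below, only the "1-2% accuracy for 2x cost" range comes from the episode; the 80% baseline accuracy and the cost of 1.0 unit per call are assumed values chosen to keep the numbers simple.

```python
def cost_per_correct(accuracy: float, cost_per_call: float) -> float:
    """Average spend needed to obtain one correct answer."""
    if accuracy <= 0:
        raise ValueError("accuracy must be greater than zero")
    return cost_per_call / accuracy


def marginal_cost_per_extra_correct(
    base_acc: float, base_cost: float, new_acc: float, new_cost: float
) -> float:
    """Extra spend per additional correct answer when switching configurations."""
    gain = new_acc - base_acc
    if gain <= 0:
        return float("inf")  # no accuracy gain: the extra spend buys nothing
    return (new_cost - base_cost) / gain


# Assumed baseline: 80% accuracy at 1.0 cost unit per call (hypothetical).
base_acc, base_cost = 0.80, 1.0
# Reasoning enabled: +2 points of accuracy at 2x the cost (upper end of the quoted range).
reason_acc, reason_cost = 0.82, 2.0

print(cost_per_correct(base_acc, base_cost))
print(cost_per_correct(reason_acc, reason_cost))
print(marginal_cost_per_extra_correct(base_acc, base_cost, reason_acc, reason_cost))
```

Under these assumptions the cost per correct answer rises from about 1.25 to about 2.44 units, and each additional correct answer costs roughly 50 units, about 40 times the baseline rate.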

Understanding Reasoning Models

🧠 Reasoning models, like OpenAI's o1, differ from typical LLMs by running an internal review process before outputting a response.
⏳ This behind-the-scenes processing can cause significant delays: a benchmark can take hours for a small number of calls, versus seconds or minutes for models that respond immediately.
💬 The core mechanism of these reasoning models is still next-token prediction, but they are trained to produce and review reasoning steps. Similar behavior can sometimes be induced in other models with a simple prompt like "think through the problem before answering."
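Prompt-induced reasoning of this kind can be sketched as a small prompt builder. The cue text is the one quoted in the episode; the function name, the `Question:`/`Answer:` layout, and the sample question are assumptions made for illustration, and no particular model API is implied.

```python
# Cue quoted in the episode; everything else here is an illustrative assumption.
REASONING_CUE = "Think through the problem before answering."


def build_prompt(question: str, induce_reasoning: bool = False) -> str:
    """Build a plain-text prompt, optionally prefixed with a reasoning cue."""
    parts = []
    if induce_reasoning:
        parts.append(REASONING_CUE)
    parts.append(f"Question: {question}")
    parts.append("Answer:")
    return "\n".join(parts)


# Hypothetical sample question, used only to show the two variants side by side.
sample = "A train travels 120 km in 2 hours. What is its average speed?"
direct_prompt = build_prompt(sample)
reasoning_prompt = build_prompt(sample, induce_reasoning=True)
print(direct_prompt)
print("---")
print(reasoning_prompt)
```

Sending both variants to the same model and comparing accuracy, latency, and token usage gives a cheap check on whether the extra reasoning is paying for itself on a given task.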

The Humorous Side of AI Reasoning

😂 An amusing experiment involved simply saying "hello" to reasoning models to see how much reasoning they would produce in response.
📄 Some models generated extensive, multi-paragraph reasoning, trying to infer the user's intent or a hidden task behind a simple greeting, which highlights the potential for over-processing.

What's Discussed

Agentic AI, AI Reasoning, LLM Benchmarking, Math QA Dataset, Prompt Engineering, AI Costs, Model Performance, OpenAI o1, Next Token Prediction, Artificial Intelligence
