Build Advanced AI Agents: Voice, Research & Multi-Agent Workflows

freeCodeCamp.org · September 22, 2025 · 49 min · 60,620 views

Building Advanced AI Agents

  • This workshop series focuses on building three types of advanced AI agents: voice agents, deep research tools, and multi-agent workflows.
  • Participants will gain hands-on experience with popular AI tools and frameworks, including LiveKit, Exa, LangChain, and Cerebras.
  • The series provides sample code, practical exercises, and open-source repositories to facilitate building functional AI agents.

Workshop 1: Voice Sales Agent with LiveKit

  • πŸ—£οΈ Learn to build a real-time voice sales agent capable of natural conversations, pulling product context, and responding intelligently.
  • 🧠 The agent's pipeline involves Automatic Speech Recognition (ASR), Voice Activity Detection (VAD), Large Language Models (LLMs) for thinking, and Text-to-Speech (TTS) for speaking.
  • ⚑ Cerebras provides ultra-fast inference, reducing response times from seconds to milliseconds, crucial for natural conversation.
  • 🀝 LiveKit acts as middleware, managing audio streams, context, and orchestrating AI services.
  • πŸ’¬ Cartisia's Ink engine handles speech-to-text and text-to-speech with real-time accuracy and low latency.
  • 🧩 Retrieval Augmented Generation (RAG) is used to load business-specific information, like product descriptions and pricing, into the LLM's context to minimize hallucinations.
  • πŸ‘₯ The system can be expanded into a multi-agent system with specialized agents for technical support or pricing inquiries, enabling seamless handoffs.
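
The pipeline described above can be sketched in plain Python. This is a minimal illustration only: every function here is a hypothetical stub (it does not use the LiveKit, Cartesia, or Cerebras SDKs), the product catalog is made-up sample data, and the "audio" is just UTF-8 bytes. It shows the order of the stages (VAD, ASR, LLM, TTS) and how business facts might be placed in the LLM's context.

```python
from dataclasses import dataclass, field

# Hypothetical sample catalog standing in for business-specific data.
PRODUCTS = {
    "Starter": {"price": "$10/month", "description": "Basic plan for individuals."},
    "Team": {"price": "$40/month", "description": "Shared workspace for small teams."},
}


def build_system_prompt(products: dict) -> str:
    """Inline product facts into the prompt so replies are grounded in them."""
    lines = ["You are a sales agent. Answer only from the product facts below."]
    for name, info in products.items():
        lines.append(f"- {name}: {info['description']} Price: {info['price']}")
    return "\n".join(lines)


def detect_speech(audio_frame: bytes) -> bool:
    """Stub VAD: treat any non-empty frame as speech."""
    return len(audio_frame) > 0


def transcribe(audio_frame: bytes) -> str:
    """Stub ASR: pretend the audio bytes are UTF-8 text."""
    return audio_frame.decode("utf-8")


def generate_reply(system_prompt: str, history: list, user_text: str) -> str:
    """Stub LLM: return the product fact whose name the user mentioned."""
    for line in system_prompt.splitlines():
        if line.startswith("- "):
            name = line[2:].split(":", 1)[0]
            if name.lower() in user_text.lower():
                return line[2:]
    return "Could you tell me which product you are interested in?"


def synthesize(text: str) -> bytes:
    """Stub TTS: return the reply as bytes in place of audio."""
    return text.encode("utf-8")


@dataclass
class VoiceAgent:
    system_prompt: str
    history: list = field(default_factory=list)

    def handle_frame(self, audio_frame: bytes):
        """Run one turn: VAD -> ASR -> LLM -> TTS. Returns None on silence."""
        if not detect_speech(audio_frame):
            return None
        user_text = transcribe(audio_frame)
        reply = generate_reply(self.system_prompt, self.history, user_text)
        self.history.append((user_text, reply))
        return synthesize(reply)
```

Calling `VoiceAgent(build_system_prompt(PRODUCTS)).handle_frame(b"How much is the Team plan?")` returns the stubbed "audio" for the Team plan's facts. In a real agent each stub would be replaced by the corresponding service, with the middleware handling streaming and turn-taking.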

Workshop 2: Deep Research Assistant with Exa

  • πŸ” Build an AI research assistant that can autonomously search the web, analyze multiple sources, and provide structured insights in under 30 seconds.
  • 🌐 Exa's search API is used to find relevant web content, returning full page content for comprehensive analysis.
  • 🧠 The process involves breaking down research questions, using RAG to provide LLMs with context, and identifying knowledge gaps for follow-up searches.
  • πŸš€ Cerebras' fast inference is critical for chaining multiple LLM calls, enabling complex research tasks to be completed rapidly.
  • 🧩 A recursive approach is implemented where the LLM identifies gaps and triggers new searches, leading to richer, more comprehensive results.
  • πŸ§‘β€πŸ”¬ Advanced implementations can involve a lead agent breaking down queries into subtasks for multiple specialized agents working in parallel.

Workshop 3: User Research Automation with LangChain

  • Automate user research by generating user personas, conducting simulated interviews, and synthesizing feedback in under 60 seconds.
  • The system uses AI to generate interview questions, create diverse AI personas, run simulated interviews, and analyze responses.
  • This process compresses weeks of traditional user research into minutes, significantly accelerating innovation speed.
  • LangGraph is used for workflow orchestration, defining nodes (Python functions) for each research step, and managing state across the process.
  • Cerebras powers the LLM for fast generation of personas, interview questions, responses, and insights.
  • Specialized agents (nodes) handle tasks like configuration, persona creation, interviews, and synthesis, updating a shared state object for coherent workflows.
  • The system can be expanded with multi-question interviews, follow-up questions, conversation memory, and enhanced synthesis for deeper insights.
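
The node-and-shared-state pattern above can be illustrated without any framework. The sketch below deliberately does not use the LangGraph API; it only mimics the idea that each node is a Python function that reads the shared state and returns a partial update. The node names, the state fields, and the canned persona and answer strings are all hypothetical placeholders for what LLM calls would produce.

```python
from typing import TypedDict


class ResearchState(TypedDict, total=False):
    topic: str
    questions: list
    personas: list
    interviews: list
    summary: str


def configure(state: ResearchState) -> dict:
    """Node 1: produce interview questions (an LLM call in a real system)."""
    return {"questions": [f"What frustrates you most about {state['topic']}?"]}


def create_personas(state: ResearchState) -> dict:
    """Node 2: produce personas (canned placeholders here)."""
    return {"personas": ["busy parent", "college student"]}


def run_interviews(state: ResearchState) -> dict:
    """Node 3: ask every question of every persona."""
    interviews = [
        {"persona": p, "question": q, "answer": f"(simulated answer from {p})"}
        for p in state["personas"]
        for q in state["questions"]
    ]
    return {"interviews": interviews}


def synthesize(state: ResearchState) -> dict:
    """Node 4: summarize the collected interviews."""
    count = len(state["interviews"])
    return {"summary": f"{count} interviews on {state['topic']}"}


# A linear workflow: each node runs in order and merges its update into state.
PIPELINE = [configure, create_personas, run_interviews, synthesize]


def run_workflow(topic: str) -> ResearchState:
    state: ResearchState = {"topic": topic}
    for node in PIPELINE:
        state.update(node(state))
    return state
```

Running `run_workflow("meal planning apps")` yields a state holding the questions, personas, interviews, and summary. In LangGraph the same shape would typically be expressed by registering each function as a node on a `StateGraph` and connecting the nodes with edges, which also makes branching and loops (for example, follow-up questions) easier to add than in this fixed list.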

Topics Discussed

AI Agents, Voice Agents, LiveKit, Cerebras, Exa, LangChain, LangGraph, Large Language Models (LLMs), Retrieval Augmented Generation (RAG), Speech Recognition, Text-to-Speech, Multi-Agent Systems, User Research Automation, Web Search API, Inference Speed