Pipecat and Deepgram
Build a real-time voice AI agent using Pipecat with Deepgram speech-to-text and text-to-speech.
This guide walks you through building a voice AI agent that uses Pipecat for pipeline orchestration and Deepgram for speech-to-text (STT) and text-to-speech (TTS). By the end, you will have a working voice agent that listens to a user, generates a response with an LLM, and speaks back in real time.
Pipecat is an open-source Python framework for building voice and multimodal AI agents. It connects STT, LLM, and TTS services into a real-time pipeline and handles audio transport, turn-taking, and interruption detection.
Before you begin
Before you can use Deepgram, you need to create a Deepgram account. Signup is free and includes $200 in credit.
Daily is the WebRTC transport layer that handles audio between the browser and your agent. See the Pipecat Daily transport guide for more.
You need:
- A Deepgram API key
- A Daily API key
- An LLM API key — this guide uses OpenAI, but Pipecat supports other providers including Anthropic, Google, and Groq
- uv installed (for dependency management)
- The Pipecat CLI installed
- Python 3.11+
- Node.js 18+ (only if you add a JavaScript or React Pipecat client later)
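The local prerequisites above can be checked before you start. The following is a minimal sketch, not part of Pipecat: the function name `missing_prerequisites` is hypothetical, and the executable names `uv` and `pipecat` are assumptions about how the tools appear on your PATH.

```python
import shutil
import sys

# Minimum Python version listed in the prerequisites.
MIN_PYTHON = (3, 11)

# Assumed executable names: "uv" for dependency management and
# "pipecat" for the Pipecat CLI. Adjust them if your install differs.
REQUIRED_TOOLS = ("uv", "pipecat")


def missing_prerequisites(version_info=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    version = tuple((version_info or sys.version_info)[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems
```

Calling `missing_prerequisites()` with no arguments inspects the current interpreter and PATH. API keys are not checked here because they live in your `.env` file.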
Install or update the Pipecat CLI:
To update the CLI use:
Choose your developer experience
You can build a Pipecat and Deepgram integration in several ways. Choose the path below that best fits how you work. All paths share the same prerequisite: the Pipecat CLI.
- Build with a Coding Agent
- Use the quickstart CLI command
- Scaffold a new Pipecat project with the CLI
Build with a Coding Agent
You can use AI coding tools like Claude Code or Codex to generate your Pipecat agent code. Rather than relying on the tool’s training data, you give it live context from the Pipecat documentation.
- Follow the Pipecat getting started guide to set up AI tools, connect the Pipecat Context Hub, and initialize a project.
- Start a coding session with a prompt like the example below.
The init command creates a GETTING_STARTED.md file with additional guidance for your coding agent.
Use the quickstart CLI command
The quickstart uses Deepgram for STT but Cartesia for TTS. Follow the instructions in the Pipecat Quickstart documentation, then switch TTS to Deepgram using the steps below.
To switch TTS to Deepgram, open bot.py and find the Cartesia TTS setup:
Replace it with:
You can also remove CARTESIA_API_KEY from your .env file since it is no longer needed. No other changes are required. The STT service already uses Deepgram and the rest of the pipeline stays the same.
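As a rough illustration of the swap, the sketch below builds a Deepgram TTS service from environment variables. The import path `pipecat.services.deepgram.tts` and the `api_key` and `voice` keyword arguments are assumptions that may vary between Pipecat releases, so check them against the version you have installed. The helper names are hypothetical, and reading `DEEPGRAM_VOICE_ID` is borrowed from the scaffolded project rather than the quickstart.

```python
import os

# Voice used elsewhere in this guide; other Deepgram voice IDs can be used.
DEFAULT_VOICE = "aura-2-thalia-en"


def deepgram_tts_kwargs(env=None):
    """Collect constructor arguments for the Deepgram TTS service."""
    env = os.environ if env is None else env
    return {
        "api_key": env["DEEPGRAM_API_KEY"],
        # Fall back to the default when the variable is unset or empty.
        "voice": env.get("DEEPGRAM_VOICE_ID") or DEFAULT_VOICE,
    }


def build_deepgram_tts(env=None):
    """Create the TTS service that takes the place of the Cartesia one."""
    # Imported lazily so the helper above works without Pipecat installed.
    # Assumed import path and keyword names; verify against your version.
    from pipecat.services.deepgram.tts import DeepgramTTSService

    return DeepgramTTSService(**deepgram_tts_kwargs(env))
```

In bot.py, the result of `build_deepgram_tts()` would be assigned to the same variable that previously held the Cartesia service, so the pipeline definition itself does not change.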
Continue building by adding a Pipecat Client
Use Flux for turn detection
Flux is Deepgram’s conversational STT model with built-in turn detection. It uses acoustic and semantic cues to determine when a speaker has finished their turn, resulting in more natural conversations.
To use Flux, replace DeepgramSTTService with DeepgramFluxSTTService in your bot.py:
Because Flux detects the start and end of user turns itself, use ExternalUserTurnStrategies so that Flux handles turn management. See User Turn Strategies for configuration details.
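The class swap might look like the sketch below. The import path `pipecat.services.deepgram.flux.stt` and the `api_key` keyword are assumptions to verify against your installed Pipecat version, and `build_flux_stt` is a hypothetical helper name. Wiring up ExternalUserTurnStrategies is deliberately left out; follow the User Turn Strategies documentation for that part.

```python
import os


def build_flux_stt(api_key=None):
    """Create a Flux STT service to use where DeepgramSTTService was."""
    # Imported lazily so this module loads even without Pipecat installed.
    # Assumed import path; it may differ between Pipecat releases.
    from pipecat.services.deepgram.flux.stt import DeepgramFluxSTTService

    # Fall back to the environment variable when no key is passed in.
    return DeepgramFluxSTTService(
        api_key=api_key or os.environ["DEEPGRAM_API_KEY"]
    )
```

The returned service would replace the existing STT instance in the pipeline; the LLM and TTS stages stay as they are.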
Scaffold a new Pipecat project with the CLI
Step 1: Create the project
Scaffold a new project using the Pipecat CLI.
Step 2: Install dependencies
Navigate to the server directory inside your new project, create a virtual environment, and install the dependencies:
Step 3: Configure your environment
Copy the example environment file, then fill in your API keys and set the default Deepgram voice:
Replace the placeholder values with your API keys:
- DEEPGRAM_API_KEY: from your Deepgram Console
- DEEPGRAM_VOICE_ID: set to aura-2-thalia-en, or choose another voice from Deepgram Voices & Languages. Leaving this value empty may result in a 400 error.
- OPENAI_API_KEY: from your OpenAI dashboard
- DAILY_API_KEY: from your Daily dashboard
The remaining values are defaults you can change later.
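Because an empty value such as DEEPGRAM_VOICE_ID can surface later as a request error, it can help to check the file before running the agent. The following is a minimal sketch using only the standard library. The function names are hypothetical, and the parser handles only plain KEY=VALUE lines, not the full syntax that dotenv tools accept.

```python
# Keys this guide asks you to fill in.
REQUIRED_KEYS = (
    "DEEPGRAM_API_KEY",
    "DEEPGRAM_VOICE_ID",
    "OPENAI_API_KEY",
    "DAILY_API_KEY",
)


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return required keys that are absent or empty in the .env text."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]
```

For example, `missing_keys(open(".env").read())` returns the names that still need values. It does not detect placeholder text that was never replaced, so review those entries by hand.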
Step 4: Run the agent
Start the bot from the server directory:
Step 5: Test the conversation
Open the local URL printed in your terminal, then:
- Select Daily from the Transport list and click Connect.
- Allow microphone access and speak to your agent.
- Ask a question and confirm the agent responds with speech.
- Speak while the agent is talking — it should stop and listen.
- Pause after speaking — the agent should detect the end of your turn and respond.
Continue building by adding a Pipecat Client
Next Steps
Continue building with an agent.
Follow the Pipecat getting started guide and set up the Pipecat Context Hub.
Prompt your agent to add a Pipecat client framework.
Example prompt:
Go further with Deepgram
- Voices: Deepgram offers 60+ voices across seven languages. Browse the voice library and update DEEPGRAM_VOICE_ID in your .env file.
- Keyterm prompting: Improve recognition of domain-specific vocabulary by passing keyterms to Nova-3 via the STT service settings.
- Speaker diarization: Assign a speaker identifier to each word in the transcript using diarization via the STT service settings.
- Dynamic STT settings: Pipecat supports updating Deepgram STT settings without reconnecting. See the Pipecat Deepgram STT guide for details.
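To make the keyterm and diarization options above more concrete, the sketch below shows how they look as Deepgram streaming query parameters (`model`, `keyterm`, `diarize`). It is only an illustration of the underlying options: the helper name is hypothetical, the example keyterms are made up, and how Pipecat's STT service settings map onto these parameters is covered in the Pipecat Deepgram STT guide, not here.

```python
from urllib.parse import quote, urlencode


def deepgram_listen_query(keyterms, diarize=False, model="nova-3"):
    """Build a query string carrying keyterm and diarization options."""
    params = [("model", model)]
    if diarize:
        params.append(("diarize", "true"))
    # Each keyterm is sent as its own repeated "keyterm" parameter.
    params.extend(("keyterm", term) for term in keyterms)
    # quote_via=quote encodes spaces as %20 rather than "+".
    return urlencode(params, quote_via=quote)


query = deepgram_listen_query(["Pipecat", "Deepgram Flux"], diarize=True)
print(query)
```

Multi-word keyterms stay intact as a single parameter value, which is why the space is percent-encoded instead of splitting the phrase.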