Build a Voice Agent

Learn how to build a real-time voice agent using Deepgram’s Agent API.

Deepgram’s Voice Agent API uses a single WebSocket connection to handle the entire conversational loop. The API integrates speech-to-text, a large language model (LLM), and text-to-speech into one stream.

How it works

Building a voice agent involves four main steps over a WebSocket:

  1. Open a connection: Connect to the Deepgram Agent endpoint using a supported SDK or a WebSocket client.
  2. Configure the agent: Send a Settings message to define the models, voices, and behavior.
  3. Stream audio: Send raw audio data to the agent.
  4. Handle events: Listen for transcripts, agent responses, and audio output.
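The four steps above can be sketched with a few standard-library helpers. This is a minimal illustration rather than the SDK's implementation: the message name Settings and the endpoint URL come from this page, but the nested JSON field names, the placeholder model values, the chunk size, and the helper names (build_settings_message, chunk_audio, handle_message) are assumptions made for the example. It also assumes that agent audio arrives as binary frames and events as JSON text frames carrying a "type" field. Consult the API Reference for the actual schema.

```python
import json
from typing import Iterator, Tuple, Union

# Step 1: the URL a WebSocket client would connect to.
# This is the EU endpoint quoted on this page.
AGENT_URL = "wss://api.eu.deepgram.com/v1/agent/converse"


def build_settings_message(prompt: str) -> str:
    """Step 2: serialize a Settings message to send as a text frame.

    The nested layout below is a hypothetical placeholder, not the
    official schema.
    """
    settings = {
        "type": "Settings",
        "agent": {
            "listen": {"model": "<stt-model>"},   # speech-to-text
            "think": {"model": "<llm-model>", "prompt": prompt},  # LLM
            "speak": {"model": "<tts-voice>"},    # text-to-speech
        },
    }
    return json.dumps(settings)


def chunk_audio(pcm: bytes, chunk_size: int = 3200) -> Iterator[bytes]:
    """Step 3: split raw audio into chunks, each sent as one binary frame."""
    for start in range(0, len(pcm), chunk_size):
        yield pcm[start:start + chunk_size]


def handle_message(message: Union[str, bytes]) -> Tuple[str, object]:
    """Step 4: route an incoming frame to audio playback or event handling."""
    if isinstance(message, bytes):
        return ("audio", len(message))  # assumed: binary frames are agent audio
    event = json.loads(message)         # assumed: text frames are JSON events
    return ("event", event.get("type", "unknown"))
```

A WebSocket client would open AGENT_URL, send the result of build_settings_message once, then send each chunk from chunk_audio while passing every received frame to handle_message.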

The Voice Agent API is available on the EU endpoint at wss://api.eu.deepgram.com/v1/agent/converse. See Regional Endpoints for details.

Choose your language

Select a language to start building your voice agent. Each tutorial provides a complete, end-to-end implementation.

  • Python: Build a voice agent using the Deepgram Python SDK.
  • JavaScript: Build a voice agent using the Deepgram JavaScript SDK.
  • C#: Build a voice agent using the Deepgram .NET SDK.
  • Go: Build a voice agent using the Deepgram Go SDK.

Next steps

Once you understand the basics, you can explore more advanced configurations:

  • Browser Agent Overview: Add voice AI to your web applications.
  • Configure the Voice Agent: Learn about all available settings for models, voices, and audio formats.
  • API Reference: View the full WebSocket protocol specification.

Implementation examples

Check out these repositories for more complex voice agent implementations:

Use case            Runtime / Language            Repo
Basic demo          Node, TypeScript, JavaScript  Deepgram Voice Agent Demo
Medical assistant   Node, TypeScript, JavaScript  Medical Assistant Demo
Twilio integration  Python                        Twilio Voice Agent Demo
Text input demo     Node, TypeScript, JavaScript  Conversational AI Demo
Azure OpenAI        Python                        Voice Agent with OpenAI Azure
Function calling    Python / Flask                Flask Agent Function Calling Demo

Rate limits

For information on concurrency limits, refer to the API Rate Limits documentation.

Usage tracking

Deepgram calculates usage based on WebSocket connection time. One hour of connection time equals one hour of API usage.
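As a rough illustration of that rule, the sketch below converts connection durations into hours of usage. The function name and the session durations are made up for the example, and it assumes no rounding or minimum increment, which this page does not specify.

```python
def usage_hours(connection_seconds: float) -> float:
    """Convert WebSocket connection time in seconds to hours of API usage."""
    return connection_seconds / 3600.0


# Three hypothetical sessions lasting 15, 30, and 45 minutes.
session_seconds = [900, 1800, 2700]
total_hours = sum(usage_hours(s) for s in session_seconds)
print(total_hours)  # 1.5
```

Because usage follows connection time, closing idle connections promptly keeps the total down.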
