Getting Started
Build real-time, interactive voice agents powered by Deepgram’s speech-to-text, LLM integration, and text-to-speech, all over a single WebSocket connection.
Deepgram’s Voice Agent API handles the full speech pipeline (listening, thinking, and speaking) so you can focus on what your agent does, not how it hears or talks.
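The single-WebSocket flow can be sketched as: open the socket, send a configuration message, then stream audio up while reading agent audio and JSON events back down. This is a minimal illustration, not a definitive client: the endpoint URL, header name, and message handling below are assumptions to verify against the current API reference, and it uses the third-party `websockets` package.

```python
# Minimal sketch of the single-WebSocket agent loop.
# AGENT_URL and the "Token" auth scheme are assumptions -- confirm both
# against the current Voice Agent API reference before relying on them.
import asyncio
import json
import os

AGENT_URL = "wss://agent.deepgram.com/v1/agent/converse"  # assumed endpoint


async def run_agent(settings: dict) -> None:
    import websockets  # third-party: pip install websockets

    headers = {"Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}"}
    # Note: older websockets versions call this parameter `extra_headers`.
    async with websockets.connect(AGENT_URL, additional_headers=headers) as ws:
        # 1. Configure the listen/think/speak pipeline before sending audio.
        await ws.send(json.dumps(settings))
        # 2. Read the duplex stream: binary frames are agent speech,
        #    text frames are JSON events (transcripts, state changes, errors).
        async for message in ws:
            if isinstance(message, bytes):
                pass  # TTS audio -- feed to your playback device
            else:
                print(json.loads(message))
```

In a real client you would also run a second task that reads microphone audio and sends it as binary frames on the same socket; the point here is that configuration, input audio, events, and output audio all share one connection.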
Step-by-step guide to creating your first voice agent with Python, JavaScript, C#, or Go using the server-side SDKs and WebSocket API.
Set up speech-to-text models, LLM providers, TTS voices, endpointing, and audio formats for your voice agent.
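As a rough picture of what that configuration covers, here is an illustrative settings payload wiring a speech-to-text model, an LLM provider, a TTS voice, and audio formats together. The field names and model identifiers are assumptions based on the documented Settings message shape; check them against the current API reference rather than copying them verbatim.

```python
# Illustrative Settings payload for a voice agent. Field names and model
# names (nova-3, gpt-4o-mini, aura-2-thalia-en) are assumptions -- verify
# against the current configuration reference.
import json

settings = {
    "type": "Settings",
    "audio": {
        # Raw PCM in at 16 kHz, agent speech out at 24 kHz.
        "input": {"encoding": "linear16", "sample_rate": 16000},
        "output": {"encoding": "linear16", "sample_rate": 24000},
    },
    "agent": {
        # Listening: the speech-to-text model.
        "listen": {"provider": {"type": "deepgram", "model": "nova-3"}},
        # Thinking: the LLM provider plus the system prompt.
        "think": {
            "provider": {"type": "open_ai", "model": "gpt-4o-mini"},
            "prompt": "You are a concise, friendly phone assistant.",
        },
        # Speaking: the TTS voice.
        "speak": {"provider": {"type": "deepgram", "model": "aura-2-thalia-en"}},
    },
}

# Serialized form, ready to send as the first message on the socket.
payload = json.dumps(settings)
```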
Let your agent call external APIs and tools mid-conversation.
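In outline, mid-conversation tool use means registering a JSON Schema description of each tool, then answering the agent's function-call events with results. The sketch below is hypothetical throughout: the `get_weather` tool is invented for illustration, and the `FunctionCallRequest`/`FunctionCallResponse` event shapes are assumptions to confirm against the protocol reference.

```python
# Hedged sketch of mid-conversation function calling.
# The tool, handler, and event field names are illustrative assumptions.
import json

WEATHER_TOOL = {
    "name": "get_weather",  # hypothetical tool
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(city: str) -> str:
    # Stub; a real handler would call an external weather API here.
    return f"Sunny and 21 C in {city}"


HANDLERS = {"get_weather": get_weather}


def handle_function_call(event: dict) -> dict:
    """Run the requested tool and build a response message for the agent.

    Assumes the request carries a `functions` list whose entries have
    `id`, `name`, and JSON-encoded `arguments` fields.
    """
    func = event["functions"][0]
    result = HANDLERS[func["name"]](**json.loads(func["arguments"]))
    return {
        "type": "FunctionCallResponse",
        "id": func["id"],
        "name": func["name"],
        "content": result,
    }
```

The response would be serialized and sent back on the same WebSocket, after which the agent folds the result into its spoken reply.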
Build systems where multiple specialized agents hand off conversations based on context, intent, or domain expertise.
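A toy version of that handoff logic is a router that inspects the caller's utterance and picks a specialist agent configuration. The agent names and keywords below are invented for illustration; a production system would more likely let the LLM trigger the handoff via a dedicated tool rather than keyword matching.

```python
# Toy intent router for multi-agent handoff (names/keywords are invented).
# Each entry would map to a distinct agent configuration (prompt + tools).
AGENTS = {
    "billing": "Prompt and tools for billing questions.",
    "support": "Prompt and tools for technical troubleshooting.",
    "general": "Fallback generalist configuration.",
}

INTENT_KEYWORDS = {
    "billing": ("invoice", "refund", "charge"),
    "support": ("error", "crash", "bug"),
}


def route(utterance: str) -> str:
    """Return the name of the agent that should own this turn."""
    text = utterance.lower()
    for agent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return agent
    return "general"
```

On a handoff, the conversation context travels with the call so the specialist agent does not restart the dialogue from scratch.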
Explore Further
Save and reuse agent settings across projects.
Connect voice agents to phone networks for inbound and outbound calls.
Add voice AI to any web application. Four composable packages, from a single script tag to a fully custom React interface.
Write effective system prompts that shape how your agent behaves on a live call.
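For a sense of what "effective" looks like in a live-call setting, here is one possible shape for such a prompt: persona first, then voice-specific constraints, then escalation rules. The wording and the assistant's name are entirely illustrative.

```python
# Illustrative system prompt for a live voice call (content is invented).
# Note the voice-first constraints: short sentences, nothing unspeakable.
SYSTEM_PROMPT = """\
You are Aura, a voice assistant for Acme Dental.
Speak in short sentences; this is a live phone call, not a chat window.
Never read URLs or long numbers aloud; offer to text them instead.
If the caller asks for a human, say you will transfer them and stop.
"""
```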
Clone a working voice agent project and start building.
Complete WebSocket protocol reference for the Agent API.
Full list of Voice Agent API capabilities.