Use Deepgram's Voice Agent API to build an interactive voice agent.
To learn more about building a Voice Agent see Get Started with the Voice Agent API.
Endpoint
Path: wss://agent.deepgram.com/agent |
Accepts
Type | Description |
---|---|
Messages | JSON formatted operations |
Headers
Header | Value | Description |
---|---|---|
Sec-WebSocket-Protocol | token, <DEEPGRAM_API_KEY> | Used to establish a WebSocket connection with a specific protocol, include your Deepgram API key for authentication |
Messages
Settings Configuration
The SettingsConfiguration
message is a JSON command that serves as an initialization step, setting up both the behavior of the voice agent and the audio transmission formats before any actual voice data is exchanged over the websocket connection. Learn More
{
"type": "SettingsConfiguration",
"audio": {
"input": { // optional. defaults to 16kHz linear16
"encoding": "",
"sample_rate": 24000 // defaults to 24k
},
"output": { // optional. see table below for defaults and allowed combinations
"encoding": "",
"sample_rate": 24000, // defaults to 24k
"bitrate": 0,
"container": ""
}
},
"agent": {
"listen": {
"model": "" // optional. default 'nova-2'
},
"think": {
"provider": {
"type": "" // see `LLM providers and models`
},
"model": "", // see `LLM providers and models`
"instructions": "", // optional (this is the LLM System prompt)
"functions": [
{
"name": "", // function name
"description": "", // tells the agent what the function does, and how and when to use it
"url": "", // the endpoint where your function will be called
"headers": [ // optional. Http headers to pass when calling the function. Only supports 'authorization'
{
"key": "authorization",
"value": ""
}
],
"method": "post", // the http method to use when calling the function
"parameters": {
"type": "object",
"properties": {
"item": { // the name of the input property
"type": "string", // the type of the input
"description":"" // the description of the input so the agent understands what it is
}
},
"required": ["item"] // the list of required input properties for this function to be called
}
}
]
},
"speak": {
"model": "" // default 'aura-asteria-en' for other providers see TTS Models documentation
}
},
"context": {
"messages": [], // LLM message history (e.g. to restore existing conversation if websocket connection breaks)
"replay": false // whether to replay the last message, if it is an assistant message
}
}
Update Instructions
The UpdateInstructions
message is a JSON command that you can use to give additional instructions to the Think model in the middle of a conversation. Learn More.
{
"type": "UpdateInstructions",
"instructions": "" // The new instructions to give
}
Update Speak
TheUpdateSpeak
message is a JSON command that you can use to change the Speak model in the middle of a conversation. Learn More.
{
"type": "UpdateSpeak",
"model": "" // The new Speak model
}
Inject Agent Message
To send the InjectAgentMessage
message, you need to send the following JSON message to the server. Learn More.
{
"type": "InjectAgentMessage",
"message": "" // The statement the agent should say
}
Function Call Response
To send the FunctionCallResponse
message, you need to send the following JSON message to the server. Learn More.
{
"type": "todo"
}
Agent Keep Alive
To send the KeepAlive
message, you need to send the following JSON message to the serve. Learn More.
{
"type": "KeepAlive"
}
Server Messages
The Voice Agent API send different
JSON
Server Messages over the websocket. Please refer to the Documentation.
Keeping Connections Open
This Agent websocket supports keep alive messages to keep the connection open. To do so, send a keepAlive
message to the websocket. Learn more.
Errors and Warnings
WebSocket Close frame are not used. Closure reasons are conveyed through response messages with type Error
.
Errors
{
"type": "Error",
"message": "Text message received from client did not match any of the formats we expect."
}