Configure the Voice Agent
Learn about the configuration options for the voice agent and for both input and output audio.
To configure the voice agent, send a `SettingsConfiguration` message immediately after opening the websocket connection. Below is a detailed explanation of the available configuration options.
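To illustrate the flow, here is a minimal sketch in Python using the third-party `websockets` package. The endpoint URL, auth header format, and model names are assumptions for illustration only; verify them against the current API reference.

```python
# Minimal sketch: open the agent websocket and send SettingsConfiguration
# as the very first message on the socket. The endpoint URL and auth header
# format are assumptions; check the API reference for current values.
import asyncio
import json

import websockets

SETTINGS = {
    "type": "SettingsConfiguration",
    "audio": {"input": {"encoding": "linear16", "sample_rate": 16000}},
    "agent": {
        "listen": {"model": "nova-2"},
        "think": {"provider": {"type": "open_ai"}, "model": "gpt-4o-mini"},
        "speak": {"model": "aura-asteria-en"},
    },
}

async def main():
    uri = "wss://agent.deepgram.com/agent"  # assumed endpoint
    headers = {"Authorization": "Token YOUR_DEEPGRAM_API_KEY"}
    # Note: websockets >= 13 renamed this keyword to `additional_headers`.
    async with websockets.connect(uri, extra_headers=headers) as ws:
        await ws.send(json.dumps(SETTINGS))  # must be the first message
        # ...then start streaming audio and handling server events.

asyncio.run(main())
```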
Configuration Parameters
Parameter | Description |
---|---|
`audio.input` | The speech-to-text audio media input you wish to send. See options. |
`audio.output` | The text-to-speech audio media output you wish to receive. See options. |
`agent.listen.model` | The Deepgram speech-to-text model to be used. See options. |
`agent.think.model` | Defines the LLM model to be used. See options. |
`agent.think.provider.type` | Defines the LLM provider. See options. |
`agent.think.instructions` | Defines the system prompt for the LLM. |
`agent.think.functions` | Passes functions to the agent that will be called throughout the conversation if/when needed, as described per function. See options. |
`agent.speak.model` | The Deepgram text-to-speech model to be used. See options. |
`context.messages` | Used to restore an existing conversation if the websocket connection breaks (see the sketch after this table). |
`context.replay` | Used to replay the last message, if it is an assistant message. |
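For example, to resume after a dropped socket you can resend your settings with the prior turns included in `context.messages`. The role/content message shape below is an assumption modeled on common LLM chat formats; confirm the exact schema in the API reference.

```python
# Hypothetical reconnect payload: prior turns are supplied in context.messages
# so the agent can pick up where the conversation left off. The role/content
# shape is an assumed format, not confirmed by this page.
settings_with_context = {
    "type": "SettingsConfiguration",
    # ...audio and agent blocks as in the full example below...
    "context": {
        "messages": [
            {"role": "user", "content": "What are your store hours?"},
            {"role": "assistant", "content": "We're open 9 to 5, Monday through Friday."},
        ],
        "replay": True,  # re-speak the last assistant message after reconnecting
    },
}
```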
Full Example
Below is an in-depth example showing all the available fields for `SettingsConfiguration`, highlighting default values and optional properties.
```json
{
  "type": "SettingsConfiguration",
  "audio": {
    "input": { // optional. defaults to 16kHz linear16
      "encoding": "",
      "sample_rate": 24000 // example value; input defaults to 16000 (16 kHz)
    },
    "output": { // optional. see table below for defaults and allowed combinations
      "encoding": "",
      "sample_rate": 24000, // defaults to 24k
      "bitrate": 0,
      "container": ""
    }
  },
  "agent": {
    "listen": {
      "model": "" // optional. defaults to 'nova-2'
    },
    "think": {
      "provider": {
        "type": "" // see `LLM providers and models` table below
      },
      "model": "", // see `LLM providers and models` table below
      "instructions": "", // optional (this is the LLM system prompt)
      "functions": [
        {
          "name": "", // function name
          "description": "", // tells the agent what the function does, and how and when to use it
          "url": "", // the endpoint where your function will be called
          "headers": [ // optional. HTTP headers to pass when calling the function. Only supports 'authorization'
            {
              "key": "authorization",
              "value": ""
            }
          ],
          "method": "post", // the HTTP method to use when calling the function
          "parameters": {
            "type": "object",
            "properties": {
              "item": { // the name of the input property
                "type": "string", // the type of the input
                "description": "" // the description of the input so the agent understands what it is
              }
            },
            "required": ["item"] // the list of required input properties for this function to be called
          }
        }
      ]
    },
    "speak": {
      "model": "" // optional. defaults to 'aura-asteria-en'; for other providers, see the TTS Models documentation
    }
  },
  "context": {
    "messages": [], // LLM message history (e.g. to restore an existing conversation if the websocket connection breaks)
    "replay": false // whether to replay the last message, if it is an assistant message
  }
}
```
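Once the settings are sent, you will typically want to wait for the server's acknowledgement before streaming audio. The sketch below continues the connection example above; the `SettingsApplied` and `Error` event names are assumptions, so verify them against the Voice Agent API reference.

```python
import json

async def wait_for_settings_applied(ws):
    """Block until the server acknowledges the settings (assumed event names)."""
    async for raw in ws:
        if isinstance(raw, bytes):
            continue  # binary frames carry audio, not JSON events
        event = json.loads(raw)
        if event.get("type") == "SettingsApplied":
            return event
        if event.get("type") == "Error":
            raise RuntimeError(f"agent returned an error: {event}")
```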