Configure the Voice Agent
Learn about the configuration options for the voice agent, including both input and output audio.
To configure the voice agent, you'll need to send a Settings Configuration message immediately after the connection is opened. Below is a detailed explanation of the available configuration options.
Configuration Parameters
| Parameter | Description |
| --- | --- |
| `audio.input` | The speech-to-text audio media input you wish to send. The audio must be uncontainerized and use one of the following encodings: `linear16`, `alaw`, or `mulaw`. |
| `audio.output` | The text-to-speech audio media output you wish to receive. See options. |
| `agent.listen.model` | The Deepgram speech-to-text model to be used. See options. |
| `agent.listen.keyterms` | Keyterms you want increased recognition for. See more. |
| `agent.think.model` | Defines the LLM model to be used. See options. |
| `agent.think.provider.type` | Defines the LLM provider. See options. |
| `agent.think.instructions` | Defines the system prompt for the LLM. |
| `agent.think.functions` | Passes functions to the agent that will be called throughout the conversation if/when needed, as described per function. See options. |
| `agent.speak.model` | The Deepgram text-to-speech model to be used. See options. |
| `context.messages` | Used to restore an existing conversation if the WebSocket connection breaks. |
| `context.replay` | Used to replay the last message, if it is an assistant message. |
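As a minimal sketch of how these parameters fit together, the helper below builds a Settings Configuration payload in Python. The field names and the default model names (`nova-3`, `aura-asteria-en`) come from this page; the provider type and LLM model shown (`open_ai`, `gpt-4o-mini`) are placeholder assumptions — substitute the provider and model enabled for your account.

```python
import json

def build_settings(instructions: str) -> str:
    """Build a minimal SettingsConfiguration message as a JSON string.

    Field names follow the parameter table above. The listen/speak models
    are the documented defaults, written out explicitly for clarity; the
    think provider/model values are placeholder assumptions.
    """
    settings = {
        "type": "SettingsConfiguration",
        "audio": {
            "input": {"encoding": "linear16", "sample_rate": 16000},
            "output": {"encoding": "linear16", "sample_rate": 24000},
        },
        "agent": {
            "listen": {"model": "nova-3"},
            "think": {
                "provider": {"type": "open_ai"},  # assumption: pick your provider
                "model": "gpt-4o-mini",           # assumption: pick your LLM model
                "instructions": instructions,     # the LLM system prompt
            },
            "speak": {"model": "aura-asteria-en"},
        },
    }
    return json.dumps(settings)
```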
Full Example
Below is an in-depth example showing all the fields available for SettingsConfiguration, highlighting default values and optional properties.
```json
{
  "type": "SettingsConfiguration",
  "audio": {
    "input": { // optional. defaults to 16kHz linear16
      "encoding": "",
      "sample_rate": 24000 // defaults to 24k
    },
    "output": { // optional. see table below for defaults and allowed combinations
      "encoding": "",
      "sample_rate": 24000, // defaults to 24k
      "bitrate": 0,
      "container": ""
    }
  },
  "agent": {
    "listen": {
      "model": "", // optional. default 'nova-3'
      "keyterms": [] // optional. only available on nova-3 models
    },
    "think": {
      "provider": {
        "type": "" // see `LLM providers and models` table below
      },
      "model": "", // see `LLM providers and models` table below
      "instructions": "", // optional (this is the LLM system prompt)
      "functions": [
        {
          "name": "", // function name
          "description": "", // tells the agent what the function does, and how and when to use it
          "url": "", // the endpoint where your function will be called
          "headers": [ // optional. HTTP headers to pass when calling the function. Only supports 'authorization'
            {
              "key": "authorization",
              "value": ""
            }
          ],
          "method": "post", // the HTTP method to use when calling the function
          "parameters": {
            "type": "object",
            "properties": {
              "item": { // the name of the input property
                "type": "string", // the type of the input
                "description": "" // the description of the input so the agent understands what it is
              }
            },
            "required": ["item"] // the list of required input properties for this function to be called
          }
        }
      ]
    },
    "speak": {
      "model": "" // default 'aura-asteria-en'. for other providers, see the TTS Models documentation
    }
  },
  "context": {
    "messages": [], // LLM message history (e.g. to restore an existing conversation if the WebSocket connection breaks)
    "replay": false // whether to replay the last message, if it is an assistant message
  }
}
```