Configure the Voice Agent

Learn about the configuration options for the Voice Agent, covering agent behavior and both input and output audio.

You will need to migrate to the new Voice Agent API V1 to continue to use the Voice Agent API. Please refer to the Voice Agent API Migration Guide for more information.

To configure your Voice Agent, you’ll need to send a Settings message immediately after connection. This message configures the agent’s behavior, input/output audio formats, and various provider settings.

For more information on the Settings message, see the Voice Agent API Reference.
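As a quick sketch of the shape of that first message, the snippet below builds a minimal Settings payload and serializes it for sending as the first WebSocket frame. The field values here are illustrative; see the tables below and the API reference for the full set of options.

```python
import json

# Minimal Settings payload; every field shown is documented in the tables below.
settings = {
    "type": "Settings",
    "audio": {
        "input": {"encoding": "linear16", "sample_rate": 16000},
        "output": {"encoding": "linear16", "sample_rate": 24000},
    },
    "agent": {
        "language": "en",
        "listen": {"provider": {"type": "deepgram", "model": "nova-3"}},
        "think": {"provider": {"type": "open_ai", "model": "gpt-4o-mini"}},
        "speak": {"provider": {"type": "deepgram", "model": "aura-2-thalia-en"}},
    },
}

# This JSON string is what you would send immediately after the
# WebSocket connection opens, before any audio.
settings_frame = json.dumps(settings)
```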

Settings Overview

The Settings message is a JSON object that contains the following fields:

Settings

| Parameter | Type | Description |
| --- | --- | --- |
| `type` | String | Must be `"Settings"` to indicate this is a settings configuration message |
| `experimental` | Boolean | Enables experimental features. Defaults to `false` |
| `mip_opt_out` | Boolean | Opts out of the Deepgram Model Improvement Program (MIP). Defaults to `false` |
| `flags.history` | Boolean | Defaults to `true`. Set to `false` to disable function call history |

Audio

| Parameter | Type | Description |
| --- | --- | --- |
| `audio.input` | Object | The speech-to-text audio media input configuration |
| `audio.input.encoding` | String | The encoding format for the input audio. Defaults to `linear16` |
| `audio.input.sample_rate` | Integer | The sample rate in Hz for the input audio. Defaults to `16000` |
| `audio.output` | Object | The text-to-speech audio media output configuration |
| `audio.output.encoding` | String | The encoding format for the output audio |
| `audio.output.sample_rate` | Integer | The sample rate in Hz for the output audio |
| `audio.output.bitrate` | Integer | The bitrate in bits per second for the output audio |
| `audio.output.container` | String | The container format for the output audio. Defaults to `none` |
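Because `linear16` is uncompressed 16-bit PCM, the input bandwidth follows directly from the sample rate (2 bytes per sample per channel). A small sketch for sizing audio buffers; the helper name is ours, not part of the API:

```python
def linear16_bytes_per_second(sample_rate: int, channels: int = 1) -> int:
    """Raw linear16 PCM uses 2 bytes (16 bits) per sample per channel."""
    return sample_rate * 2 * channels

# Default input rate (16000 Hz) versus the 24000 Hz used in the full example below:
print(linear16_bytes_per_second(16000))  # 32000 bytes/s
print(linear16_bytes_per_second(24000))  # 48000 bytes/s
```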

Agent

| Parameter | Type | Description |
| --- | --- | --- |
| `agent.language` | String | The language code for the agent. Defaults to `en` |
| `agent.context` | Object | Optional conversation context, including a history of messages and function calls |
| `agent.context.messages` | Array | Array of previous conversation messages and function calls to provide context to the agent |
| `agent.listen.provider.type` | String | The speech-to-text provider type. Currently only `deepgram` is supported |
| `agent.listen.provider.model` | String | The Deepgram speech-to-text model to use |
| `agent.listen.provider.keyterms` | Array | The keyterms you want increased recognition for |
| `agent.think.provider.type` | String | The LLM provider type, e.g. `open_ai`, `anthropic`, `x_ai` |
| `agent.think.provider.model` | String | The LLM model to use |
| `agent.think.provider.temperature` | Number | Controls the randomness of the LLM's output. Range: 0-2 for OpenAI, 0-1 for Anthropic |
| `agent.think.endpoint` | Object | Optional if the LLM provider is Deepgram; required for non-Deepgram LLM providers. When present, must include a `url` field and a `headers` object |
| `agent.think.functions` | Array | Array of functions the agent can call during the conversation |
| `agent.think.functions.endpoint` | Object | The function endpoint to call. If not provided, the function is called client-side |
| `agent.think.prompt` | String | The system prompt that defines the agent's behavior and personality |
| `agent.think.context_length` | Integer or String | The number of characters retained in context across user messages, agent responses, and function calls. Only configurable when a custom think endpoint is used. Use `max` for the maximum context length |
| `agent.speak.provider.type` | String | The TTS provider type, e.g. `deepgram`, `eleven_labs`, `cartesia`, `open_ai`, `aws_polly` |
| `agent.speak.provider.model` | String | The TTS model to use for Deepgram or OpenAI |
| `agent.speak.provider.model_id` | String | The TTS model ID to use for Eleven Labs or Cartesia |
| `agent.speak.provider.voice` | Object or String | Voice configuration. For Cartesia: an object with `mode` and `id`. For OpenAI: a string value |
| `agent.speak.provider.language` | String | Optional language setting for the Cartesia provider |
| `agent.speak.provider.language_code` | String | Optional language code for the Eleven Labs provider |
| `agent.speak.provider.engine` | String | Optional engine for the AWS Polly provider |
| `agent.speak.provider.credentials` | Object | Optional credentials for the AWS Polly provider. When present, must include `type`, `region`, `access_key_id`, `secret_access_key`, and `session_token` if STS is used |
| `agent.speak.endpoint` | Object | Optional if the TTS provider is Deepgram; required for non-Deepgram TTS providers. When present, must include a `url` field and a `headers` object |
| `agent.greeting` | String | Optional initial message that the agent speaks when the conversation starts |

agent.language

  • Choose your language setting based on your use case:
    • If you know your input language, specify it directly for the best recognition accuracy.
    • If you expect multiple languages or are unsure, use multi for flexible language support.
    • Currently multi is only supported with Eleven Labs TTS.
    • Refer to our supported languages to ensure you’re using the correct model (Nova-2 or Nova-3) for your selected language.

agent.think.context_length

  • Using max will set the context length to the maximum allowed based on the LLM provider you use. If the total context exceeds the model’s maximum, truncation is handled by the LLM provider.
  • Increasing the context length may help preserve multi-turn conversation history, especially when verbose function calls inflate the total context.
  • All characters sent to the LLM count toward the context limit, including fully serialized JSON messages, function call arguments, and responses. System messages are excluded and managed separately via agent.think.prompt.
  • The default context length set by Deepgram is optimized for cost and latency. It is not recommended to change this setting unless there’s a clear need.
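Since all characters sent to the LLM count toward the limit, you can roughly estimate context usage by measuring the serialized history entries before deciding whether to raise `context_length`. A sketch under that assumption; the helper is illustrative, not part of the API:

```python
import json

def estimate_context_chars(messages: list) -> int:
    """Approximate context usage as the length of each fully serialized entry."""
    return sum(len(json.dumps(m)) for m in messages)

history = [
    {"type": "History", "role": "user", "content": "What's my order status?"},
    {"type": "History", "role": "assistant", "content": "It shipped yesterday."},
]

# Compare the estimate against the context_length you plan to configure.
print(estimate_context_chars(history))
```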

agent.context

  • The agent.context object allows you to provide conversation history to the agent when starting a new session. This is useful for continuing conversations or providing background context.
  • The agent.context.messages array contains conversation history entries, which can be either conversational messages or function calls.
  • Conversational messages have the format: {"type": "History", "role": "user" | "assistant", "content": "message text"}
  • Function call messages have the format: {"type": "History", "function_calls": [{"id": "unique_id", "name": "function_name", "client_side": true/false, "arguments": "json_string", "response": "response_text"}]}
  • Use this feature to maintain conversation continuity across sessions or to provide the agent with relevant background information.
  • To disable function call history, set settings.flags.history to false in the Settings message.
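The two entry shapes above can be generated with small builder helpers; this is a sketch (the helper names are ours), but the dictionaries they return match the documented `History` formats, including `arguments` being a JSON-encoded string rather than a nested object:

```python
import json

def history_message(role: str, content: str) -> dict:
    """A conversational History entry; role is 'user' or 'assistant'."""
    return {"type": "History", "role": role, "content": content}

def history_function_call(call_id: str, name: str, arguments: dict,
                          response: str, client_side: bool = True) -> dict:
    """A function-call History entry; arguments are serialized to a JSON string."""
    return {
        "type": "History",
        "function_calls": [{
            "id": call_id,
            "name": name,
            "client_side": client_side,
            "arguments": json.dumps(arguments),
            "response": response,
        }],
    }

# Assemble the agent.context object to include in Settings.
context = {
    "messages": [
        history_message("user", "What's my order status?"),
        history_function_call("fc_1", "check_order_status",
                              {"order_id": "ORD-123456"},
                              "Order #123456 status: Shipped"),
        history_message("assistant", "Your order has shipped."),
    ]
}
```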

Full Example

Below is an in-depth example showing all the available fields for Settings, including the optional provider-specific settings.

```json
{
  "type": "Settings",
  "experimental": false,
  "mip_opt_out": false,
  "flags": {
    "history": true
  },
  "audio": {
    "input": {
      "encoding": "linear16",
      "sample_rate": 24000
    },
    "output": {
      "encoding": "mp3",
      "sample_rate": 24000,
      "bitrate": 48000,
      "container": "none"
    }
  },
  "agent": {
    "language": "en",
    "context": {
      "messages": [
        {
          "type": "History",
          "role": "user",
          "content": "What's my order status?"
        },
        {
          "type": "History",
          "function_calls": [
            {
              "id": "fc_12345678-90ab-cdef-1234-567890abcdef",
              "name": "check_order_status",
              "client_side": true,
              "arguments": "{\"order_id\": \"ORD-123456\"}",
              "response": "Order #123456 status: Shipped - Expected delivery date: 2024-03-15"
            }
          ]
        },
        {
          "type": "History",
          "role": "assistant",
          "content": "Your order #123456 has been shipped and is expected to arrive on March 15th, 2024."
        }
      ]
    },
    "listen": {
      "provider": {
        "type": "deepgram",
        "model": "nova-3",
        "keyterms": ["hello", "goodbye"]
      }
    },
    "think": {
      "provider": {
        "type": "open_ai",
        "model": "gpt-4o-mini",
        "temperature": 0.7
      },
      "endpoint": { // Optional for non-Deepgram LLM providers. When present, must include url field and headers object
        "url": "https://api.example.com/llm",
        "headers": {
          "authorization": "Bearer {{token}}"
        }
      },
      "prompt": "You are a helpful AI assistant focused on customer service.",
      "context_length": 15000, // Optional; only configurable when a custom think endpoint is used. Use "max" for maximum context length
      "functions": [
        {
          "name": "check_order_status",
          "description": "Check the status of a customer order",
          "parameters": {
            "type": "object",
            "properties": {
              "order_id": {
                "type": "string",
                "description": "The order ID to check"
              }
            },
            "required": ["order_id"]
          },
          "endpoint": { // If not provided, function is called client-side
            "url": "https://api.example.com/orders/status",
            "method": "post",
            "headers": {
              "authorization": "Bearer {{token}}"
            }
          }
        }
      ]
    },
    "speak": {
      "provider": {
        "type": "deepgram",
        "model": "aura-2-thalia-en", // For Deepgram or OpenAI providers
        "model_id": "1234567890", // For Eleven Labs or Cartesia providers
        "voice": {
          "mode": "id", // For Cartesia provider only
          "id": "a167e0f3-df7e-4d52-a9c3-f949145efdab" // For Cartesia provider only
        },
        "language": "en", // For Cartesia provider only
        "language_code": "en-US", // For Eleven Labs provider only
        "engine": "standard", // For AWS Polly provider only
        "credentials": { // For AWS Polly provider only
          "type": "IAM", // Must be "IAM" or "STS"
          "region": "us-east-1",
          "access_key_id": "{{access_key_id}}",
          "secret_access_key": "{{secret_access_key}}",
          "session_token": "{{session_token}}" // Required for STS only
        }
      },
      "endpoint": { // Optional if TTS provider is Deepgram. When present, must include url field and headers object
        "url": "https://api.example.com/tts",
        "headers": {
          "authorization": "Bearer {{token}}"
        }
      }
    },
    "greeting": "Hello! How can I help you today?"
  }
}
```