Settings | Deepgram's Docs

The Settings message is a JSON command that serves as an initialization step, setting up both the behavior of the voice agent and the audio transmission formats.

Purpose

The Settings message is an initialization command that establishes both the behavior of the voice agent and the audio transmission formats before any voice data is exchanged. The client should send a Settings message immediately after opening the WebSocket connection and before sending any audio.
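The ordering rule above can be enforced on the client side. The following is a minimal sketch, not part of any Deepgram SDK: `SettingsFirstSender` and its methods are hypothetical names, and `send` stands in for whatever function your WebSocket library uses to write a frame. It also assumes that JSON commands are sent as text frames and audio as binary frames.

```python
import json
from typing import Callable, Union

Frame = Union[str, bytes]


class SettingsFirstSender:
    """Hypothetical wrapper that refuses audio until Settings has been sent."""

    def __init__(self, send: Callable[[Frame], None]) -> None:
        # `send` is any callable that writes one frame to the open WebSocket.
        self._send = send
        self._settings_sent = False

    def send_settings(self, settings: dict) -> None:
        if settings.get("type") != "Settings":
            raise ValueError('Settings message must have "type": "Settings"')
        # Assumption: JSON commands are sent as text frames.
        self._send(json.dumps(settings))
        self._settings_sent = True

    def send_audio(self, chunk: bytes) -> None:
        if not self._settings_sent:
            raise RuntimeError("send the Settings message before any audio")
        # Assumption: audio is sent as binary frames.
        self._send(chunk)
```

Passing the frame-writing function in as a callable keeps the guard independent of any particular WebSocket client, so it can be exercised without a network connection.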

For a detailed explanation of all the options available for the Settings message, see our documentation on how to Configure the Voice Agent.

Example Payloads

This example shows a basic Settings message that is sufficient to establish a connection. Send the following JSON message to the server:

JSON

{
  "type": "Settings",
  "tags": ["demo", "voice_agent"],
  "audio": {
    "input": {
      "encoding": "linear16",
      "sample_rate": 24000
    },
    "output": {
      "encoding": "linear16",
      "sample_rate": 24000,
      "container": "none"
      // ... additional output options: bitrate
    }
  },
  "agent": {
    "language": "en",
    "listen": {
      "provider": {
        "type": "deepgram",
        "model": "nova-3",
        "smart_format": false // Deepgram providers only
        // ... additional provider options: keyterms (nova-3 'en' only)
      }
    },
    "think": {
      "provider": {
        "type": "open_ai",
        "model": "gpt-4o-mini",
        "temperature": 0.7
        // Optional if LLM provider is open_ai or anthropic. Required for 3rd party LLM providers such as google and groq
      }
      // ... additional think options: prompt, context_length, functions, endpoint
    },
    "speak": {
      "provider": {
        "type": "deepgram",
        "model": "aura-2-thalia-en"
        // Examples for other providers:
        // "type": "open_ai", "model": "tts-1", "voice": "alloy"
        // "type": "eleven_labs", "model_id": "eleven_monolingual_v1", "language_code": "en-US"
        // "type": "cartesia", "model_id": "sonic-2", "voice": {"mode": "id", "id": "voice-id"}, "language": "en"
        // "type": "aws_polly", "voice": "Matthew", "language_code": "en-US", "engine": "standard", "credentials": {...}
      }
      // ... additional speak options: endpoint (required for non-deepgram providers)
    }
    // ... additional agent options: context, greeting
  }
  // ... additional top-level options: experimental
}
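The `//` comments in the example are annotations for the reader and are not valid JSON, so they have to be dropped before the message is sent. One way to avoid hand-editing the text is to build the payload as a dictionary and serialize it. The sketch below reproduces only the fields shown in the example; `build_settings` and its parameters are hypothetical names chosen for illustration.

```python
import json


def build_settings(
    listen_model: str = "nova-3",
    think_model: str = "gpt-4o-mini",
    speak_model: str = "aura-2-thalia-en",
    sample_rate: int = 24000,
) -> dict:
    """Return the example Settings payload as a plain dict (no comments)."""
    return {
        "type": "Settings",
        "tags": ["demo", "voice_agent"],
        "audio": {
            "input": {"encoding": "linear16", "sample_rate": sample_rate},
            "output": {
                "encoding": "linear16",
                "sample_rate": sample_rate,
                "container": "none",
            },
        },
        "agent": {
            "language": "en",
            "listen": {
                "provider": {
                    "type": "deepgram",
                    "model": listen_model,
                    "smart_format": False,  # Deepgram providers only
                }
            },
            "think": {
                "provider": {
                    "type": "open_ai",
                    "model": think_model,
                    "temperature": 0.7,
                }
            },
            "speak": {
                "provider": {"type": "deepgram", "model": speak_model}
            },
        },
    }


# Serialize to the JSON text that would be sent over the WebSocket.
payload = json.dumps(build_settings())
```

Because `json.dumps` only emits strict JSON, the resulting string contains no comments or trailing commas.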

Upon receiving the Settings message, the server will process all remaining audio data and return the following SettingsApplied message.

JSON

{
  "type": "SettingsApplied"
}
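A client will typically want to confirm that SettingsApplied has arrived before it starts streaming audio. The following is a minimal sketch under stated assumptions: `wait_for_settings_applied` is a hypothetical helper, `frames` is any iterable of frames already received from the server, and binary frames are assumed to carry audio rather than JSON.

```python
import json
from typing import Iterable, Union


def wait_for_settings_applied(frames: Iterable[Union[str, bytes]]) -> bool:
    """Consume frames until a SettingsApplied message is seen.

    Returns True if the message arrives, False if the stream ends first.
    """
    for frame in frames:
        if isinstance(frame, bytes):
            # Assumption: binary frames are audio, not JSON messages.
            continue
        try:
            message = json.loads(frame)
        except json.JSONDecodeError:
            # Ignore text frames that are not valid JSON.
            continue
        if isinstance(message, dict) and message.get("type") == "SettingsApplied":
            return True
    return False
```

In a real client the iterable would be the receive loop of the WebSocket library in use, and a timeout around the call would guard against a server that never replies.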