To configure your Voice Agent, you’ll need to send a Settings message immediately after connection. This message configures the agent’s behavior, input/output audio formats, and various provider settings.
For more information on the Settings message, see the Voice Agent API Reference.
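As a rough illustration of sending Settings as the first message after connecting, the sketch below assembles a small payload from field paths named in this guide and hands it to a sender. The `"type": "Settings"` key, the `build_settings` and `send_settings` helpers, and the `send` callable (standing in for whatever WebSocket client you use) are assumptions made for illustration, not part of the documented API surface described here.

```python
import json


def build_settings(listen_model="flux-general-en", listen_language=None,
                   prompt="You are a helpful assistant."):
    """Assemble a small Settings payload (hypothetical helper).

    Only agent.listen.provider.model, agent.listen.provider.language and
    agent.think.prompt are taken from this guide; everything else a real
    session needs (audio formats, speak provider, ...) is omitted.
    """
    listen_provider = {"model": listen_model}
    if listen_language is not None:
        listen_provider["language"] = listen_language
    return {
        "type": "Settings",  # assumption: message type discriminator
        "agent": {
            "listen": {"provider": listen_provider},
            "think": {"prompt": prompt},
        },
    }


def send_settings(send, settings):
    """Serialize the payload and pass it to `send`.

    `send` is a hypothetical callable, e.g. the send method of a WebSocket
    client, invoked right after the connection opens.
    """
    send(json.dumps(settings))


if __name__ == "__main__":
    outbox = []  # stand-in for a live connection
    send_settings(outbox.append, build_settings())
    print(outbox[0])
```

Keeping payload construction separate from the transport makes it easy to inspect or log the exact JSON before it goes over the wire.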
Provider-specific guidance lives on the model pages, not here. For LLM model selection, fallback behavior, and managed-vs-BYO provider rules, see LLM Models. For TTS provider parameters and codeswitching voices, see TTS Models. For audio encoding choices, see Media Inputs & Outputs.
The Settings message is a JSON object that contains the following fields:
agent.listen.provider.language and agent.speak.provider.language
- Set your language in agent.listen.provider.language for the best recognition accuracy.
- Use multi in agent.listen.provider.language for flexible language support (Nova models), or use flux-general-multi with language_hints for Flux-based multilingual support.
- multi is only supported in agent.speak.provider.language with Eleven Labs TTS, OpenAI TTS, or Cartesia TTS.

agent.listen.provider.model
- For Flux, use flux-general-en, or flux-general-multi for multilingual support.
- When using flux-general-multi, set agent.listen.provider.language_hints to an array of BCP-47 language codes to bias toward expected languages. See Flux Multilingual & Language Prompting.

agent.think.provider.reasoning_mode
- The reasoning_mode parameter maps to OpenAI’s reasoning_effort parameter.
- Accepted values are low, medium, or high. Higher values allow the model to spend more tokens reasoning before responding, which can improve accuracy on complex tasks.
- […] (for example, gpt-5, gpt-5-mini).

agent.think.context_length
- A value of max sets the context length to the maximum allowed based on the LLM provider you use. If the total context exceeds the model’s maximum, truncation is handled by the LLM provider.
- […] agent.think.prompt.

agent.context
- The agent.context object allows you to provide conversation history to the agent when starting a new session.
  This is useful for continuing conversations or providing background context.
- The agent.context.messages array contains conversation history entries, which can be either conversational messages or function calls.
- Conversational message format: {"type": "History", "role": "user" | "assistant", "content": "message text"}
- Function call format: {"type": "History", "function_calls": [{"id": "unique_id", "name": "function_name", "client_side": true/false, "arguments": "json_string", "response": "response_text"}]}
- […] settings.flags.history to false in the Settings message.

agent.listen.provider.eot_threshold and agent.listen.provider.eager_eot_threshold
These parameters control Flux end-of-turn detection and are only available when using Flux models with the v2 API (agent.listen.provider.version set to v2).
- eot_threshold sets the confidence required to trigger an EndOfTurn event. Higher values reduce false positives but increase latency. Defaults to 0.7.
- eager_eot_threshold enables eager end-of-turn detection, triggering EagerEndOfTurn events before the user fully finishes speaking. This reduces end-to-end latency but increases LLM calls. Must be less than or equal to eot_threshold.

agent.listen.provider.smart_format
- The agent.listen.provider.smart_format setting is only available for Deepgram providers.
- When true, Deepgram applies smart formatting to improve transcript readability.
- When false, Deepgram does not apply smart formatting.
- Default: false.
- […] smart_format.

Below is an in-depth example showing all the available fields for Settings with all the optional fields for individual provider-specific settings.
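As a smaller companion to the full example, the conversation-history shapes and the reasoning_mode and end-of-turn rules described above can be sketched as client-side helpers. This is only an illustration: `history_message`, `history_function_call`, and `check_settings` are hypothetical names, the server remains the authority on validation, and comparing eager_eot_threshold against the 0.7 default when eot_threshold is unset is an assumption.

```python
import json

REASONING_MODES = {"low", "medium", "high"}
DEFAULT_EOT_THRESHOLD = 0.7  # default stated in this guide


def history_message(role, content):
    """Build a conversational History entry for agent.context.messages."""
    if role not in ("user", "assistant"):
        raise ValueError("role must be 'user' or 'assistant'")
    return {"type": "History", "role": role, "content": content}


def history_function_call(call_id, name, client_side, arguments, response):
    """Build a function-call History entry; arguments are sent as a JSON string."""
    return {
        "type": "History",
        "function_calls": [{
            "id": call_id,
            "name": name,
            "client_side": client_side,
            "arguments": json.dumps(arguments),
            "response": response,
        }],
    }


def check_settings(settings):
    """Return a list of rule violations found in a Settings dict (empty if none)."""
    problems = []
    agent = settings.get("agent", {})
    listen = agent.get("listen", {}).get("provider", {})
    think = agent.get("think", {}).get("provider", {})

    mode = think.get("reasoning_mode")
    if mode is not None and mode not in REASONING_MODES:
        problems.append("reasoning_mode must be low, medium, or high")

    eot = listen.get("eot_threshold")
    eager = listen.get("eager_eot_threshold")
    if (eot is not None or eager is not None) and listen.get("version") != "v2":
        problems.append("end-of-turn thresholds require agent.listen.provider.version set to v2")

    if eager is not None:
        # Assumption: when eot_threshold is unset, compare against its default.
        effective_eot = DEFAULT_EOT_THRESHOLD if eot is None else eot
        if eager > effective_eot:
            problems.append("eager_eot_threshold must be <= eot_threshold")

    return problems


if __name__ == "__main__":
    settings = {
        "agent": {
            "listen": {"provider": {"version": "v2", "eot_threshold": 0.8,
                                    "eager_eot_threshold": 0.5}},
            "think": {"provider": {"reasoning_mode": "low"}},
            "context": {"messages": [history_message("user", "Hello again")]},
        }
    }
    print(check_settings(settings))
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a single pass before opening a connection.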