Voice Agent TTS Controls
Apply Aura-2 speed, pronunciation, and pacing controls inside Voice Agent sessions.
If you’re building with the Voice Agent API, Aura-2’s TTS voice controls — speed, pronunciation, and pacing — work inside your agent pipeline. Where you apply each control depends on what it does and what context the decision needs.
Speed is a session-level setting on the agent’s speak provider. Configure it when you initialize the agent, and every response from the agent uses that rate.
The speed parameter accepts a float between 0.7 and 1.5 (default 1.0). For Spanish voices the recommended range is 0.9–1.5; values below 0.9 may introduce disfluencies. See TTS Models for the full parameter reference and TTS Voice Controls for the underlying behavior.
A consistent session-level speed is useful for agents that serve accessibility-sensitive audiences, or any conversation where pacing should stay steady throughout the call.
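As a rough illustration of the session-level speed setting, the Python sketch below builds a Settings-style message with a speed value on the speak provider and checks it against the ranges stated above. The field layout (agent.speak.provider.speed), the build_settings helper, and the example model name are assumptions for illustration, not the confirmed schema; check the Voice Agent settings reference before relying on them.

```python
import json
import warnings

# Ranges taken from the text above: 0.7 to 1.5 (default 1.0);
# for Spanish voices the recommended range is 0.9 to 1.5.
SPEED_MIN, SPEED_MAX, SPEED_DEFAULT = 0.7, 1.5, 1.0
SPANISH_RECOMMENDED_MIN = 0.9


def build_settings(model: str, speed: float = SPEED_DEFAULT, spanish: bool = False) -> dict:
    """Build a Settings-style dict with a session-level TTS speed.

    Hypothetical helper: the key names below are an assumed layout,
    not a verified copy of the Voice Agent schema.
    """
    if not SPEED_MIN <= speed <= SPEED_MAX:
        raise ValueError(f"speed must be between {SPEED_MIN} and {SPEED_MAX}, got {speed}")
    if spanish and speed < SPANISH_RECOMMENDED_MIN:
        # Still accepted, but below the recommended range for Spanish voices.
        warnings.warn("speed below 0.9 may introduce disfluencies for Spanish voices")
    return {
        "type": "Settings",
        "agent": {
            "speak": {
                "provider": {
                    "type": "deepgram",
                    "model": model,
                    "speed": speed,  # applies to every response in the session
                }
            }
        },
    }


if __name__ == "__main__":
    # Example model name is an assumption; substitute the voice you use.
    print(json.dumps(build_settings("aura-2-thalia-en", speed=1.2), indent=2))
```

Validating the value before the session starts keeps a bad configuration from surfacing mid-call, since the same rate applies to every response.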
The speed parameter is also supported for Cartesia TTS in Voice Agent sessions. See Deepgram-managed Cartesia TTS models for the accepted values.
Pronunciation overrides and pause cues are most effective when the LLM produces them — not when they’re added downstream — because both depend on the meaning of the surrounding text.
...) produce longer ones, and digits separated by periods slow down readback for phone numbers, account numbers, and IDs. Asking the LLM to produce well-punctuated output is more reliable than post-processing a flat string. See Text to Speech Prompting for the full set of pacing techniques Aura-2 supports.
Put your pronunciation map and pacing rules in the system prompt, and the Voice Agent passes the LLM’s output through to Aura-2 unchanged.
This keeps your pronunciation map and pacing rules in the LLM layer, not in a separate lexicon or orchestration config. To add a term, edit the prompt — no redeploy required.
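One way to keep the pronunciation map and pacing rules in the LLM layer is to assemble the system prompt from plain data, so adding a term is a one-line change. The Python sketch below is a minimal example under stated assumptions: build_system_prompt is a hypothetical helper, the prompt wording is illustrative, and the override value is a placeholder, because the actual override syntax is defined in TTS Voice Controls and is not reproduced here.

```python
def build_system_prompt(base: str, pronunciations: dict[str, str], pacing_rules: list[str]) -> str:
    """Append a pronunciation map and pacing rules to a base system prompt.

    Hypothetical helper. Override strings are treated as opaque text that
    the caller has already written in the syntax from TTS Voice Controls.
    """
    lines = [base.strip()]
    if pronunciations:
        lines.append("")
        lines.append("Pronunciation: whenever you say one of these terms, write it exactly as shown.")
        for term, override in pronunciations.items():
            lines.append(f"- {term} -> {override}")
    if pacing_rules:
        lines.append("")
        lines.append("Pacing:")
        lines.extend(f"- {rule}" for rule in pacing_rules)
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_system_prompt(
        "You are a phone support agent.",
        # Placeholder value: replace with a real override string.
        {"Nguyen": "<override string for Nguyen>"},
        [
            "Write phone numbers, account numbers, and IDs as digits separated by periods.",
            "Use an ellipsis (...) where a longer pause is needed.",
        ],
    )
    print(prompt)
```

The resulting string would go wherever your agent configuration takes its system prompt; the LLM then emits the overrides and punctuation itself, and that text reaches Aura-2 unchanged.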
For the full pronunciation override syntax, validation rules, and IPA sourcing tips, see TTS Voice Controls. For pause and pacing techniques, see Text to Speech Prompting and Formatting Text for Aura-2.