TTS Models
An overview of Text-to-Speech providers and models you can use with the Voice Agent API.
By default, the Voice Agent API uses Deepgram Text-to-Speech. You can also use Deepgram’s native Cartesia support, or use another provider’s TTS model with your agent by applying the settings described on this page.
You can set your Text-to-Speech model in the Settings Message for your Voice Agent. See the docs for more information.
Deepgram TTS models
For a complete list of Deepgram TTS models, see TTS Voice Selection.
Example
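As a minimal sketch, the snippet below builds a Settings message that selects a Deepgram voice. The field layout (`agent.speak.provider` with `type` and `model`) and the model name `aura-2-thalia-en` are assumptions made for illustration; check both against the Settings message reference and TTS Voice Selection before relying on them.

```python
import json


def deepgram_speak_settings(model: str) -> dict:
    """Build a Settings message that selects a Deepgram TTS model.

    The key names used here are assumed, not taken from this page.
    """
    return {
        "type": "Settings",
        "agent": {
            "speak": {
                "provider": {"type": "deepgram", "model": model},
            },
        },
    }


if __name__ == "__main__":
    # Serialize the message as it would be sent over the agent connection.
    print(json.dumps(deepgram_speak_settings("aura-2-thalia-en"), indent=2))
```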
Deepgram-managed Cartesia TTS models
Deepgram also provides managed support for Cartesia TTS. For a complete list of Cartesia TTS models, visit Cartesia’s TTS Docs. Cartesia is included in the Standard pricing tier.
Example
BYO Third-Party TTS models
To use a third-party TTS voice, specify the TTS provider and its required parameters.
OpenAI
For OpenAI you can refer to this article on how to find your voice ID.
Example
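A hedged sketch of an OpenAI speak configuration follows. The provider type string `open_ai`, the `endpoint` object, and the lowercase `authorization` header key are assumptions about the Settings schema. The model `tts-1`, the voice `alloy`, and the speech endpoint URL are OpenAI values used purely as examples.

```python
def openai_speak_config(voice: str, api_key: str, model: str = "tts-1") -> dict:
    """Build a speak configuration that routes TTS to OpenAI.

    Key names are assumed; verify them against the Settings reference.
    """
    return {
        "provider": {"type": "open_ai", "model": model, "voice": voice},
        "endpoint": {
            "url": "https://api.openai.com/v1/audio/speech",
            # The caller supplies their own OpenAI key (BYO credentials).
            "headers": {"authorization": f"Bearer {api_key}"},
        },
    }


if __name__ == "__main__":
    print(openai_speak_config("alloy", "YOUR_OPENAI_API_KEY"))
```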
ElevenLabs
For ElevenLabs, you can refer to this article on how to find your voice ID, or use their API to retrieve it. See their TTS Docs for more information.
We support any of ElevenLabs’ Turbo 2.5 voices to ensure low-latency interactions.
Example
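The sketch below shows one way an ElevenLabs configuration might be assembled. The provider type `eleven_labs` and the `model_id` key are assumptions about the Settings schema. `eleven_turbo_v2_5` is the ElevenLabs model ID for Turbo 2.5, and `xi-api-key` is the header ElevenLabs uses for authentication. The endpoint URL is left as a parameter because it embeds your voice ID and should be copied from the ElevenLabs TTS docs.

```python
def elevenlabs_speak_config(endpoint_url: str, api_key: str,
                            model_id: str = "eleven_turbo_v2_5") -> dict:
    """Build a speak configuration that routes TTS to ElevenLabs.

    endpoint_url is expected to already contain the chosen voice ID.
    Key names are assumed; verify them against the Settings reference.
    """
    return {
        "provider": {"type": "eleven_labs", "model_id": model_id},
        "endpoint": {
            "url": endpoint_url,
            "headers": {"xi-api-key": api_key},
        },
    }
```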
Cartesia
For Cartesia you can use their API to retrieve a voice ID. See their TTS API Docs for more information.
Example
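A comparable sketch for bringing your own Cartesia account is shown below. The provider type `cartesia`, the `model_id` key, and the nested `voice` object with `mode` and `id` are assumptions modeled on Cartesia's own request format. The model ID and endpoint URL are parameters so that nothing here is presented as a confirmed value.

```python
def cartesia_speak_config(voice_id: str, model_id: str,
                          endpoint_url: str, api_key: str) -> dict:
    """Build a speak configuration that routes TTS to Cartesia.

    Key names are assumed; verify them against the Settings reference.
    """
    return {
        "provider": {
            "type": "cartesia",
            "model_id": model_id,
            # Select the voice by the ID retrieved from Cartesia's API.
            "voice": {"mode": "id", "id": voice_id},
        },
        "endpoint": {
            "url": endpoint_url,
            "headers": {"x-api-key": api_key},
        },
    }
```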
Amazon (AWS) Polly
For Amazon (AWS) Polly you can refer to this article for a list of available voices.
If no engine is specified, Amazon (AWS) Polly defaults to Standard. If the chosen voice doesn’t support Standard, you’ll get an error like: “Standard engine not supported for {voice}.” In that case, you must explicitly specify the correct engine.
STS Example
IAM Example
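Both credential styles can be sketched with one helper that always sets the engine explicitly, which avoids the Standard-engine error described above. The provider type `aws_polly`, the `credentials` object, and its `sts` and `iam` type strings are assumptions about the Settings schema. The endpoint follows Amazon Polly's regional `SynthesizeSpeech` URL pattern.

```python
from typing import Optional


def polly_speak_config(voice: str, engine: str, region: str,
                       access_key_id: str, secret_access_key: str,
                       session_token: Optional[str] = None) -> dict:
    """Build a speak configuration that routes TTS to Amazon Polly.

    Temporary (STS) credentials carry a session token; long-lived IAM
    credentials do not. Key names are assumed, not confirmed.
    """
    credentials = {
        "type": "sts" if session_token else "iam",
        "region": region,
        "access_key_id": access_key_id,
        "secret_access_key": secret_access_key,
    }
    if session_token:
        credentials["session_token"] = session_token
    return {
        "provider": {
            "type": "aws_polly",
            "voice": voice,
            # Set explicitly so a voice without Standard support still works.
            "engine": engine,
            "credentials": credentials,
        },
        "endpoint": {"url": f"https://polly.{region}.amazonaws.com/v1/speech"},
    }
```

Passing a `session_token` produces the STS shape; omitting it produces the IAM shape.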
Using Multiple TTS Providers
If you need to set a fallback TTS provider, you can define multiple TTS providers for your Voice Agent. The speak object supports both a single provider and an array of providers.
Example
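The single-versus-array behavior can be sketched as follows. The helper emits a single object when given one configuration and an array when given several. Treating array order as fallback priority is an assumption, as are the provider field names and example values in the usage lines.

```python
import json


def settings_with_speak(*speak_configs: dict) -> dict:
    """Build a Settings message with one or more TTS provider configs.

    One config is sent as a single object; several are sent as an array,
    assumed here to be tried in the order given.
    """
    if not speak_configs:
        raise ValueError("at least one speak configuration is required")
    speak = speak_configs[0] if len(speak_configs) == 1 else list(speak_configs)
    return {"type": "Settings", "agent": {"speak": speak}}


if __name__ == "__main__":
    primary = {"provider": {"type": "deepgram", "model": "aura-2-thalia-en"}}
    fallback = {"provider": {"type": "open_ai", "model": "tts-1", "voice": "alloy"}}
    print(json.dumps(settings_with_speak(primary, fallback), indent=2))
```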