TTS Models

An overview of Text-to-Speech providers and models you can use with the Voice Agent API.

The Voice Agent API uses Deepgram Text-to-Speech by default. To use another provider’s TTS model with your Agent, apply the settings described below.

Set the Text-to-Speech model in the Settings message for your Voice Agent. See the Settings message docs for more information.

Deepgram TTS models

For a complete list of Deepgram TTS models, see TTS Voice Selection.

| Parameter | Type | Description |
| --- | --- | --- |
| agent.speak.provider.type | String | Must be deepgram |
| agent.speak.provider.model | String | The TTS model to use |

Example

JSON
{
  "agent": {
    "speak": {
      "provider": {
        "type": "deepgram",
        "model": "aura-2-thalia-en"
      }
    }
  }
}
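As a rough sketch of how a speak configuration like this might be placed into a Settings message, the Python below wraps it in an envelope and serializes it. The `build_settings` helper is hypothetical, and the `"type": "Settings"` field is an assumption about the message envelope that this page does not confirm; check the Settings message docs for the full schema.

```python
import json


def build_settings(speak: dict) -> str:
    """Wrap a speak configuration in a Settings message and return it as JSON."""
    message = {
        "type": "Settings",  # assumption: envelope field, not stated on this page
        "agent": {"speak": speak},
    }
    return json.dumps(message)


# Values taken from the Deepgram example above.
deepgram_speak = {"provider": {"type": "deepgram", "model": "aura-2-thalia-en"}}
settings_json = build_settings(deepgram_speak)
```

Building the message from a dictionary rather than a hand-written string keeps the nesting under `agent.speak` consistent with the parameter paths in the table.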

Third-party TTS models

To use a third-party TTS voice, specify the TTS provider and its required parameters.
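One way to catch a missing required parameter before sending the configuration is a small pre-flight check. The sketch below is illustrative only: `check_speak` is a hypothetical helper, and it covers just the `endpoint` requirement that the OpenAI, ElevenLabs, and Cartesia parameter tables in this section mark as required. It is not a full schema validator.

```python
# Providers whose parameter tables in this section mark `endpoint` as required.
ENDPOINT_REQUIRED = {"open_ai", "eleven_labs", "cartesia"}


def check_speak(speak: dict) -> list:
    """Return a list of problems found in a speak config (empty if none)."""
    problems = []
    provider_type = speak.get("provider", {}).get("type")
    if provider_type is None:
        problems.append("provider.type is missing")
    if provider_type in ENDPOINT_REQUIRED:
        endpoint = speak.get("endpoint")
        if not isinstance(endpoint, dict):
            problems.append("endpoint is required for " + provider_type)
        else:
            # The tables require both url and headers inside endpoint.
            for key in ("url", "headers"):
                if key not in endpoint:
                    problems.append("endpoint." + key + " is missing")
    return problems
```

For example, `check_speak({"provider": {"type": "open_ai"}})` reports that `endpoint` is required, while a Deepgram configuration without an endpoint passes.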

OpenAI

For OpenAI, you can refer to this article on how to find your voice ID.

| Parameter | Type | Description |
| --- | --- | --- |
| agent.speak.provider.type | String | Must be open_ai |
| agent.speak.provider.model | String | The TTS model to use |
| agent.speak.provider.voice | String | The voice to use |
| agent.speak.endpoint | Object | Required and must include url and headers |
| agent.speak.endpoint.url | String | Your OpenAI API endpoint URL |
| agent.speak.endpoint.headers | Object | Required headers for authentication |

Example

{
  "agent": {
    "speak": {
      "provider": {
        "type": "open_ai",
        "model": "tts-1",
        "voice": "alloy"
      },
      "endpoint": {
        "url": "https://api.openai.com/v1/audio/speech",
        "headers": {
          "authorization": "Bearer {{OPENAI_API_KEY}}"
        }
      }
    }
  }
}
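The `{{OPENAI_API_KEY}}` placeholder has to be replaced with a real key before the configuration is sent. A minimal sketch of one way to do that is shown below, assuming the key is stored in an `OPENAI_API_KEY` environment variable; the `openai_speak` helper is hypothetical, and its default values are copied from the example above.

```python
import os


def openai_speak(api_key: str, model: str = "tts-1", voice: str = "alloy") -> dict:
    """Build the agent.speak block for OpenAI TTS with the key filled in."""
    return {
        "provider": {"type": "open_ai", "model": model, "voice": voice},
        "endpoint": {
            "url": "https://api.openai.com/v1/audio/speech",
            "headers": {"authorization": f"Bearer {api_key}"},
        },
    }


# Assumption: the key lives in the OPENAI_API_KEY environment variable.
# The fallback string is a placeholder, not a working key.
speak = openai_speak(os.environ.get("OPENAI_API_KEY", "sk-placeholder"))
```

Reading the key at run time keeps it out of source control, which matters because the header value is sent as part of the Settings message.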

ElevenLabs

For ElevenLabs, you can refer to this article on how to find your Voice ID, or use their API to retrieve it. We support any of ElevenLabs’ Turbo 2.5 voices to ensure low-latency interactions.

| Parameter | Type | Description |
| --- | --- | --- |
| agent.speak.provider.type | String | Must be eleven_labs |
| agent.speak.provider.model_id | String | The model ID to use |
| agent.speak.provider.language_code | String | Optional language code |
| agent.speak.endpoint | Object | Required and must include url and headers |
| agent.speak.endpoint.url | String | Your ElevenLabs API endpoint URL |
| agent.speak.endpoint.headers | Object | Required headers for authentication |

Example

{
  "agent": {
    "speak": {
      "provider": {
        "type": "eleven_labs",
        "model_id": "eleven_turbo_v2_5",
        "language_code": "en-US"
      },
      "endpoint": {
        "url": "https://api.elevenlabs.io/v1/text-to-speech",
        "headers": {
          "xi-api-key": "{{ELEVEN_LABS_API_KEY}}"
        }
      }
    }
  }
}

Cartesia

For Cartesia, you can use their API to retrieve a voice ID.

| Parameter | Type | Description |
| --- | --- | --- |
| agent.speak.provider.type | String | Must be cartesia |
| agent.speak.provider.model_id | String | The model ID to use |
| agent.speak.provider.voice | Object | Cartesia voice configuration |
| agent.speak.provider.voice.mode | String | The voice mode to use |
| agent.speak.provider.voice.id | String | The voice ID to use |
| agent.speak.provider.language | String | Optional language setting |
| agent.speak.endpoint | Object | Required and must include url and headers |
| agent.speak.endpoint.url | String | Your Cartesia API endpoint URL |
| agent.speak.endpoint.headers | Object | Required headers for authentication |

Example

{
  "agent": {
    "speak": {
      "provider": {
        "type": "cartesia",
        "model_id": "cartesia-v1",
        "voice": {
          "mode": "premium",
          "id": "cartesia-voice-1"
        },
        "language": "Language.EN"
      },
      "endpoint": {
        "url": "https://api.cartesia.ai/v1/tts",
        "headers": {
          "authorization": "Bearer {{CARTESIA_API_KEY}}"
        }
      }
    }
  }
}

AWS Polly

For AWS Polly, you can refer to this article for a list of available voices.

If no engine is specified, AWS Polly defaults to Standard. If the chosen voice doesn’t support Standard, you’ll get an error like: “Standard engine not supported for {voice}.” In that case, set agent.speak.provider.engine explicitly to an engine that the voice supports.
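To avoid relying on the Standard default, the engine can be chosen up front from the engines a voice is known to support. The sketch below is a hypothetical helper, not part of any SDK: `supported_engines` is assumed to come from AWS's published voice list, and the `"neural"` value in the usage note assumes the Voice Agent API accepts Polly engine names in the same lowercase form as `"standard"`.

```python
def pick_engine(supported_engines: list, preferred: str = "standard") -> str:
    """Return the preferred engine if the voice supports it, else the first supported one."""
    if not supported_engines:
        raise ValueError("no supported engines listed for this voice")
    if preferred in supported_engines:
        return preferred
    # Fall back to the first engine listed for the voice.
    return supported_engines[0]
```

With this helper, `pick_engine(["standard", "neural"])` keeps the Standard engine, while `pick_engine(["neural"])` falls back to the only engine the voice lists. The result would then be passed as `agent.speak.provider.engine`.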

| Parameter | Type | Description |
| --- | --- | --- |
| agent.speak.provider.type | String | Must be aws_polly |
| agent.speak.provider.language_code | String | The language code to use |
| agent.speak.provider.voice | String | The voice to use |
| agent.speak.provider.engine | String | The engine to use |
| agent.speak.provider.credentials | Object | The credentials to use |

STS Example

{
  "agent": {
    "speak": {
      "provider": {
        "type": "aws_polly",
        "language_code": "en-US",
        "voice": "Matthew",
        "engine": "standard",
        "credentials": {
          "type": "STS",
          "region": "us-west-2",
          "access_key_id": "{{AWS_ACCESS_KEY_ID}}",
          "secret_access_key": "{{AWS_SECRET_ACCESS_KEY}}",
          "session_token": "{{AWS_SESSION_TOKEN}}"
        }
      },
      "endpoint": {
        "url": "https://polly.us-west-2.amazonaws.com/v1/speech"
      }
    }
  }
}

IAM Example

{
  "agent": {
    "speak": {
      "provider": {
        "type": "aws_polly",
        "voice": "Joanna",
        "language_code": "en-US",
        "engine": "standard",
        "credentials": {
          "type": "IAM",
          "region": "us-east-2",
          "access_key_id": "{{AWS_ACCESS_KEY_ID}}",
          "secret_access_key": "{{AWS_SECRET_ACCESS_KEY}}"
        }
      },
      "endpoint": {
        "url": "https://polly.us-east-2.amazonaws.com/v1/speech"
      }
    }
  }
}
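The two examples differ only in the credentials block: the STS form carries a `session_token`, and the IAM form does not. A small sketch of building either form from the same inputs follows. The `polly_credentials` helper is hypothetical, and inferring the credential type from the presence of a session token is an assumption drawn from the two examples, not a documented rule.

```python
from typing import Optional


def polly_credentials(
    region: str,
    access_key_id: str,
    secret_access_key: str,
    session_token: Optional[str] = None,
) -> dict:
    """Build agent.speak.provider.credentials in the STS or IAM shape shown above."""
    credentials = {
        # Assumption: a session token implies temporary (STS) credentials.
        "type": "STS" if session_token else "IAM",
        "region": region,
        "access_key_id": access_key_id,
        "secret_access_key": secret_access_key,
    }
    if session_token:
        credentials["session_token"] = session_token
    return credentials
```

Calling it without a session token yields the IAM shape; passing one yields the STS shape with the extra `session_token` field.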
