Aura-2 Spanish Language Support
Deepgram is excited to announce Spanish voices for Aura-2, our text-to-speech (TTS) model, giving real-time voice applications a new, high-fidelity Spanish option for enterprise use.
Voice Quality & Features
- Launching with 10 distinct Spanish voices, each tuned for specific business contexts.
- Spanish voices are optimized for pacing, intonation, and emphasis suitable for professional interactions—from customer support to healthcare use cases.
- Superior pronunciation accuracy for domain-specific content:
  - Currency (e.g., “€”, “pesos”)
  - Dates/timestamps in various formats
  - Acronyms and alphanumeric IDs
  - Email addresses, passwords, and URLs
  - Spanish-language proper nouns
Availability
- Aura-2 Spanish (es) is available now via REST and WebSocket APIs
- Currently available for use through our hosted offering, with self-hosted support coming soon
For detailed information, please refer to our Developer Documentation.
Aura-2 TTS Now Available!
Deepgram is proud to announce the release of Aura-2, our text-to-speech model purpose-built for real-time enterprise use cases.
Performance
- Sub-200ms time-to-first-byte (TTFB) latency for real-time conversational interactions
- 0.111x Real-Time Factor (RTF), synthesizing one second of audio in just over 100 milliseconds
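The two figures above are related by simple arithmetic: RTF is synthesis time divided by the duration of the audio produced, so an RTF of 0.111 means one second of audio takes roughly 111 milliseconds to synthesize. A minimal sketch of that calculation follows; the function name `synthesis_time_ms` is hypothetical and used only for illustration.

```python
def synthesis_time_ms(audio_seconds: float, rtf: float = 0.111) -> float:
    """Estimate synthesis time in milliseconds for a clip of the given length.

    RTF (Real-Time Factor) = synthesis time / audio duration, so
    synthesis time = audio duration * RTF.
    """
    return audio_seconds * rtf * 1000.0


# One second of audio at RTF 0.111 takes roughly 111 ms to synthesize.
one_second = synthesis_time_ms(1.0)

# A 10-second clip scales linearly under the same RTF.
ten_seconds = synthesis_time_ms(10.0)
```

This is an estimate of total synthesis time only; it is separate from time-to-first-byte, which measures how quickly the first audio arrives.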
Voice Quality & Features
- Enterprise-optimized voice catalog with 40+ distinct voices, each designed for specific business contexts
- Tuned for professional and transactional interactions with appropriate tone, pacing, and emphasis
- Superior pronunciation accuracy for domain-specific content:
  - Currency and numerals
  - Dates and timestamps in varied formats
  - Email addresses, passwords, and URLs
  - Complex addresses and location references
- Industry-leading voice clarity rated higher than competitors in customer service scenarios
Availability
- Aura-2 is available now via REST and WebSocket APIs
- Currently available for use through our hosted offering
For detailed information about Aura-2, please refer to our Developer Documentation.
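As a rough illustration of the REST option, the sketch below builds (but does not send) a text-to-speech request using only the Python standard library. The endpoint URL, the `model` query parameter, the `Token` authorization scheme, and the JSON body shape are assumptions based on our understanding of the API and should be checked against the Developer Documentation; `build_speak_request` is a hypothetical helper name.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; confirm against the Developer Documentation.
SPEAK_URL = "https://api.deepgram.com/v1/speak"


def build_speak_request(text: str, model: str, api_key: str) -> urllib.request.Request:
    """Build a POST request asking the TTS API to synthesize `text`.

    `model` is the voice/model identifier taken from the voice documentation.
    The request is only constructed here, not sent.
    """
    query = urllib.parse.urlencode({"model": model})
    return urllib.request.Request(
        url=f"{SPEAK_URL}?{query}",
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={
            "Authorization": f"Token {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To actually send it, pass the returned object to `urllib.request.urlopen` and read the response body, which should contain the synthesized audio bytes if the assumptions above hold.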
Aura TTS WebSocket Support
Deepgram is excited to announce the launch of our WebSocket API for our text-to-speech (TTS) product, Aura. We’ve listened to the pain points users face when building conversational AI agents with LLMs and TTS.
With the WebSocket API, you can minimize latency by sending text tokens from any LLM as soon as they’re generated, reducing delays and creating a smoother experience. It also makes handling user interruptions easier and provides a simple way to estimate the number of concurrent conversations you can support, with one WebSocket per conversation.
To learn more, check out our getting started guide: https://developers.deepgram.com/docs/streaming-text-to-speech
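The token-forwarding pattern described above can be sketched as follows. This is a minimal, network-free illustration: it only produces the JSON text frames a client might send over an open WebSocket, one per LLM token, followed by a flush. The message shapes (`Speak`, `Flush`, `Clear`) are assumptions based on our reading of the streaming guide and should be verified there; `speak_frames` and `clear_frame` are hypothetical helper names.

```python
import json
from typing import Iterable, Iterator


def speak_frames(tokens: Iterable[str]) -> Iterator[str]:
    """Yield one JSON text frame per LLM token, then a final flush frame.

    Each token is forwarded as soon as it arrives rather than waiting for
    the full LLM response, which is what keeps latency low.
    """
    for token in tokens:
        if token:  # skip empty tokens
            yield json.dumps({"type": "Speak", "text": token})  # assumed message shape
    # Ask the server to synthesize any text it is still buffering.
    yield json.dumps({"type": "Flush"})  # assumed message shape


def clear_frame() -> str:
    """Frame to send when the user interrupts, discarding queued text."""
    return json.dumps({"type": "Clear"})  # assumed message shape


# Example: frames for a short streamed reply.
frames = list(speak_frames(["Hola", ", ", "mundo", "."]))
```

In a real client, each frame would be sent over the conversation's own WebSocket connection as it is produced, and `clear_frame()` would be sent when the user starts speaking over the agent.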
Aura-1 TTS Ringing Background Noise Improvements
A new version of the Deepgram Aura text-to-speech model resolves user-reported issues with faint ringing background noise. The fix applies to all voices.
Please check out our voice selection page to see the voices that we offer.
Aura-1 TTS Improvements
Deepgram Aura text-to-speech has resolved user-reported issues, including:
Quotation Mark Handling
- You can now include quotation marks (“ ”) in the text input, and the quoted words are pronounced correctly.
Empty Audio Outputs
- Previously, our TTS API on rare occasions produced empty audio output despite non-empty text input. This issue has been resolved.
To learn more about how to get started with text-to-speech, check out our guide in the docs.
Aura-1 TTS Improved Volume Output
We’ve improved the consistency of volume output levels across all voices of our Aura text-to-speech model. Users will now hear:
- Consistent volume levels across all of our model voices
- Consistent volume levels across clips generated from a single model voice
To learn more about Aura text-to-speech, check out our voices and getting started guide.