Multiple LLM Provider Support
You can now specify multiple LLM providers for your Voice Agent, so the agent automatically falls back to another provider if one has issues. The think object accepts either a single provider or an array of providers, and providers are tried in the order you list them.
For more details, visit our Voice Agent Multiple LLM Models documentation.
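As a rough sketch of what this could look like, the Python below builds a settings payload whose think value is an array of two providers. The field names (type, agent, provider, model, prompt), the "Settings" message type, and the provider type strings open_ai and anthropic are assumptions for illustration and are not confirmed by this entry; check the Voice Agent documentation for the exact schema.

```python
import json

def build_think(providers, prompt):
    """Return one object for a single provider, or a list for several."""
    if not providers:
        raise ValueError("at least one provider is required")
    entries = [{"provider": dict(p), "prompt": prompt} for p in providers]
    return entries[0] if len(entries) == 1 else entries

# Listed in priority order: the first entry is used first, the rest are fallbacks.
think = build_think(
    [
        {"type": "open_ai", "model": "gpt-5.1"},              # assumed provider type string
        {"type": "anthropic", "model": "claude-sonnet-4-5"},  # assumed provider type string
    ],
    prompt="You are a helpful voice assistant.",
)
settings = {"type": "Settings", "agent": {"think": think}}  # assumed message shape
print(json.dumps(settings, indent=2))
```

Passing a single provider to build_think returns a plain object instead of a list, mirroring the two forms the think object supports.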
🤖 New LLM Models Support
We’ve added support for new LLM models in our Voice Agent API!
Available Models:
- OpenAI GPT 5.1 Chat (gpt-5.1-chat-latest)
- OpenAI GPT 5.1 (gpt-5.1)
- Anthropic Claude Sonnet 4.5 (claude-sonnet-4-5)
- Google Gemini 3 (gemini-3-pro-preview)
Implementation: Set your chosen model ID in your Voice Agent settings.
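A minimal sketch of selecting one of the new models follows. The model IDs come from the list above; the provider type strings (open_ai, anthropic, google) and the provider/model field names are assumptions, so treat this as an illustration rather than the exact schema.

```python
# Model IDs are from the list above; the provider "type" strings are assumed.
NEW_MODELS = {
    "gpt-5.1-chat-latest": "open_ai",
    "gpt-5.1": "open_ai",
    "claude-sonnet-4-5": "anthropic",
    "gemini-3-pro-preview": "google",
}

def think_for(model_id: str) -> dict:
    """Build a single-provider think object for one of the newly listed models."""
    if model_id not in NEW_MODELS:
        raise ValueError(f"not one of the newly listed models: {model_id}")
    return {"provider": {"type": NEW_MODELS[model_id], "model": model_id}}

print(think_for("gemini-3-pro-preview"))
```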
For complete information about supported LLMs including the new models, visit our Voice Agent LLM Models documentation.
Container Images Release
Deepgram Self-Hosted January 2026 Release (260115)
Container Images (release 260115)
- quay.io/deepgram/self-hosted-api:release-260115
  - Equivalent image to: quay.io/deepgram/self-hosted-api:1.176.0
- quay.io/deepgram/self-hosted-engine:release-260115
  - Equivalent image to: quay.io/deepgram/self-hosted-engine:3.107.0-1
  - Minimum required NVIDIA driver version: >=570.172.08
- quay.io/deepgram/self-hosted-license-proxy:release-260115
  - Equivalent image to: quay.io/deepgram/self-hosted-license-proxy:1.9.2
- quay.io/deepgram/self-hosted-billing:release-260115
  - Equivalent image to: quay.io/deepgram/self-hosted-billing:1.12.1
January 2026 Self-Hosted Release: Update Recommendation
In Deepgram’s January 2026 self-hosted release (release-260115), we added new functionality to improve TTS response times from our API and Engine containers.
Because of this change, the January 2026 self-hosted release is not backwards-compatible with previous releases when used to serve TTS traffic: it changes how the API and Engine containers communicate with each other. To avoid downtime in your self-hosted deployment, the updated Engine node (3.107.0-1) must be running before the updated API node (1.176.0) begins serving requests. The new Engine (3.107.0-1) is compatible with previous versions of the API, so deploy the Engine container first and the API container second. A blue-green deployment is one way to do this, but any strategy that deploys the Engine container first satisfies the requirement. This applies only to deployments serving TTS traffic; the breaking change is not relevant to STT traffic.
The License Proxy node is not affected by the breaking change, but in a complete Deepgram self-hosted deployment it makes sense to include the License Proxy update (1.9.2) in the same blue-green deployment.
This Release Contains The Following Changes
- Improves Transcription of “Um” in Portuguese - Monolingual Portuguese STT now transcribes “um” (meaning “one”) as a non-filler word, so “um” is included in Portuguese transcripts even when the filler_words feature is disabled.
- General Improvements - Keeps our software up-to-date.
Flux: WebM Container Support Added
Flux now supports the WebM container format with Opus codec, providing seamless compatibility with audio sources that output WebM-formatted audio streams.
WebM Container Support
Flux now accepts WebM containers with Opus codec encoding:
- WebM containers with opus encoding
When sending WebM containerized audio, omit the encoding and sample_rate parameters—Flux will automatically detect these from the container metadata.
Why This Matters
WebM is commonly used in web applications and streaming scenarios. This addition makes it easier to integrate Flux with audio sources that natively output WebM format, eliminating the need for format conversion.
Implementation
For WebM containerized audio, connect without the encoding and sample_rate parameters.
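A minimal sketch of building the connection URL is shown below. It assumes Flux is reached at wss://api.deepgram.com/v2/listen and uses flux-general-en as a placeholder model name; neither is stated in this entry. The linear16 and 16000 defaults for raw audio are likewise only illustrative.

```python
from urllib.parse import urlencode

FLUX_BASE = "wss://api.deepgram.com/v2/listen"  # assumed Flux endpoint

def flux_listen_url(model: str, webm: bool,
                    encoding: str = "linear16", sample_rate: int = 16000) -> str:
    """Build a Flux URL, leaving out encoding/sample_rate for WebM audio."""
    params = {"model": model}
    if not webm:
        # Raw (non-containerized) audio: describe the stream explicitly.
        params["encoding"] = encoding
        params["sample_rate"] = sample_rate
    return f"{FLUX_BASE}?{urlencode(params)}"

# WebM: Flux reads codec details from the container metadata.
print(flux_listen_url("flux-general-en", webm=True))
```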
For detailed information about all supported Flux audio formats, see our Flux documentation.
Container Images Release
Deepgram Self-Hosted December 2025 Release (251229)
Container Images (release 251229)
- quay.io/deepgram/self-hosted-api:release-251229
  - Equivalent image to: quay.io/deepgram/self-hosted-api:1.173.4
- quay.io/deepgram/self-hosted-engine:release-251229
  - Equivalent image to: quay.io/deepgram/self-hosted-engine:3.107.0
  - Minimum required NVIDIA driver version: >=570.172.08
- quay.io/deepgram/self-hosted-license-proxy:release-251229
  - Equivalent image to: quay.io/deepgram/self-hosted-license-proxy:1.9.2
- quay.io/deepgram/self-hosted-billing:release-251229
  - Equivalent image to: quay.io/deepgram/self-hosted-billing:1.12.1
This Release Contains The Following Changes
- Expands Aura-2 TTS language support - Adds TTS support for Dutch, German, French, Italian, and Japanese. See the relevant changelog entry. Reach out to your Deepgram representative to obtain the new Aura-2 models.
- Adds Engine metrics for Flux - Adds flux_max_streams, flux_used_streams, flux_fraction_streams, and flux_cursor_latency metrics to the Engine container for Flux monitoring and auto-scaling.
- Adds PHI redaction category - Enables the use of redact=phi to redact six applicable sub-categories of PHI entities. See the related changelog entry for details.
- Allows optional blocking on model pre-loading before Engine becomes ready - By default, models pre-load in the background, which can cause a delay on the first request. Setting blocking = true under [preload_models] in engine.toml makes the Engine wait until model pre-loading completes before accepting traffic. The tradeoff is longer startup time (potentially minutes), so orchestration and health checks should allow for a delayed readiness signal.
- Includes General Improvements - Keeps our software up-to-date.
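For the pre-loading option above, the engine.toml fragment can be sketched as follows. The small helper only renders the [preload_models] table with the blocking key described in this release; any other keys that table may hold in your engine.toml are left out, and the helper name is hypothetical.

```python
def preload_models_section(blocking: bool) -> str:
    """Render the [preload_models] table with the blocking flag as TOML text."""
    value = "true" if blocking else "false"  # TOML booleans are lowercase
    return f"[preload_models]\nblocking = {value}\n"

# Fragment to merge into engine.toml when the Engine should wait for pre-loading.
print(preload_models_section(True))
```

With blocking enabled, remember to lengthen readiness timeouts in your orchestrator, since startup may take minutes.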
Aura-2 TTS Language Expansion
Deepgram has expanded Aura-2 (Text-to-Speech) to support the following languages:
- Dutch
- German
- French
- Italian
- Japanese
Additionally, new voices have been added to the Spanish (es) model.
The expanded voice catalog spans genders, age groups, and speaking styles, supporting a wide range of enterprise use cases including customer service, healthcare, sales, interviews, and IVR.
You can explore all available voices, including featured voices, in the Voices & Languages section of our documentation and try them live in the Deepgram Playground.
PHI Redaction Now Available for Batch and Streaming Speech-to-Text
We’re excited to announce that PHI (Protected Health Information) redaction is now available for both batch (pre-recorded) and streaming speech-to-text.
redact=phi
You can now redact protected health information by passing the new phi value to the redact parameter (redact=phi). It redacts the following entity types: condition, drug, injury, blood_type, medical_process, and statistics.
Key features:
- Batch support: Available for all pre-recorded audio transcription
- Streaming support: Available for real-time streaming transcription
- Language support: Follows existing redaction language support (all languages for hosted batch, English only for streaming)
- Combine with other redaction options: Use multiple redaction parameters together (e.g., redact=phi&redact=pci)
Example usage:
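A minimal sketch of building the request URL, assuming the pre-recorded endpoint is https://api.deepgram.com/v1/listen (the host and path appear elsewhere in this changelog). Authentication and the audio payload are left out.

```python
from urllib.parse import urlencode

def listen_url(redact, base="https://api.deepgram.com/v1/listen"):
    """Build a listen URL with one redact parameter per requested category."""
    # doseq=True repeats the key once per value, e.g. redact=phi&redact=pci
    return f"{base}?{urlencode({'redact': list(redact)}, doseq=True)}"

print(listen_url(["phi", "pci"]))
# https://api.deepgram.com/v1/listen?redact=phi&redact=pci
```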
For detailed information, see our Redaction documentation and supported entity types.
Container Images Release
Deepgram Self-Hosted December 2025 Release (251210)
Container Images (release 251210)
- quay.io/deepgram/self-hosted-api:release-251210
  - Equivalent image to: quay.io/deepgram/self-hosted-api:1.172.2
- quay.io/deepgram/self-hosted-engine:release-251210
  - Equivalent image to: quay.io/deepgram/self-hosted-engine:3.104.10
  - Minimum required NVIDIA driver version: >=570.172.08
- quay.io/deepgram/self-hosted-license-proxy:release-251210
  - Equivalent image to: quay.io/deepgram/self-hosted-license-proxy:1.9.2
- quay.io/deepgram/self-hosted-billing:release-251210
  - Equivalent image to: quay.io/deepgram/self-hosted-billing:1.12.1
This Release Contains The Following Changes
- Expands Nova-3 with 10 New Languages - Building on the 11-language expansion from the 251118 release, Nova-3 now supports 31 total languages. This release adds 10 additional languages, bringing improved accuracy and contextual understanding across:
  - Southern and Eastern Europe: Greek (el), Romanian (ro), Slovak (sk), Catalan (ca)
  - Northern and Baltic Europe: Lithuanian (lt), Latvian (lv), Estonian (et), Flemish (nl-BE), Swiss German (de-CH)
  - Southeast Asia: Malay (ms)
  Learn more in our announcement blogs: 10 new languages and previous 11-language expansion.
- Adds Multilingual Keyterm Prompting for Nova-3 Multi - Nova-3 multilingual now supports multilingual keyterm prompting, allowing you to pass up to 500 tokens (~100 words) to boost recognition of brand names, industry jargon, proper nouns, and other mission-critical vocabulary across multilingual audio.
  This feature requires loading a newer version of the Nova-3 multilingual model. If you attempt to use keyterm prompting with an older version of the Nova-3 multilingual model, you will receive an error:
  Bad Request: The selected Nova-3 model does not support keyterm prompting. Contact Deepgram support for assistance with updating your model version.
  Learn more in the keyterm prompting documentation.
- Improves Entity Formatting - Improves formatting for several entity types, including URLs and numeric entities that contain the word “thousand”.
- Includes General Improvements - Keeps our software up-to-date.
EU Endpoint Now Generally Available
The Deepgram EU endpoint (api.eu.deepgram.com) is now generally available for customers requiring data processing within the European Union.
Supported APIs
The EU endpoint supports the following Deepgram APIs:
- Speech-to-Text: /v1/listen and /v2/listen (excluding Whisper models)
- Text-to-Speech: /v1/speak
- Voice Agent: /v1/agent/converse
- Text Intelligence: /v1/read
Configuration
To use the EU endpoint, simply replace api.deepgram.com with api.eu.deepgram.com in your SDK or API requests. Your existing API keys and tokens will work with the EU endpoint.
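If you build request URLs yourself, the host swap can be sketched as below. The helper name to_eu is hypothetical; the two hostnames are the ones given above.

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_HOST = "api.deepgram.com"
EU_HOST = "api.eu.deepgram.com"

def to_eu(url: str) -> str:
    """Point a Deepgram URL at the EU endpoint, keeping path and query intact."""
    parts = urlsplit(url)
    if parts.netloc != DEFAULT_HOST:
        return url  # already EU, or a custom host: leave it untouched
    return urlunsplit(parts._replace(netloc=EU_HOST))

print(to_eu("https://api.deepgram.com/v1/listen?model=nova-3"))
# https://api.eu.deepgram.com/v1/listen?model=nova-3
```

The same function works for WebSocket URLs, since only the host component is replaced.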
For detailed configuration instructions and SDK examples, see our Configuring Custom Endpoints documentation.
Nova-3 Multilingual Now Supports Keyterm Prompting
Keyterm Prompting has been expanded to include the Nova-3 multilingual model. Previously, this feature was only available for monolingual Nova-3 models — now you can use keyterms with both.
To enable it, simply use: model=nova-3&language=multi and include your keyterm list to boost recognition of domain-specific vocabulary such as brand names, proper nouns, and industry-specific terms.
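A minimal sketch of the query string, assuming each term is sent as a repeated keyterm parameter (the parameter name is an assumption here; the model and language values are from the text above). The example terms are placeholders.

```python
from urllib.parse import urlencode

def keyterm_query(keyterms):
    """Build a query string for Nova-3 multilingual with one keyterm per term."""
    params = [("model", "nova-3"), ("language", "multi")]
    params += [("keyterm", term) for term in keyterms]  # assumed parameter name
    return urlencode(params)  # spaces inside a term are encoded as '+'

print(keyterm_query(["Deepgram", "tretinoin"]))
# model=nova-3&language=multi&keyterm=Deepgram&keyterm=tretinoin
```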
For more details, see the Keyterm Prompting page.
Nova-3 Model Update
🎯 Nova-3 supports 10 new languages
We’ve added support for 10 new languages with non-English monolingual Nova-3 models. This continues our effort to significantly expand Nova-3 language support beyond English. The newly supported languages and their corresponding language codes are:
Newly Supported:
- Catalan (ca)
- Estonian (et)
- Flemish (nl-BE)
- German (Switzerland) (de-CH)
- Greek (el)
- Latvian (lv)
- Lithuanian (lt)
- Malay (ms)
- Romanian (ro)
- Slovak (sk)
Learn more about Nova-3 on the Models and Language Overview page.