December 29, 2025 | Deepgram's Docs

Deepgram Self-Hosted December 2025 Release (251229)

quay.io/deepgram/self-hosted-api:release-251229
- Equivalent image to:
  - quay.io/deepgram/self-hosted-api:1.173.4
quay.io/deepgram/self-hosted-engine:release-251229
- Equivalent image to:
  - quay.io/deepgram/self-hosted-engine:3.107.0
- Minimum required NVIDIA driver version: 570.172.08
quay.io/deepgram/self-hosted-license-proxy:release-251229
- Equivalent image to:
  - quay.io/deepgram/self-hosted-license-proxy:1.9.2
quay.io/deepgram/self-hosted-billing:release-251229
- Equivalent image to:
  - quay.io/deepgram/self-hosted-billing:1.12.1

Expands Aura-2 TTS language support - Adds TTS support for Dutch, German, French, Italian, and Japanese. See the relevant changelog entry. Reach out to your Deepgram representative to obtain the new Aura-2 models.
Adds Engine metrics for Flux - Adds flux_max_streams, flux_used_streams, flux_fraction_streams, and flux_cursor_latency metrics to the Engine container for Flux monitoring and auto-scaling.
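As a rough illustration of how these metrics might feed an auto-scaling decision, the Python sketch below parses a few metric lines and flags when Flux stream utilization crosses a threshold. It assumes the Engine exposes the metrics as plain `name value` lines in Prometheus text format. The sample values, the 0.8 threshold, and the helper names `parse_metrics` and `should_scale_out` are hypothetical and not part of the release.

```python
# Hypothetical sample of Engine metrics output; the values are illustrative only.
SAMPLE_METRICS = """\
# sample scrape (format assumed to be Prometheus text exposition)
flux_max_streams 40
flux_used_streams 34
"""


def parse_metrics(text: str) -> dict[str, float]:
    """Parse unlabeled `name value` lines, skipping comments and blank lines."""
    metrics: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition(" ")
        metrics[name] = float(value)
    return metrics


def should_scale_out(metrics: dict[str, float], threshold: float = 0.8) -> bool:
    """Return True when used/max Flux streams meets or exceeds the threshold."""
    max_streams = metrics.get("flux_max_streams", 0.0)
    if max_streams <= 0:
        # No capacity reported yet, so there is nothing to base a decision on.
        return False
    used_streams = metrics.get("flux_used_streams", 0.0)
    return used_streams / max_streams >= threshold


metrics = parse_metrics(SAMPLE_METRICS)
print(should_scale_out(metrics))
```

In a real deployment the ratio would more likely be read from a monitoring system than computed by hand; the sketch only shows the shape of the decision.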
Adds PHI redaction category - Enables the use of redact=phi to redact six applicable sub-categories of PHI entities. See the related changelog entry for details.
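For illustration, the Python sketch below builds a request URL that carries the `redact=phi` query parameter. The base address `http://localhost:8080/v1/listen` is an assumption about a local self-hosted API container, and `build_listen_url` is a hypothetical helper; adjust both for your deployment.

```python
from urllib.parse import urlencode

# Assumed address of a self-hosted API container; not specified by this release.
BASE_URL = "http://localhost:8080/v1/listen"


def build_listen_url(base_url: str, **params: str) -> str:
    """Append URL-encoded query parameters to the base endpoint."""
    return f"{base_url}?{urlencode(params)}"


url = build_listen_url(BASE_URL, redact="phi")
print(url)
```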
Allows optional blocking on model pre-loading before Engine becomes ready - By default, models pre-load in the background, which can cause a delay on the first request. Setting blocking = true under [preload_models] in engine.toml makes the Engine wait until model pre-loading completes before accepting traffic. The tradeoff is longer startup time (potentially minutes), so orchestration and health checks should allow for a delayed readiness signal.
Includes General Improvements - Keeps our software up to date.