Deepgram Self-Hosted December 2025 Release (251229)

Container Images (release 251229)

  • quay.io/deepgram/self-hosted-api:release-251229
    • Equivalent image to:
      • quay.io/deepgram/self-hosted-api:1.173.4

  • quay.io/deepgram/self-hosted-engine:release-251229
    • Equivalent image to:
      • quay.io/deepgram/self-hosted-engine:3.107.0
    • Minimum required NVIDIA driver version: 570.172.08

  • quay.io/deepgram/self-hosted-license-proxy:release-251229
    • Equivalent image to:
      • quay.io/deepgram/self-hosted-license-proxy:1.9.2

  • quay.io/deepgram/self-hosted-billing:release-251229
    • Equivalent image to:
      • quay.io/deepgram/self-hosted-billing:1.12.1
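
The Engine image above lists a minimum NVIDIA driver version of 570.172.08. The following is a minimal sketch of one way to check a host against that requirement before upgrading. The function names are illustrative and are not part of any Deepgram tooling. The comparison is done on integer components because a plain string comparison would misorder versions such as 570.86.10 and 570.172.08.

```python
def parse_driver_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '570.172.08' into (570, 172, 8)."""
    return tuple(int(part) for part in text.strip().split("."))


# Minimum driver version stated for the Engine image in this release.
MINIMUM_DRIVER = parse_driver_version("570.172.08")


def driver_meets_minimum(installed: str) -> bool:
    """Return True if the installed driver is at or above the minimum.

    Tuples compare component by component, so 570.86.10 is correctly
    treated as older than 570.172.08.
    """
    return parse_driver_version(installed) >= MINIMUM_DRIVER
```

The installed version string can typically be read on the host with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and passed to `driver_meets_minimum`.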

This Release Contains The Following Changes

  • Adds Engine metrics for Flux - Adds flux_max_streams, flux_used_streams, flux_fraction_streams, and flux_cursor_latency metrics to the Engine container for Flux monitoring and auto-scaling.

  • Adds PHI redaction category - Enables the use of redact=phi to redact six applicable sub-categories of PHI entities. See the related changelog entry for details.

  • Allows optional blocking on model pre-loading before Engine becomes ready - By default, models pre-load in the background, which can cause a delay on the first request. Setting blocking = true under [preload_models] in engine.toml makes the Engine wait until model pre-loading completes before accepting traffic. The tradeoff is longer startup time (potentially minutes), so orchestration and health checks should allow for a delayed readiness signal.

  • Includes general improvements - Keeps our software up to date.
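
The Flux metrics listed above are described as inputs for auto-scaling. As a rough sketch of how such a decision might be computed, the function below derives a replica count from stream usage. It assumes, without confirmation from this release note, that flux_used_streams and flux_max_streams report the current and maximum concurrent Flux streams per Engine. The function name, the target utilization, and the scaling rule itself are hypothetical and would need tuning for a real deployment.

```python
import math


def desired_engine_replicas(
    total_used_streams: float,
    max_streams_per_engine: float,
    target_utilization: float = 0.7,  # hypothetical target, not a Deepgram default
    min_replicas: int = 1,
) -> int:
    """Estimate how many Engine replicas keep utilization at or below target.

    total_used_streams: assumed sum of flux_used_streams across Engines.
    max_streams_per_engine: assumed flux_max_streams value of one Engine.
    """
    if max_streams_per_engine <= 0:
        raise ValueError("max_streams_per_engine must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")

    # Capacity each replica should carry if it runs at the target utilization.
    usable_per_engine = max_streams_per_engine * target_utilization
    needed = math.ceil(total_used_streams / usable_per_engine)
    return max(min_replicas, needed)
```

Keeping the target below 1.0 leaves headroom for new streams while additional Engines start, which matters more when blocking model pre-loading lengthens startup time.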