Deepgram Self-Hosted November 2025 Release (251118)

Container Images (release 251118)

  • quay.io/deepgram/self-hosted-api:release-251118

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-api:1.169.0

  • quay.io/deepgram/self-hosted-engine:release-251118

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-engine:3.104.10
    • Minimum required NVIDIA driver version: 570.172.08

  • quay.io/deepgram/self-hosted-license-proxy:release-251118

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-license-proxy:1.9.2

  • quay.io/deepgram/self-hosted-billing:release-251118

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-billing:1.12.1
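
Before upgrading the Engine image, it can help to confirm that the host driver meets the minimum listed above. The following is a minimal sketch rather than an official check: the helper names are hypothetical, and it assumes the installed version is a dotted numeric string like the one in these notes.

```python
MIN_DRIVER = "570.172.08"  # minimum NVIDIA driver version from these release notes


def parse_version(version: str) -> tuple[int, ...]:
    """Split a dotted version into integers so the comparison is numeric."""
    return tuple(int(part) for part in version.strip().split("."))


def driver_is_supported(installed: str, minimum: str = MIN_DRIVER) -> bool:
    """Return True when the installed driver is at or above the minimum."""
    return parse_version(installed) >= parse_version(minimum)


print(driver_is_supported("570.172.08"))  # True
print(driver_is_supported("570.86.15"))   # False
```

The comparison is done on integer tuples because a plain string comparison would rank "570.86.15" above "570.172.08". The installed version can typically be read with nvidia-smi --query-gpu=driver_version --format=csv,noheader.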

This Release Contains The Following Changes

  • Expands Nova-3 Monolingual Language Support — Nova-3 now supports 11 additional languages, bringing stronger accuracy and contextual understanding across:

    • Eastern Europe and Eurasia: Bulgarian (bg), Czech (cs), Hungarian (hu), Polish (pl), Russian (ru), Ukrainian (uk)
    • Nordics and Baltics: Finnish (fi)
    • South Asia: Hindi (hi)
    • East Asia: Japanese (ja), Korean (ko, ko-KR)
    • Southeast Asia: Vietnamese (vi)

    Learn more in our announcement blog.
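
    As an illustration, a request for one of the newly added languages could be built as below. This is a sketch under assumptions: the base URL http://localhost:8080 is a placeholder for your own API container, the /v1/listen path with model and language query parameters is assumed to match your deployment, and the helper name is hypothetical. The set contains only the codes added in this release, not languages Nova-3 already supported.

```python
from urllib.parse import urlencode

# Language codes added to Nova-3 in this release (from the list above).
NEW_NOVA3_LANGUAGES = {
    "bg", "cs", "hu", "pl", "ru", "uk",  # Eastern Europe and Eurasia
    "fi",                                # Nordics and Baltics
    "hi",                                # South Asia
    "ja", "ko", "ko-KR",                 # East Asia
    "vi",                                # Southeast Asia
}


def build_listen_url(base_url: str, language: str) -> str:
    """Build a transcription URL for one of the newly added languages."""
    if language not in NEW_NOVA3_LANGUAGES:
        raise ValueError(f"{language!r} is not one of the languages added in this release")
    query = urlencode({"model": "nova-3", "language": language})
    return f"{base_url.rstrip('/')}/v1/listen?{query}"


print(build_listen_url("http://localhost:8080", "ja"))
# http://localhost:8080/v1/listen?model=nova-3&language=ja
```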

  • Adds 36-Language Detection Model — Adds support for a new language detection model that handles 36 languages. This feature requires enabling the use_v2_language_detection feature flag in the Engine TOML configuration. Language detection is available for pre-recorded audio only. Learn more in the language detection documentation.

  • Updates Status Endpoint — Updates the /v1/status endpoint to better reflect node startup and runtime state, preventing false Critical reports when the API starts before an Engine driver is ready. See the status endpoint documentation for the new status flow:

    • Initializing — Reported during node startup; transitions to Ready once initialization completes.
    • Ready — The node can service requests; transitions to Healthy after enough successful requests, or Critical if errors occur.
    • Healthy — Sustained success; can transition to Critical if failures arise.
    • Critical — Indicates node failures; can recover back to Ready once the node can service requests again.
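
    The status flow above can be modeled as a small state machine, which may be useful when writing monitoring rules. This is a sketch that encodes only the transitions named in these notes; the class and function names are hypothetical, and the real endpoint may allow transitions not described here.

```python
from enum import Enum


class NodeStatus(Enum):
    INITIALIZING = "Initializing"
    READY = "Ready"
    HEALTHY = "Healthy"
    CRITICAL = "Critical"


# Only the transitions described in the release notes are listed.
TRANSITIONS = {
    NodeStatus.INITIALIZING: {NodeStatus.READY},
    NodeStatus.READY: {NodeStatus.HEALTHY, NodeStatus.CRITICAL},
    NodeStatus.HEALTHY: {NodeStatus.CRITICAL},
    NodeStatus.CRITICAL: {NodeStatus.READY},
}


def can_transition(current: NodeStatus, target: NodeStatus) -> bool:
    """Return True if the notes describe a direct move from current to target."""
    return target in TRANSITIONS[current]


print(can_transition(NodeStatus.INITIALIZING, NodeStatus.READY))  # True
print(can_transition(NodeStatus.CRITICAL, NodeStatus.HEALTHY))    # False
```

    In this model a Critical node returns to Healthy only by passing through Ready first, and Initializing never leads directly to Critical.
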
  • Enhances API Graceful Shutdown — Resolves an issue where the API container would not properly wait for outstanding work to complete before shutting down. The graceful shutdown period now defaults to approximately 10 minutes.

  • Improves Address Formatting — Improves formatting for street numbers in addresses.

  • Improves Aura-2 Latency Consistency — Improves latency consistency for Aura-2 text-to-speech requests.

  • Deprecates Legacy Intelligence Features — Legacy Intelligence features (analyze_sentiment=true, detect_topics=true, summarize=v1, and summarize=true v1 structure) are now deprecated in favor of newer versions. Requests using these parameters will return HTTP 400 errors. Migration guidance:

    • analyze_sentiment=true → use sentiment=true
    • detect_topics=true → use topics=true
    • summarize=v1 → use summarize=true or summarize=v2

    See the Speech-to-Text changelog for more details.
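
    The migration guidance above can be applied mechanically to existing query strings. The sketch below is one possible approach, not an official tool: the function name is hypothetical, and it maps summarize=v1 to summarize=v2, although summarize=true is equally valid according to the guidance.

```python
from urllib.parse import parse_qsl, urlencode

# Legacy (parameter, value) pairs and their replacements, per the guidance above.
LEGACY_TO_CURRENT = {
    ("analyze_sentiment", "true"): ("sentiment", "true"),
    ("detect_topics", "true"): ("topics", "true"),
    ("summarize", "v1"): ("summarize", "v2"),  # summarize=true would also work
}


def migrate_query(query: str) -> str:
    """Rewrite legacy Intelligence parameters; leave every other pair untouched."""
    pairs = parse_qsl(query, keep_blank_values=True)
    return urlencode([LEGACY_TO_CURRENT.get(pair, pair) for pair in pairs])


print(migrate_query("model=nova-3&analyze_sentiment=true&summarize=v1"))
# model=nova-3&sentiment=true&summarize=v2
```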

  • Includes General Improvements — Keeps our software up-to-date.