June 11, 2026
Deepgram Self-Hosted June 2026 Release (260611)
Container Images (release 260611)
- quay.io/deepgram/self-hosted-api:release-260611
  - Equivalent image to: quay.io/deepgram/self-hosted-api:1.191.0-1
- quay.io/deepgram/self-hosted-engine:release-260611
  - Equivalent image to: quay.io/deepgram/self-hosted-engine:3.118.0-1
  - Minimum required NVIDIA driver version: >=570.172.08
- quay.io/deepgram/self-hosted-license-proxy:release-260611
  - Equivalent image to: quay.io/deepgram/self-hosted-license-proxy:1.10.1-1
- quay.io/deepgram/self-hosted-billing:release-260611
  - Equivalent image to: quay.io/deepgram/self-hosted-billing:1.13.0
Action Required: Engine Container GPU Environment Variables
The Engine container change previewed in the May 28, 2026 release has shipped in this release. The Engine container now requires two environment variables to access the GPU:
NVIDIA_VISIBLE_DEVICES=all
NVIDIA_DRIVER_CAPABILITIES=compute,utility
If they are not set, the Engine container will fail to start after you upgrade to release-260611. Follow the step for your deployment method before pulling the release-260611 Engine image:
- Official Helm chart (deepgram-self-hosted): Upgrade to chart version 0.37.0 or later. The chart sets both variables on the Engine pod automatically whenever a GPU is requested, so no manual change is needed. If you pin an older chart version, bump it as part of adopting this release.
- Deepgram-provided Docker or Podman Compose files: Pull the latest files from deepgram/self-hosted-resources. They already set both variables on the Engine service.
- Your own deployment manifests: Add both variables to the Engine container’s environment, either on the Engine service in a Docker or Podman Compose file or in the Engine container spec of a Kubernetes manifest.
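For hand-maintained manifests, the two variables might be set as in the fragments below. This is a sketch only: the service and container name `engine` and the surrounding keys are assumptions for illustration, and only the variable names and values come from this release note. In a Docker or Podman Compose file:

```yaml
# Compose fragment; merge into your existing Engine service definition.
# The service name "engine" is an assumption, not taken from this release note.
services:
  engine:
    environment:
      NVIDIA_VISIBLE_DEVICES: "all"
      NVIDIA_DRIVER_CAPABILITIES: "compute,utility"
```

In a Kubernetes Engine container spec, the equivalent entries would look roughly like this:

```yaml
# Fragment of a pod template (spec.template.spec in a Deployment).
# The container name "engine" is hypothetical; keep your existing fields.
containers:
  - name: engine
    env:
      - name: NVIDIA_VISIBLE_DEVICES
        value: "all"
      - name: NVIDIA_DRIVER_CAPABILITIES
        value: "compute,utility"
```

Quoting the values keeps them as plain strings, so the comma in `compute,utility` is not open to misreading by YAML tooling.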
After setting the variables, upgrade your image tags to release-260611 and restart the Engine. Confirm it reaches a healthy state (for example, GET /v1/status returns 200) before routing production traffic.
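On Kubernetes, one way to hold traffic until the Engine is healthy is a readiness probe against the status endpoint mentioned above. The fragment below is a minimal sketch: the port `8080` and the timing values are assumptions, so substitute the port your Engine actually listens on.

```yaml
# Fragment of the Engine container spec.
# Port 8080 and the timing values are assumptions; adjust them to your deployment.
readinessProbe:
  httpGet:
    path: /v1/status   # the pod is marked Ready once this returns a 2xx response
    port: 8080
  periodSeconds: 10    # probe every 10 seconds
  failureThreshold: 6  # mark the pod not ready after 6 consecutive failures
```

With a probe like this, the pod is withheld from Service endpoints until the status check succeeds, which lines up with confirming health before routing production traffic.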
This Release Contains The Following Changes
- Persian Profanity Filtering: profanity_filter=true now masks recognized profanity in Persian (fa) transcripts. See Profanity Filtering for the supported language list and usage.
- English Redaction on Flux Streaming: redact now applies to English transcripts on the Flux streaming endpoint (/v2/listen).
- Streaming Diarization Model Selection: a new diarize_model parameter selects the diarization model on streaming requests; accepted values are v1 and latest, and setting it enables diarization (no separate diarize=true required). See Diarization for details.
- Number Formatting Improvements: number formatting now covers Simplified Mandarin (zh), Cantonese (zh-HK), and Bulgarian (bg). Across languages, ordinals written as numerals format more consistently, and indefinite articles (“a”/“an”) format as digits in quantity contexts.
- Text-to-Speech Output Transcoding: /v1/speak now supports optional output transcoding to additional audio formats.
- Aura-2 Numeric Pronunciation Fix: corrects a pronunciation issue on all-numeric inputs.
- Voice Agent Third-Party Provider Reliability: improves reliability of ElevenLabs streaming and Cartesia cancellation handling for self-hosted Voice Agent.
- General Improvements: keeps our software up to date.