July 31, 2025

Deepgram Self-Hosted July 2025 Release (250731)

Container Images (release 250731)

  • quay.io/deepgram/self-hosted-api:release-250731

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-api:1.154.1
      • quay.io/deepgram/onprem-api:release-250731
      • quay.io/deepgram/onprem-api:1.154.1
  • quay.io/deepgram/self-hosted-engine:release-250731

    • Equivalent image to:

      • quay.io/deepgram/self-hosted-engine:3.94.0
      • quay.io/deepgram/onprem-engine:release-250731
      • quay.io/deepgram/onprem-engine:3.94.0
    • Minimum required NVIDIA driver version: 550.163.01

  • quay.io/deepgram/self-hosted-license-proxy:release-250731

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-license-proxy:1.8.0
      • quay.io/deepgram/onprem-license-proxy:release-250731
      • quay.io/deepgram/onprem-license-proxy:1.8.0
  • quay.io/deepgram/self-hosted-billing:release-250731

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-billing:1.11.2
      • quay.io/deepgram/onprem-billing:release-250731
      • quay.io/deepgram/onprem-billing:1.11.2

This Release Contains The Following Changes

  • Adds redact_usage functionality to redact the values of keyterms and other URL parameters. This is now enabled by default, and may be toggled via the redact_usage boolean feature flag in api.toml. See our redact usage documentation for more information.
  • Adds targeted support for CUDA 12.8. We recommend updating to the latest CUDA 12.8 release for optimal performance and stability. See our driver installation doc for guidance on updating to newer NVIDIA driver and CUDA toolkit versions.
  • Returns a 400 Unknown Model error with reference to our error documentation when NER is misconfigured.
  • Ensures presence of a word-level language tag on multilingual transcripts that apply smart formatting.
  • Improves smart formatting of dates, alphanumerics, and numbers with units.
  • Keeps our software up-to-date.
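
As a rough pre-upgrade check, the sketch below compares an installed NVIDIA driver version against the 550.163.01 minimum listed for the engine image. The function names are hypothetical, and the nvidia-smi query assumes the tool is on the PATH of the GPU host; this is an illustration, not an official Deepgram utility.

```python
import subprocess

# Minimum driver version listed for the engine image in this release.
MIN_DRIVER = "550.163.01"


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '550.163.01' into (550, 163, 1)."""
    return tuple(int(part) for part in version.strip().split("."))


def meets_minimum(installed: str, minimum: str = MIN_DRIVER) -> bool:
    """Return True when the installed driver is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_driver_versions() -> list[str]:
    """Ask nvidia-smi for the driver version reported by each GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

On a GPU host, each value returned by installed_driver_versions() can be passed to meets_minimum() before pulling the new engine image. Comparing numeric tuples rather than raw strings avoids ordering mistakes such as treating "550.54" as newer than "550.163".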

Reminder: The Deepgram image repositories have been updated to reflect our “self-hosted” naming. Images should now be pulled from the self-hosted-* Quay repositories. Through August 2025, both onprem-* and self-hosted-* image repositories will receive identical image updates monthly, and we will announce image tags in the self-hosted repositories. Starting in September 2025, we will only publish new images to self-hosted-* repos, deprecating onprem-* repository variants.
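
For deployments that still reference onprem-* images, a small helper along these lines can rewrite image references to the self-hosted-* repositories. It is a sketch that relies on the tag equivalence listed for this release; the function name is hypothetical, and tags from older releases are not guaranteed to exist under the new repository names.

```python
import re

# Matches legacy references such as quay.io/deepgram/onprem-api:release-250731
_LEGACY = re.compile(r"^(quay\.io/deepgram/)onprem-([a-z-]+)(:.+)?$")


def to_self_hosted(image: str) -> str:
    """Rewrite an onprem-* image reference to its self-hosted-* equivalent.

    References that are already self-hosted-* (or unrelated) are returned unchanged.
    """
    match = _LEGACY.match(image)
    if match is None:
        return image
    prefix, component, tag = match.group(1), match.group(2), match.group(3) or ""
    return f"{prefix}self-hosted-{component}{tag}"
```

Running this over the image fields of a Compose file or Helm values file before September 2025 keeps existing tags while switching to the repositories that will continue to receive updates.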


July 31, 2025

Voice Agent API

New Features

🎯 Smart Formatting for More Readable Conversations

We’ve added a new smart_format option to the Voice Agent listen provider settings. When enabled, transcripts are smart-formatted, which makes transcribed conversations easier to read when displayed in a UI.

Key Features:

  • Enhanced transcript formatting for UI applications
  • Defaults to false for backward compatibility

Implementation: Configure the smart_format option in your Voice Agent listen provider settings:

{
  "agent": {
    "listen": {
      "provider": {
        "type": "deepgram",
        "model": "nova-3",
        "smart_format": true
      }
    }
  }
}

For complete implementation details, see our Voice Agent configuration documentation.
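
If you assemble the configuration in code, the listen settings could be built as in the minimal Python sketch below. The build_settings helper is hypothetical, and only the listen portion is shown; a real Settings message carries additional fields that this entry does not cover.

```python
import json


def build_settings(model: str = "nova-3", smart_format: bool = False) -> dict:
    """Build a Settings payload containing only the listen provider fields."""
    return {
        "type": "Settings",
        "agent": {
            "listen": {
                "provider": {
                    "type": "deepgram",
                    "model": model,
                    # Defaults to False, matching the API's backward-compatible default.
                    "smart_format": smart_format,
                }
            }
        },
    }


# Serialize to JSON text, ready to send over the agent connection.
payload = json.dumps(build_settings(smart_format=True))
```

Keeping smart_format as an explicit keyword argument makes the opt-in visible at the call site, since the option is off unless requested.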

🔒 Model Improvement Program Opt-Out

Users can now opt out of our Model Improvement Program when using the Voice Agent API.

Implementation: Add mip_opt_out: true to your Settings message:

{
  "type": "Settings",
  "mip_opt_out": true,
  "agent": {
    "listen": {
      "provider": {
        "type": "deepgram",
        "model": "nova-3"
      }
    }
  }
}

For more information about the Model Improvement Program and opt-out options, visit our Model Improvement Partnership Program documentation.
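
For code that already holds a Settings message as a dictionary, one way to apply the opt-out is sketched below. The with_mip_opt_out helper is a hypothetical name; the only assumption taken from this entry is that mip_opt_out sits at the top level of the Settings message.

```python
import copy


def with_mip_opt_out(settings: dict, opt_out: bool = True) -> dict:
    """Return a copy of a Settings message with the top-level mip_opt_out flag set."""
    if settings.get("type") != "Settings":
        raise ValueError("expected a Settings message")
    # Deep-copy so the caller's original message is left untouched.
    updated = copy.deepcopy(settings)
    updated["mip_opt_out"] = opt_out
    return updated
```

Returning a copy rather than mutating in place lets the same base configuration be reused for sessions that do and do not opt out.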

🤖 Gemini LLM Support

We’ve added support for Google’s Gemini LLMs in our Voice Agent API, expanding the LLM options available to your agent.

Available Models:

  • Gemini 2.5 Flash
  • Gemini 2.0 Flash
  • Gemini 2.0 Flash Lite

For complete information about supported LLMs including Gemini models, visit our Voice Agent LLM Models documentation.