March 31, 2025

Container Images (release 250331)

  • quay.io/deepgram/self-hosted-api:release-250331

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-api:1.141.0
      • quay.io/deepgram/onprem-api:release-250331
      • quay.io/deepgram/onprem-api:1.141.0
  • quay.io/deepgram/self-hosted-engine:release-250331

    • Equivalent image to:

      • quay.io/deepgram/self-hosted-engine:3.85.4
      • quay.io/deepgram/onprem-engine:release-250331
      • quay.io/deepgram/onprem-engine:3.85.4
    • Minimum required NVIDIA driver version: >=530.30.02

    • Maximum supported NVIDIA driver version: <=561.00.00

  • quay.io/deepgram/self-hosted-license-proxy:release-250331

    • You must upgrade the license proxy before upgrading the API or Engine for this release. Deploying the new API or Engine against an older license proxy is a breaking change.
    • Equivalent image to:
      • quay.io/deepgram/self-hosted-license-proxy:1.8.0
      • quay.io/deepgram/onprem-license-proxy:release-250331
      • quay.io/deepgram/onprem-license-proxy:1.8.0
  • quay.io/deepgram/self-hosted-billing:release-250331

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-billing:1.11.2
      • quay.io/deepgram/onprem-billing:release-250331
      • quay.io/deepgram/onprem-billing:1.11.2
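
As a planning aid, the image list and driver bounds above can be captured in a short script. This is a minimal sketch, not Deepgram tooling: the helper names are hypothetical, the license proxy is placed first because this release requires it, and the relative order of the remaining components is an assumption.

```python
from typing import List, Tuple

REGISTRY = "quay.io/deepgram"
RELEASE_TAG = "release-250331"

# License proxy first, as this release requires. The order of the
# remaining components is an assumption made for illustration.
UPGRADE_ORDER = [
    "self-hosted-license-proxy",
    "self-hosted-api",
    "self-hosted-engine",
    "self-hosted-billing",
]

# Engine driver bounds from the release notes (inclusive on both ends).
MIN_DRIVER = (530, 30, 2)
MAX_DRIVER = (561, 0, 0)


def image_refs(tag: str = RELEASE_TAG) -> List[str]:
    """Return full image references in the order they should be upgraded."""
    return [f"{REGISTRY}/{name}:{tag}" for name in UPGRADE_ORDER]


def parse_driver_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted driver version such as '535.129.03' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def driver_supported(version: str) -> bool:
    """Check a driver version against the Engine's stated bounds."""
    return MIN_DRIVER <= parse_driver_version(version) <= MAX_DRIVER


if __name__ == "__main__":
    for ref in image_refs():
        print(ref)
    print(driver_supported("535.129.03"))
```

The driver version string itself would come from your own environment (for example, from the output of nvidia-smi); the script only compares it against the bounds listed above.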

This Release Contains The Following Changes

  • Adds support for our new class of multilingual Nova-3 models.
    • Use model=nova-3&language=multi.
    • Contact your Deepgram account representative for access to Nova-3.
  • Significantly improves formatting for all transcripts, including in multilingual contexts. The new formatting is powered by a dedicated Named Entity Recognition (NER) model that recognizes entities such as phone numbers, addresses, and dates and uses them to format the transcript. NER is required for smart-formatted Nova-3 transcripts and strongly recommended for all other speech-to-text transcripts.
    • Consult our instructions for enabling NER formatting to make required updates to configuration as well as models.
    • Improvements to smart-formatting have modified the formatting of some entity classes. Please contact your Deepgram account representative if you have any questions.
  • Improves streaming smart-formatting.
  • Improves non-English numeral formatting.
  • Resolves an issue with Simplified Chinese (zh/zh-CN) language requests.
  • Keeps our software up to date.
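
To illustrate the multilingual Nova-3 parameters listed above, here is a minimal sketch that builds a request URL for a self-hosted API. The base address is an assumption (substitute wherever your API container is reachable), the /v1/listen path is assumed to match the hosted API, and listen_url is a hypothetical helper. smart_format=true is included only to show where smart formatting would be requested.

```python
from urllib.parse import urlencode

# Assumption: the address where your self-hosted API container is reachable.
SELF_HOSTED_BASE = "http://localhost:8080"


def listen_url(base: str = SELF_HOSTED_BASE, smart_format: bool = True) -> str:
    """Build a transcription URL that selects the multilingual Nova-3 model."""
    params = {"model": "nova-3", "language": "multi"}
    if smart_format:
        # Smart-formatted Nova-3 transcripts require NER to be enabled.
        params["smart_format"] = "true"
    return f"{base.rstrip('/')}/v1/listen?{urlencode(params)}"


if __name__ == "__main__":
    print(listen_url())
```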

Reminder: The Deepgram image repositories have been updated to reflect our “self-hosted” naming. Images should now be pulled from the self-hosted-* Quay repositories. For the next six months, both onprem-* and self-hosted-* image repositories will receive identical image updates monthly, and we will announce image tags in the self-hosted repositories. Subsequently, we will only publish new images to self-hosted-* repos, deprecating onprem-* repository variants.
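
If your deployment files still reference the onprem-* repositories, the rename can be applied mechanically. The following is a small sketch with a hypothetical helper name; it rewrites only the repository prefix and leaves the tag untouched.

```python
OLD_PREFIX = "quay.io/deepgram/onprem-"
NEW_PREFIX = "quay.io/deepgram/self-hosted-"


def to_self_hosted(image_ref: str) -> str:
    """Rewrite an onprem-* image reference to its self-hosted-* equivalent."""
    if image_ref.startswith(OLD_PREFIX):
        return NEW_PREFIX + image_ref[len(OLD_PREFIX):]
    # Already using the new naming (or not a Deepgram image): leave unchanged.
    return image_ref


if __name__ == "__main__":
    print(to_self_hosted("quay.io/deepgram/onprem-engine:release-250331"))
```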


March 31, 2025

Nova-3 Multilingual General Availability - Real-Time Code-Switching

Deepgram is proud to announce the general availability of Nova-3 Multilingual, the first model of its kind able to code-switch in real time across 10 languages. It processes multilingual conversations instantly with a single model, an industry first for speech recognition that opens up new possibilities for global operations.

Multilingual Support

  • Real-time multilingual speech recognition from a single, unified system

  • Supports code-switching between 10 languages:

    • English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch
  • Seamlessly handles natural language transitions without relying on explicit routing or language-specific mechanisms

  • Maintains high transcription accuracy across languages while adapting to natural language transitions

  • Developed through a multi-stage training process combining synthetic code-switched data at massive scale with carefully curated real-world datasets

Use Cases

Nova-3 Multilingual represents a significant breakthrough for applications in:

  • Global customer support
  • Emergency response (e.g., 911 calls)
  • Multilingual meetings
  • Retail interactions
  • Healthcare settings

In high-stakes scenarios like emergency response, Nova-3 can fluidly process interactions where callers switch between languages (e.g., Spanish and English) in real time, ensuring dispatchers receive accurate, immediate transcriptions without missing critical details.

Availability

  • Now available through our API
  • Use model=nova-3&language=multi in your API calls
  • Supports both pre-recorded and real-time streaming transcription
  • Available for hosted and self-hosted use
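
As a rough sketch of how the same parameters apply to both modes on the hosted API, the code below prepares a pre-recorded request and a streaming URL. The helper names, the API key, and the audio URL are placeholders; nothing is sent until you pass the request to a client.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# The query parameters named in this announcement.
QUERY = urlencode({"model": "nova-3", "language": "multi"})


def prerecorded_request(api_key: str, audio_url: str) -> Request:
    """Prepare (but do not send) a pre-recorded request for remotely hosted audio."""
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return Request(
        f"https://api.deepgram.com/v1/listen?{QUERY}",
        data=body,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def streaming_url() -> str:
    """WebSocket URL for real-time streaming with the same model selection."""
    return f"wss://api.deepgram.com/v1/listen?{QUERY}"


if __name__ == "__main__":
    req = prerecorded_request("YOUR_API_KEY", "https://example.com/audio.wav")
    print(req.get_method(), req.full_url)
    print(streaming_url())
```

The prepared request can be sent with urllib.request.urlopen, and the streaming URL can be opened with the WebSocket client of your choice using the same Authorization header.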

For detailed information about Nova-3 Multilingual, please refer to our Developer Documentation.