Deepgram API Playground & Console Improvements

Quickstarts

Use-case-specific templates are now available in the API Playground. These prefilled, editable configs let you test and quickly get started with our TTS and Voice Agent APIs.

Voice Agent

All LLMs returned by the agent models endpoint are now displayed in the Voice Agent playground.

Usage Charts & Logs

Additional filters for date range, time, and groups are now available in Console. You can also export usage data to a CSV file.

Deepgram Saga Improvements

Enhanced MCP Tool Visibility

You now have complete visibility into which tools are loaded under each server, making it easier to understand what tools are available for use.

Improved Action Processing Feedback

Say goodbye to wondering what’s happening behind the scenes. Our new action processing text provides clear, real-time feedback on executed actions with status messages like “Creating an email draft…” so you always know what’s in progress.

Learn more about Deepgram Saga here!

Voice Agent API

New Features

🤖 Expanded LLM Support

We’ve significantly expanded our LLM options across our two pricing tiers. The following LLMs are now available for use:

Standard Models

  • OpenAI GPT-4.1 mini
  • OpenAI GPT-4.1 nano
  • OpenAI GPT-4o mini
  • Anthropic Claude Haiku 3.5

Advanced Models

  • OpenAI GPT-4.1
  • OpenAI GPT-4o
  • Anthropic Claude Sonnet 4

For complete information about supported LLMs, visit our Voice Agent LLM Models documentation or try them out in our API Playground.

🌍 Spanish Language Support

Voice Agents now support Spanish conversations with the addition of Aura-2 Spanish TTS. Configure your agent’s language settings to enable Spanish voice interactions.

See our Voice Agent API documentation for implementation details.

💬 Conversation Context Feature

Introducing comprehensive conversation continuity with our new context feature:

Complete Context Awareness

  • Agents maintain conversation history across sessions
  • Seamless continuation of previous interactions

Enhanced User Experience

  • More natural conversations with historical context
  • Consistent interaction patterns across sessions

Flexible Implementation

  • Support for both conversational and function call history
  • Configurable history settings

Implementation

Use the agent.context object to provide conversation history when starting new sessions:

Conversational Messages:

{
  "type": "History",
  "role": "user" | "assistant",
  "content": "message text"
}

Function Call Messages:

{
  "type": "History",
  "function_calls": [{
    "id": "unique_id",
    "name": "function_name",
    "client_side": true | false,
    "arguments": "json_string",
    "response": "response_text"
  }]
}

To disable function call history, set settings.flags.history to false in the Settings message.
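
As a rough illustration of how the two message shapes above might be assembled before opening a session, here is a minimal Python sketch. It makes two assumptions that the text above does not spell out: that history entries are carried in a `messages` list under `agent.context`, and that the history flag lives under a top-level `flags` object of the Settings message. The helper names and sample values are hypothetical, and a real Settings message needs further fields (audio and provider configuration) that are omitted here.

```python
import json

def history_message(role, content):
    """Build one conversational History entry in the shape shown above."""
    if role not in ("user", "assistant"):
        raise ValueError("role must be 'user' or 'assistant'")
    return {"type": "History", "role": role, "content": content}

def function_call_history(call_id, name, arguments, response, client_side=True):
    """Build one function-call History entry; arguments become a JSON string."""
    return {
        "type": "History",
        "function_calls": [{
            "id": call_id,
            "name": name,
            "client_side": client_side,
            "arguments": json.dumps(arguments),
            "response": response,
        }],
    }

def build_settings(messages, keep_history=True):
    """Assemble a partial Settings message carrying prior history.

    Assumptions: entries sit in a `messages` list under agent.context,
    and the history flag sits under a top-level `flags` object.
    """
    return {
        "type": "Settings",
        "flags": {"history": keep_history},
        "agent": {"context": {"messages": list(messages)}},
    }

# Hypothetical prior conversation, for illustration only.
settings = build_settings([
    history_message("user", "What's the weather in Madrid?"),
    function_call_history("fc_1", "get_weather", {"city": "Madrid"}, "Sunny, 31 C"),
    history_message("assistant", "It's sunny and 31 degrees in Madrid."),
])
print(json.dumps(settings, indent=2))
```

Passing `keep_history=False` to `build_settings` sets the flag to false, which corresponds to disabling function call history as described above.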

Documentation

🔍 Enhanced Error Visibility

We’ve improved client-side visibility of LLM and TTS errors to make debugging easier and improve the user experience.

Deepgram Self-Hosted July 2025 Release (250710)

Container Images (release 250710)

  • quay.io/deepgram/self-hosted-api:release-250710

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-api:1.151.8
      • quay.io/deepgram/onprem-api:release-250710
      • quay.io/deepgram/onprem-api:1.151.8
  • quay.io/deepgram/self-hosted-engine:release-250710

    • Equivalent image to:

      • quay.io/deepgram/self-hosted-engine:3.91.0
      • quay.io/deepgram/onprem-engine:release-250710
      • quay.io/deepgram/onprem-engine:3.91.0
    • Minimum required NVIDIA driver version: >=550.163.01

  • quay.io/deepgram/self-hosted-license-proxy:release-250710

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-license-proxy:1.8.0
      • quay.io/deepgram/onprem-license-proxy:release-250710
      • quay.io/deepgram/onprem-license-proxy:1.8.0
  • quay.io/deepgram/self-hosted-billing:release-250710

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-billing:1.11.2
      • quay.io/deepgram/onprem-billing:release-250710
      • quay.io/deepgram/onprem-billing:1.11.2

This Release Contains The Following Changes

  • Adds more verbose logging in Voice Agent for failures in TTS, LLM, and function-calling.
  • Improves redaction accuracy around entities and punctuation.
  • Keeps our software up-to-date.

Reminder: The Deepgram image repositories have been updated to reflect our “self-hosted” naming. Images should now be pulled from the self-hosted-* Quay repositories. Through August 2025, both onprem-* and self-hosted-* image repositories will receive identical image updates monthly, and we will announce image tags in the self-hosted repositories. Starting in September 2025, we will only publish new images to self-hosted-* repos, deprecating onprem-* repository variants.
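
If your deployment manifests still reference the onprem-* repositories, a small script can rewrite them ahead of the deprecation. The following Python sketch is one possible approach, not an official migration tool. The function name is hypothetical, and it assumes image references follow the `quay.io/deepgram/onprem-<component>[:tag]` pattern shown in the lists above.

```python
import re

# Matches quay.io/deepgram/onprem-<component> with an optional :tag suffix.
_ONPREM = re.compile(r"^(quay\.io/deepgram/)onprem-([a-z-]+)(:.+)?$")

def to_self_hosted(image_ref):
    """Return the self-hosted-* equivalent of an onprem-* image reference.

    References that are not Deepgram onprem-* images are returned unchanged.
    """
    match = _ONPREM.match(image_ref)
    if not match:
        return image_ref
    prefix, component, tag = match.groups()
    return f"{prefix}self-hosted-{component}{tag or ''}"

for ref in (
    "quay.io/deepgram/onprem-api:release-250710",
    "quay.io/deepgram/onprem-license-proxy:1.8.0",
    "quay.io/deepgram/self-hosted-engine:release-250710",
):
    print(to_self_hosted(ref))
```

The third reference already uses the self-hosted naming, so it passes through unchanged.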

Container Images (release 250626)

  • quay.io/deepgram/self-hosted-api:release-250626

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-api:1.150.2
      • quay.io/deepgram/onprem-api:release-250626
      • quay.io/deepgram/onprem-api:1.150.2
  • quay.io/deepgram/self-hosted-engine:release-250626

    • Equivalent image to:

      • quay.io/deepgram/self-hosted-engine:3.91.0
      • quay.io/deepgram/onprem-engine:release-250626
      • quay.io/deepgram/onprem-engine:3.91.0
    • Minimum required NVIDIA driver version: >=550.163.01

  • quay.io/deepgram/self-hosted-license-proxy:release-250626

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-license-proxy:1.8.0
      • quay.io/deepgram/onprem-license-proxy:release-250626
      • quay.io/deepgram/onprem-license-proxy:1.8.0
  • quay.io/deepgram/self-hosted-billing:release-250626

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-billing:1.11.2
      • quay.io/deepgram/onprem-billing:release-250626
      • quay.io/deepgram/onprem-billing:1.11.2

This Release Contains The Following Changes

  • Improves smart formatting for emails, alphanumerics, quantities, and percentages.
  • Expands language support for profanity filtering in German, Swiss German, Polish, Portuguese, Spanish, and Swedish.
  • Resolves an edge case when handling certain corrupt audio and now returns an HTTP 400 error code.
  • Keeps our software up-to-date.

Reminder: The Deepgram image repositories have been updated to reflect our “self-hosted” naming. Images should now be pulled from the self-hosted-* Quay repositories. For the next three months, both onprem-* and self-hosted-* image repositories will receive identical image updates monthly, and we will announce image tags in the self-hosted repositories. Subsequently, we will only publish new images to self-hosted-* repos, deprecating onprem-* repository variants.

Deepgram is excited to announce Aura‑2 (text to speech) Spanish voices, empowering real-time voice applications with a new, high-fidelity Spanish option for enterprise use.

Voice Quality & Features

  • Launching with 10 distinct Spanish voices, each tuned for specific business contexts.
  • Spanish voices are optimized for pacing, intonation, and emphasis suitable for professional interactions—from customer support to healthcare use cases.
  • Superior pronunciation accuracy for domain-specific content:

    • Currency (e.g., “€”, “pesos”)
    • Dates/timestamps in various formats
    • Acronyms and alphanumeric IDs
    • Email addresses, passwords, and URLs
    • Spanish-language proper nouns

Availability

  • Aura-2 Spanish (es) is available now via REST and WebSocket APIs
  • Currently available through our hosted offering, with self-hosted support coming soon

For detailed information, please refer to our Developer Documentation.
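
To give a sense of what a REST call for a Spanish voice might look like, here is a minimal Python sketch that builds the request without sending it. It assumes the text-to-speech endpoint is `https://api.deepgram.com/v1/speak`, that the voice is chosen with a `model` query parameter, and that the key is sent with the `Token` authorization scheme. The voice name `aura-2-EXAMPLE-es` is a placeholder, not a real voice; take actual voice names from the Developer Documentation.

```python
import json
import urllib.parse
import urllib.request

# Assumed REST endpoint for text-to-speech.
SPEAK_URL = "https://api.deepgram.com/v1/speak"

def build_speak_request(text, model, api_key):
    """Build (but do not send) a POST request for text-to-speech."""
    query = urllib.parse.urlencode({"model": model})
    return urllib.request.Request(
        url=f"{SPEAK_URL}?{query}",
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# "aura-2-EXAMPLE-es" is a placeholder voice name used for illustration only.
request = build_speak_request(
    "Su pedido llegará el 3 de julio.", "aura-2-EXAMPLE-es", "YOUR_API_KEY"
)
print(request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` would return the audio bytes in the response body, given a valid key and voice name.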

Profanity Filtering Gets Expanded Language Support

Profanity filtering now supports 6 additional languages beyond English, giving you content moderation capabilities across your global user base. Available on monolingual models for:

Newly Supported:

  • German (de)
  • Swiss German (de-CH)
  • Polish (pl)
  • Portuguese (pt, pt-BR, pt-PT)
  • Spanish (es, es-419)
  • Swedish (sv, sv-SE)

Existing English Support:

  • en, en-US, en-AU, en-GB, en-NZ, en-IN

This expansion lets you deploy consistent content policies across international markets without building custom filtering logic.
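
As a sketch of how a client might guard against requesting the filter for an unsupported language, the Python below validates the language code against the lists above and then builds a transcription URL. It assumes the filter is enabled with a `profanity_filter=true` query parameter on the `/v1/listen` endpoint. The function name is hypothetical, and the `model` value in the usage line is only an example of a monolingual model.

```python
from urllib.parse import urlencode

# Language codes listed above for profanity filtering on monolingual models.
PROFANITY_FILTER_LANGUAGES = {
    "de", "de-CH", "pl", "pt", "pt-BR", "pt-PT", "es", "es-419", "sv", "sv-SE",
    "en", "en-US", "en-AU", "en-GB", "en-NZ", "en-IN",
}

def profanity_filter_url(language, model):
    """Build a transcription URL with profanity filtering enabled.

    Raises ValueError if the language is not in the supported list above.
    """
    if language not in PROFANITY_FILTER_LANGUAGES:
        raise ValueError(f"profanity filtering is not listed for {language!r}")
    query = urlencode({
        "model": model,
        "language": language,
        "profanity_filter": "true",
    })
    return f"https://api.deepgram.com/v1/listen?{query}"

# The model name here is an example value, not a recommendation.
print(profanity_filter_url("es-419", "nova-2"))
```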

Smart Formatting Improvements

We’ve resolved several high-impact formatting edge cases that were causing transcription accuracy issues in production environments:

Improved Entity Formatting via Smart Format

Email Transcription Improvements

  • Fixed: 'o' characters in email addresses now transcribe correctly instead of converting to '0'
  • Fixed: edge case email mentions that were being dropped entirely in specific batch processing scenarios

Certain formerly numeric-only sequences have been updated to correctly preserve all alphanumeric characters:

  • Before (some entities): "my account number is a b c d zero nine" → "my account number is 09"
  • After (some entities): "my account number is a b c d zero nine" → "my account number is ABCD09"

Quantity modifiers (‘single’, ‘double’, ‘triple’ + standalone character or number) are better handled via Smart Format:

  • Before (some entities): "double 2" → "2"
  • After (some entities): "double 2" → "22"

Special cases of ‘hundred’ or ‘a hundred’ now supported via Smart Format:

  • Before (some entities): "hundred percent" → "%"
  • After (some entities): "hundred percent" → "100%"

This update has been rolled out to all hosted streaming transcription and will be included in our next self-hosted release later this month.

Container Images (release 250610)

  • quay.io/deepgram/self-hosted-api:release-250610

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-api:1.146.1
      • quay.io/deepgram/onprem-api:release-250610
      • quay.io/deepgram/onprem-api:1.146.1
  • quay.io/deepgram/self-hosted-engine:release-250610

    • Equivalent image to:

      • quay.io/deepgram/self-hosted-engine:3.89.2
      • quay.io/deepgram/onprem-engine:release-250610
      • quay.io/deepgram/onprem-engine:3.89.2
    • Minimum required NVIDIA driver version: >=550.163.01

  • quay.io/deepgram/self-hosted-license-proxy:release-250610

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-license-proxy:1.8.0
      • quay.io/deepgram/onprem-license-proxy:release-250610
      • quay.io/deepgram/onprem-license-proxy:1.8.0
  • quay.io/deepgram/self-hosted-billing:release-250610

    • Equivalent image to:
      • quay.io/deepgram/self-hosted-billing:1.11.2
      • quay.io/deepgram/onprem-billing:release-250610
      • quay.io/deepgram/onprem-billing:1.11.2

This Release Contains The Following Changes

  • Adds full support for Voice Agent v1 API.
  • Addresses an issue with the engine_active_requests metric for streaming STT auto-scaling.
  • Resolves an issue with year formatting for smart_format.
  • Keeps our software up-to-date.

Reminder: The Deepgram image repositories have been updated to reflect our “self-hosted” naming. Images should now be pulled from the self-hosted-* Quay repositories. For the next four months, both onprem-* and self-hosted-* image repositories will receive identical image updates monthly, and we will announce image tags in the self-hosted repositories. Subsequently, we will only publish new images to self-hosted-* repos, deprecating onprem-* repository variants.

We’ve just released a significant upgrade to Nova-3 Medical Streaming, bringing substantial improvements in accuracy for real-time medical transcription use cases. This update focuses specifically on our streaming model, delivering better performance across key transcription metrics.

Performance Improvements

  • 11% relative reduction in Overall WER compared to Nova-3 general streaming model
  • 30% relative reduction in Overall WER compared to Nova-2 Medical streaming model
  • 2.7x improvement in Keyword Recall Rate (KRR) compared to Nova-3 general streaming model
  • Maintains industry-leading inference speed with ultra-low latency for real-time healthcare applications
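
To make "relative reduction" concrete: an 11% relative reduction means the new WER is 89% of the comparison model's WER, not 11 percentage points lower. The short Python sketch below works through the arithmetic. The 10.0% baseline is a hypothetical figure chosen for easy numbers and is not a measured WER for any model. Note also that the two reductions above are measured against different comparison models.

```python
def wer_after_relative_reduction(baseline_wer, relative_reduction):
    """Return the WER that results from a relative reduction of the baseline."""
    if not 0.0 <= relative_reduction <= 1.0:
        raise ValueError("relative_reduction must be between 0 and 1")
    return baseline_wer * (1.0 - relative_reduction)

# Hypothetical baseline of 10.0% WER, used only to illustrate the arithmetic.
baseline = 10.0
print(round(wer_after_relative_reduction(baseline, 0.11), 2))  # 8.9
print(round(wer_after_relative_reduction(baseline, 0.30), 2))  # 7.0
```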

Availability

The updated Nova-3 Medical streaming model is now available through our API. To access:

  • Use model=nova-3-medical in your streaming API calls
  • Available for hosted use
  • Self-hosted deployments will be available in subsequent releases
  • English only
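
A minimal sketch of selecting the model for streaming is shown below. It only builds the connection URL and assumes the streaming endpoint is `wss://api.deepgram.com/v1/listen` and that the language is passed as a `language` query parameter. The function name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed WebSocket endpoint for streaming transcription.
STREAMING_URL = "wss://api.deepgram.com/v1/listen"

def medical_streaming_url(**options):
    """Return a streaming URL that selects the Nova-3 Medical model.

    Extra keyword arguments are appended as query parameters.
    """
    params = {"model": "nova-3-medical", "language": "en"}
    params.update(options)
    return f"{STREAMING_URL}?{urlencode(params)}"

print(medical_streaming_url())
```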

For details on the original Nova-3 Medical release (including batch capabilities), check out the original changelog: Introducing Nova-3 Medical. For detailed information about Nova-3 Medical, please refer to our Developer Documentation.

Deepgram is proud to announce the release of Aura-2, our text-to-speech model purpose-built for real-time enterprise use cases.

Performance

  • Sub-200ms time-to-first-byte (TTFB) latency for real-time conversational interactions
  • 0.111x Real-Time Factor (RTF), synthesizing one second of audio in just over 100 milliseconds
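
The Real-Time Factor is the ratio of processing time to audio duration, so the compute time for a clip can be estimated by multiplying its length by the RTF. The Python sketch below shows the arithmetic. It assumes the RTF stays constant across clip lengths, which is a simplification, and it does not model time-to-first-byte.

```python
def synthesis_time_ms(audio_seconds, rtf=0.111):
    """Estimate compute time in milliseconds to synthesize the given audio length."""
    return audio_seconds * rtf * 1000.0

print(round(synthesis_time_ms(1.0)))   # 111
print(round(synthesis_time_ms(30.0)))  # 3330
```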

Voice Quality & Features

  • Enterprise-optimized voice catalog with 40+ distinct voices, each designed for specific business contexts

  • Tuned for professional and transactional interactions with appropriate tone, pacing, and emphasis

  • Superior pronunciation accuracy for domain-specific content:

    • Currency and numerals
    • Dates and timestamps in varied formats
    • Email addresses, passwords, and URLs
    • Complex addresses and location references
  • Industry-leading voice clarity rated higher than competitors in customer service scenarios

Availability

  • Aura-2 is available now via REST and WebSocket APIs
  • Currently available for use through our hosted offering

For detailed information about Aura-2, please refer to our Developer Documentation.