Changelog

March 16, 2026

🤖 New LLM Models Support & Bug Fixes

We’ve added support for new LLM models in the Voice Agent API:

OpenAI GPT-5.3 Instant (gpt-5.3-chat-latest)
OpenAI GPT-5.4 (gpt-5.4)
Google Gemini 3.1 Flash Lite (gemini-3.1-flash-lite)

Example:

{
  "type": "Settings",
  "agent": {
    "think": {
      "provider": {
        "type": "open_ai",
        "model": "gpt-5.3-chat-latest"
      }
    }
  }
}

For the full list of supported models and pricing tiers, visit our Voice Agent LLM Models documentation.

Fixes

Resolves an issue where the GPT-5.2 Instant model used an incorrect model ID and pricing tier. The model now uses the correct ID (gpt-5.2-chat-latest) and is assigned to the Advanced tier.

March 10, 2026

Nova-3 Model Update

🎯 Nova-3 Swedish and Dutch Model Enhancements

We’ve released updated Nova-3 Swedish and Nova-3 Dutch models, offering improved accuracy for both streaming and batch transcription.

Access these models by setting model: "nova-3" and the relevant language code:

Swedish (sv, sv-SE)
Dutch (nl)
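
As a rough sketch, a request URL that selects Nova-3 with one of these language codes could be assembled as shown here. The endpoint URL and the build_listen_url helper are assumptions made for illustration and do not come from this changelog; only the model and language values are taken from the entry above.

```python
from urllib.parse import urlencode

# Assumed transcription endpoint; confirm it against the API reference.
LISTEN_URL = "https://api.deepgram.com/v1/listen"

# Language codes listed in this changelog entry.
UPDATED_NOVA3_LANGUAGES = {"sv", "sv-SE", "nl"}


def build_listen_url(language):
    """Return a request URL selecting Nova-3 with the given language code."""
    if language not in UPDATED_NOVA3_LANGUAGES:
        raise ValueError(f"language not covered by this update: {language!r}")
    query = urlencode({"model": "nova-3", "language": language})
    return f"{LISTEN_URL}?{query}"


print(build_listen_url("sv-SE"))
# https://api.deepgram.com/v1/listen?model=nova-3&language=sv-SE
```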

Learn more about Nova-3 on the Models and Language Overview page.

March 9, 2026

Reasoning mode for OpenAI thinking models

You can now control the reasoning effort of supported OpenAI reasoning models using the new reasoning_mode parameter in the think provider configuration. This parameter maps to OpenAI’s reasoning_effort and accepts low, medium, or high.

Example:

{
  "type": "Settings",
  "agent": {
    "think": {
      "provider": {
        "type": "open_ai",
        "model": "gpt-5",
        "reasoning_mode": "medium"
      }
    }
  }
}
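
If you build the Settings message in code, it may help to validate reasoning_mode before sending it. The sketch below is one way to do that in Python. The build_settings helper is hypothetical, and a real Settings message will likely carry other sections that are omitted here.

```python
import json

# Values this changelog entry lists for reasoning_mode.
REASONING_MODES = ("low", "medium", "high")


def build_settings(model, reasoning_mode=None):
    """Return a Settings message as JSON, adding reasoning_mode only when given."""
    provider = {"type": "open_ai", "model": model}
    if reasoning_mode is not None:
        if reasoning_mode not in REASONING_MODES:
            raise ValueError(f"reasoning_mode must be one of {REASONING_MODES}")
        provider["reasoning_mode"] = reasoning_mode
    return json.dumps(
        {"type": "Settings", "agent": {"think": {"provider": provider}}}
    )


print(build_settings("gpt-5", "medium"))
```

Rejecting unknown values on the client keeps a typo such as "meduim" from reaching the server at all.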

For more details, visit the Configure the Voice Agent documentation.

March 5, 2026

Model Improvement Program pricing update

Pay as you Go and Growth customers can now opt in or out of the Model Improvement Program with no impact on the rates listed on deepgram.com/pricing.

March 5, 2026

Deepgram Self-Hosted March 2026 Release (260305)

We are aware of an issue with Flux in this release. Do not use this release for Flux deployments.

Container Images (release 260305)

quay.io/deepgram/self-hosted-api:release-260305
- Equivalent image to:
  - quay.io/deepgram/self-hosted-api:1.179.5
quay.io/deepgram/self-hosted-engine:release-260305
- Equivalent image to:
  - quay.io/deepgram/self-hosted-engine:3.113.2
- Minimum required NVIDIA driver version: >=570.172.08
quay.io/deepgram/self-hosted-license-proxy:release-260305
- Equivalent image to:
  - quay.io/deepgram/self-hosted-license-proxy:1.10.1
quay.io/deepgram/self-hosted-billing:release-260305
- Equivalent image to:
  - quay.io/deepgram/self-hosted-billing:1.12.1
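
Before pulling the engine image, you may want to check a host's NVIDIA driver against the minimum listed above. The following is a minimal sketch with a hypothetical driver_is_supported helper. It assumes the driver version is a dotted numeric string, for example as printed by nvidia-smi --query-gpu=driver_version --format=csv,noheader.

```python
# Minimum driver version stated for the engine image in this release.
MIN_DRIVER = "570.172.08"


def parse_version(version):
    """Turn a dotted version string such as '570.172.08' into (570, 172, 8)."""
    return tuple(int(part) for part in version.strip().split("."))


def driver_is_supported(installed, minimum=MIN_DRIVER):
    """Return True when the installed driver is at or above the minimum."""
    return parse_version(installed) >= parse_version(minimum)


print(driver_is_supported("570.172.08"))  # True
print(driver_is_supported("565.57.01"))   # False
```

Comparing integer tuples avoids the pitfall of plain string comparison, where "570.86.15" would sort after "570.172.08".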

This Release Contains The Following Changes

Nova-3 Right-to-Left Language Support — Nova-3 now supports Arabic, Hebrew, Farsi, and Urdu. See the full announcement for details.
Nova-3 Multilingual Model Update — Accuracy improvements across all supported languages, with the largest gains in code-switching scenarios. See the full announcement for details.
Abbreviated Dates in Smart Formatting — Smart formatting now recognizes and formats abbreviated dates.
General Improvements — Keeps our software up-to-date.

February 27, 2026

Flux: Configure Control Message for Dynamic Mid-Stream Configuration

Flux now supports dynamic configuration updates mid-stream with the new Configure control message, enabling voice agents to adapt speech recognition behavior as conversations evolve—without disconnecting and reconnecting.

🎯 Dynamic Configuration Updates

Real conversations aren’t static. They shift from casual confirmation to strict authentication, from general discussion to domain-specific troubleshooting. The Configure control message lets you inject the right ASR context at the right moment in the conversation.

Update these parameters mid-stream:

Keyterms - Inject task-critical vocabulary as conversation context changes (names, medications, product terminology)
Turn detection thresholds (eot_threshold, eager_eot_threshold) - Tighten or relax turn detection confidence requirements
Timeout behavior (eot_timeout_ms) - Adjust silence tolerance for different conversation phases

Why This Matters

Context injection for speech recognition. Add a customer’s name to keyterms right before asking for it. Swap to medication terminology when moving to pharmacy discussion. Load product names when handling specific inquiries. You’re no longer stuck with a generic keyterm list that’s “good enough” for the entire call.

Adaptive turn detection. When collecting passwords, OTPs, or account numbers, increase eot_timeout_ms and eot_threshold to prevent premature cutoffs. Relax them when you’re back to natural conversation.

Simplified engineering. No more reconnecting mid-call (dropping audio, managing state transitions) or managing multiple concurrent streams. One connection, dynamic behavior.

Implementation

Send a Configure control message over your Flux WebSocket connection:

{
  "type": "Configure",
  "thresholds": {
    "eot_threshold": 0.8,
    "eot_timeout_ms": 5000
  },
  "keyterms": ["product_name", "feature_name"]
}

You’ll receive a ConfigureSuccess response confirming the update. Changes take effect immediately, in audio stream order.
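
One way to organize this in application code is to keep a small table of per-phase settings and build the Configure message from it. The sketch below follows the message shape from the example above. The phase names and helper functions are hypothetical, the "casual" values are placeholders and not recommendations, and it assumes ConfigureSuccess arrives as a JSON message with a "type" field.

```python
import json

# Illustrative per-phase thresholds. "strict" mirrors the example above;
# "casual" uses placeholder values chosen only for this sketch.
PHASE_THRESHOLDS = {
    "strict": {"eot_threshold": 0.8, "eot_timeout_ms": 5000},
    "casual": {"eot_threshold": 0.7, "eot_timeout_ms": 3000},
}


def configure_for_phase(phase, keyterms):
    """Return a Configure message (as JSON) for the given conversation phase."""
    return json.dumps(
        {
            "type": "Configure",
            "thresholds": PHASE_THRESHOLDS[phase],
            "keyterms": list(keyterms),
        }
    )


def is_configure_success(raw_message):
    """Return True if a received JSON message reports ConfigureSuccess."""
    return json.loads(raw_message).get("type") == "ConfigureSuccess"


print(configure_for_phase("strict", ["account number"]))
```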

Availability: Configure is currently available via direct WebSocket connections only. SDK and self-hosted support are coming soon.

Important: Keyterms are replaced (not merged) when you send a Configure message. Include both existing and new terms if you want to add to the list.
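
Because keyterms are replaced, a client that wants to add terms needs to track the current list itself. A minimal sketch of that bookkeeping follows. The helper names are hypothetical, and it assumes a Configure message may carry keyterms without thresholds.

```python
def merged_keyterms(current, additions):
    """Return current terms followed by new ones, de-duplicated, order preserved."""
    return list(dict.fromkeys([*current, *additions]))


def build_add_keyterms_message(current, additions):
    """Build a Configure message that resends existing terms plus the new ones."""
    return {"type": "Configure", "keyterms": merged_keyterms(current, additions)}


current_terms = ["product_name", "feature_name"]
message = build_add_keyterms_message(current_terms, ["feature_name", "customer_name"])
print(message["keyterms"])
# ['product_name', 'feature_name', 'customer_name']
```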

For comprehensive documentation, examples in multiple languages, and detailed behavior specifications, see the Configure Control Message documentation.

For the full API specification, see the Flux API Reference.

February 25, 2026

UpdateThink: Replace Think Provider Mid-Conversation

You can now dynamically replace the entire Think provider configuration during a live Voice Agent conversation using the new UpdateThink client message. This allows you to switch to a different LLM provider, change the model, set a completely new prompt, and reconfigure functions — all in a single message.

Unlike UpdatePrompt, which adds to the existing prompt, UpdateThink replaces the full Think provider configuration, giving you complete control over the agent’s reasoning setup mid-conversation.

Example:

{
    "type": "UpdateThink",
    "think": {
        "provider": {
            "type": "open_ai",
            "model": "gpt-4o-mini"
        },
        "prompt": "You are a helpful voice assistant."
    }
}

The server confirms the change with a ThinkUpdated response message.
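
On the client side, you might treat the new configuration as pending until ThinkUpdated arrives. The sketch below shows that pattern with a hypothetical ThinkSwitch class. It assumes ThinkUpdated is delivered as a JSON message with a "type" field, and the starting model is only a placeholder.

```python
import json


class ThinkSwitch:
    """Tracks which think configuration the client believes is active."""

    def __init__(self, active):
        self.active = active
        self.pending = None

    def request(self, think):
        """Remember the requested config and return the UpdateThink message to send."""
        self.pending = think
        return json.dumps({"type": "UpdateThink", "think": think})

    def on_message(self, raw_message):
        """Promote the pending config once the server reports ThinkUpdated."""
        message = json.loads(raw_message)
        if message.get("type") == "ThinkUpdated" and self.pending is not None:
            self.active, self.pending = self.pending, None


switch = ThinkSwitch({"provider": {"type": "open_ai", "model": "gpt-5"}})
outgoing = switch.request(
    {
        "provider": {"type": "open_ai", "model": "gpt-4o-mini"},
        "prompt": "You are a helpful voice assistant.",
    }
)
switch.on_message('{"type": "ThinkUpdated"}')
print(switch.active["provider"]["model"])  # gpt-4o-mini
```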

For more details, visit the UpdateThink documentation.

February 12, 2026

Deepgram Self-Hosted February 2026 Release (260212)

Container Images (release 260212)

quay.io/deepgram/self-hosted-api:release-260212
- Equivalent image to:
  - quay.io/deepgram/self-hosted-api:1.177.3
quay.io/deepgram/self-hosted-engine:release-260212
- Equivalent image to:
  - quay.io/deepgram/self-hosted-engine:3.107.0-1
- Minimum required NVIDIA driver version: >=570.172.08
quay.io/deepgram/self-hosted-license-proxy:release-260212
- Equivalent image to:
  - quay.io/deepgram/self-hosted-license-proxy:1.9.2
quay.io/deepgram/self-hosted-billing:release-260212
- Equivalent image to:
  - quay.io/deepgram/self-hosted-billing:1.12.1

This Release Contains The Following Changes

General Improvements — Keeps our software up-to-date.

February 11, 2026

New Default Concurrency Limits

We’re increasing default concurrency limits by up to 3X for Streaming Speech to Text, Text to Speech, and Voice Agent for Pay as you Go, Growth, and Enterprise plans.

For full details on the rate limits for your plan, see the API Rate Limits documentation.

February 6, 2026

🤖 New OpenAI & Gemini LLM Models Support

We’ve added support for new LLM models in our Voice Agent API!

Available Models:

OpenAI GPT 5.2 Instant (gpt-5.2-instant)
OpenAI GPT 5.2 Thinking (gpt-5.2)
Google Gemini 3 Flash (gemini-3-flash-preview)

Implementation: Configure your chosen model in your Voice Agent settings:

{
  "type": "Settings",
  "agent": {
    "think": {
      "provider": {
        "type": "open_ai",
        "model": "gpt-5.2-instant"
      }
    }
  },
  ... # other config
}

For complete information about supported LLMs including the new models, visit our Voice Agent LLM Models documentation.