Nova-3 Model Update
🎯 Nova-3 Swedish and Dutch Model Enhancements
We’ve released updated Nova-3 Swedish and Nova-3 Dutch models, offering improved accuracy for both streaming and batch transcription.
Access these models by setting model: "nova-3" and the relevant language code:
- Swedish (`sv`, `sv-SE`)
- Dutch (`nl`)
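As a minimal sketch of the request above, here is how the `model` and `language` query parameters could be combined for a batch transcription call. The endpoint URL is an assumption based on Deepgram's public REST API; adjust it for your deployment.

```python
# Build the query string for a Nova-3 Swedish batch transcription request.
# BASE_URL is an assumption; only "model" and "language" come from the text above.
from urllib.parse import urlencode

BASE_URL = "https://api.deepgram.com/v1/listen"  # assumed endpoint

params = {"model": "nova-3", "language": "sv"}  # or "sv-SE" / "nl"
request_url = f"{BASE_URL}?{urlencode(params)}"
print(request_url)
```

The same pattern applies to streaming: keep `model` fixed at `nova-3` and vary only the language code.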
Learn more about Nova-3 on the Models and Language Overview page.
Reasoning mode for OpenAI thinking models
You can now control the reasoning effort of supported OpenAI reasoning models using the new reasoning_mode parameter in the think provider configuration. This parameter maps to OpenAI’s reasoning_effort and accepts low, medium, or high.
Example:
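A sketch of what the think provider configuration could look like with the new parameter. The `reasoning_mode` name and its `low`/`medium`/`high` values come from the announcement above; the exact nesting of the provider block is an assumption, so confirm it against the Voice Agent docs.

```python
import json

# Hypothetical "think" provider fragment enabling high reasoning effort.
# "reasoning_mode" maps to OpenAI's reasoning_effort.
think_config = {
    "provider": {
        "type": "open_ai",
        "model": "gpt-5.2",        # OpenAI reasoning model listed in this changelog
        "reasoning_mode": "high",  # one of: "low", "medium", "high"
    }
}
print(json.dumps(think_config))
```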
For more details, visit the Configure the Voice Agent documentation.
Model Improvement Program pricing update
Pay as you Go and Growth customers can now opt in or out of the Model Improvement Program with no impact on the rates listed on deepgram.com/pricing.
Deepgram Self-Hosted March 2026 Release (260305)
We are aware of an issue with Flux in this release. Do not use this release for Flux deployments.
Container Images (release 260305)
- `quay.io/deepgram/self-hosted-api:release-260305`
  - Equivalent image to: `quay.io/deepgram/self-hosted-api:1.179.5`
- `quay.io/deepgram/self-hosted-engine:release-260305`
  - Equivalent image to: `quay.io/deepgram/self-hosted-engine:3.113.2`
  - Minimum required NVIDIA driver version: `>=570.172.08`
- `quay.io/deepgram/self-hosted-license-proxy:release-260305`
  - Equivalent image to: `quay.io/deepgram/self-hosted-license-proxy:1.10.1`
- `quay.io/deepgram/self-hosted-billing:release-260305`
  - Equivalent image to: `quay.io/deepgram/self-hosted-billing:1.12.1`
This Release Contains The Following Changes
- Nova-3 Right-to-Left Language Support — Nova-3 now supports Arabic, Hebrew, Farsi, and Urdu. See the full announcement for details.
- Nova-3 Multilingual Model Update — Accuracy improvements across all supported languages, with the largest gains in code-switching scenarios. See the full announcement for details.
- Abbreviated Dates in Smart Formatting — Smart formatting now recognizes and formats abbreviated dates.
- General Improvements — Keeps our software up-to-date.
Flux: Configure Control Message for Dynamic Mid-Stream Configuration
Flux now supports dynamic configuration updates mid-stream with the new Configure control message, enabling voice agents to adapt speech recognition behavior as conversations evolve—without disconnecting and reconnecting.
🎯 Dynamic Configuration Updates
Real conversations aren’t static. They shift from casual confirmation to strict authentication, from general discussion to domain-specific troubleshooting. The Configure control message lets you inject the right ASR context at the right moment in the conversation.
Update these parameters mid-stream:
- Keyterms - Inject task-critical vocabulary as conversation context changes (names, medications, product terminology)
- Turn detection thresholds (`eot_threshold`, `eager_eot_threshold`) - Tighten or relax turn detection confidence requirements
- Timeout behavior (`eot_timeout_ms`) - Adjust silence tolerance for different conversation phases
Why This Matters
Context injection for speech recognition. Add a customer’s name to keyterms right before asking for it. Swap to medication terminology when moving to pharmacy discussion. Load product names when handling specific inquiries. You’re no longer stuck with a generic keyterm list that’s “good enough” for the entire call.
Adaptive turn detection. When collecting passwords, OTPs, or account numbers, increase eot_timeout_ms and eot_threshold to prevent premature cutoffs. Relax them when you’re back to natural conversation.
Simplified engineering. No more reconnecting mid-call (dropping audio, managing state transitions) or managing multiple concurrent streams. One connection, dynamic behavior.
Implementation
Send a Configure control message over your Flux WebSocket connection:
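As a sketch of the message described above: the `Configure` type and the parameter names come from this announcement, but the exact field layout is an assumption, so check the Configure Control Message documentation before relying on it.

```python
import json

# Hypothetical Flux Configure control message tightening turn detection
# and swapping in task-critical vocabulary mid-stream.
configure_msg = {
    "type": "Configure",
    "keyterms": ["amoxicillin", "lisinopril"],  # replaces the current list
    "eot_threshold": 0.9,     # stricter end-of-turn confidence
    "eot_timeout_ms": 7000,   # more silence tolerance, e.g. for OTP entry
}
# ws.send(json.dumps(configure_msg))  # on an open Flux WebSocket connection
print(json.dumps(configure_msg))
```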
You’ll receive a ConfigureSuccess response confirming the update, and changes take effect immediately, in order relative to the audio stream.
Availability: Configure is currently available via direct WebSocket connections only. SDK and self-hosted support are coming soon.
Important: Keyterms are replaced (not merged) when you send a Configure message. Include both existing and new terms if you want to add to the list.
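Because keyterms are replaced wholesale, appending a term means resending the full list. A small client-side sketch of that pattern, assuming the hypothetical message shape used here (only the `Configure` type and the replace semantics come from the note above):

```python
import json

# Keep the current keyterm list client-side; to "add" a term, resend
# existing terms plus the new ones in a single Configure message.
current_keyterms = ["order number", "tracking ID"]

def add_keyterms(existing, new_terms):
    """Return a Configure payload that preserves existing terms."""
    merged = list(dict.fromkeys(existing + new_terms))  # dedupe, keep order
    return {"type": "Configure", "keyterms": merged}

msg = add_keyterms(current_keyterms, ["Jane Doe"])
print(json.dumps(msg))
```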
For comprehensive documentation, examples in multiple languages, and detailed behavior specifications, see the Configure Control Message documentation.
For the full API specification, see the Flux API Reference.
UpdateThink: Replace Think Provider Mid-Conversation
You can now dynamically replace the entire Think provider configuration during a live Voice Agent conversation using the new UpdateThink client message. This allows you to switch to a different LLM provider, change the model, set a completely new prompt, and reconfigure functions — all in a single message.
Unlike UpdatePrompt, which adds to the existing prompt, UpdateThink replaces the full Think provider configuration, giving you complete control over the agent’s reasoning setup mid-conversation.
Example:
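A sketch of an UpdateThink message that swaps the Think provider mid-conversation. The `UpdateThink` message type and the replace-the-whole-configuration behavior come from the text above; the field layout mirrors a typical Settings `think` block and is an assumption.

```python
import json

# Hypothetical UpdateThink client message: new provider, model,
# prompt, and function set in a single message.
update_think = {
    "type": "UpdateThink",
    "think": {
        "provider": {"type": "google", "model": "gemini-3-flash-preview"},
        "prompt": "You are now a billing specialist. Be concise.",
        "functions": [],  # reconfigure or clear tool definitions
    },
}
print(json.dumps(update_think))
```

The server then confirms with ThinkUpdated, as noted below.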
The server confirms the change with a ThinkUpdated response message.
For more details, visit the UpdateThink documentation.
Deepgram Self-Hosted February 2026 Release (260212)
Container Images (release 260212)
- `quay.io/deepgram/self-hosted-api:release-260212`
  - Equivalent image to: `quay.io/deepgram/self-hosted-api:1.177.3`
- `quay.io/deepgram/self-hosted-engine:release-260212`
  - Equivalent image to: `quay.io/deepgram/self-hosted-engine:3.107.0-1`
  - Minimum required NVIDIA driver version: `>=570.172.08`
- `quay.io/deepgram/self-hosted-license-proxy:release-260212`
  - Equivalent image to: `quay.io/deepgram/self-hosted-license-proxy:1.9.2`
- `quay.io/deepgram/self-hosted-billing:release-260212`
  - Equivalent image to: `quay.io/deepgram/self-hosted-billing:1.12.1`
This Release Contains The Following Changes
- General Improvements — Keeps our software up-to-date.
New Default Concurrency Limits
We’re increasing default concurrency limits by up to 3X for Streaming Speech to Text, Text to Speech, and Voice Agent for Pay as you Go, Growth, and Enterprise plans.
For full details on the rate limits for your plan, see the API Rate Limits documentation.
🤖 New OpenAI & Gemini LLM Models Support
We’ve added support for new LLM models in our Voice Agent API!
Available Models:
- OpenAI GPT 5.2 Instant (gpt-5.2-instant)
- OpenAI GPT 5.2 Thinking (gpt-5.2)
- Google Gemini 3 Flash (gemini-3-flash-preview)
Implementation: Configure your chosen model in your Voice Agent settings:
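As a sketch, the `think` section of your Voice Agent settings could select one of the newly supported models like this. The model identifiers come from the list above; the surrounding structure is an assumption, so confirm it against the Voice Agent LLM Models documentation.

```python
import json

# Hypothetical Voice Agent settings fragment selecting GPT 5.2 Instant.
# Swap the model string for "gpt-5.2" or "gemini-3-flash-preview"
# (with provider type adjusted accordingly) as needed.
agent_settings = {
    "agent": {
        "think": {
            "provider": {"type": "open_ai", "model": "gpt-5.2-instant"}
        }
    }
}
print(json.dumps(agent_settings))
```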
For complete information about supported LLMs including the new models, visit our Voice Agent LLM Models documentation.
Nova-3 Multilingual Model Update
🌍 Nova-3 Multilingual Improvements
We’ve released an updated Nova-3 multilingual model, delivering accuracy improvements across supported languages, with the largest gains in code-switching scenarios.
This update focuses on improving real-world multilingual speech recognition, especially for inputs that mix languages within a single utterance or conversation.
Key improvements include:
- Lower Word Error Rate (WER) across both batch and streaming inference for all languages supported by the multilingual model
- Significantly improved code-switching handling, reducing word drops when languages are mixed
These improvements help developers build more reliable, natural multilingual voice experiences without changing APIs or configuration.
Learn more about Nova-3 Multilingual on the Models and Language Overview page.