NVIDIA LLM provider now available
NVIDIA is now a supported LLM provider for the Voice Agent API. Two models are available in the Standard pricing tier:
- `llama-nemotron-super-49B` — Llama Nemotron Super 49B delivers high accuracy for multi-agentic reasoning.
- `nemotron-3-nano-30B-A3B` — Nemotron 3 Nano 30B A3B provides cost efficiency with high accuracy for targeted agentic tasks.
Set the provider type to nvidia in your agent configuration:
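A minimal sketch of the corresponding Settings fragment, built in Python so the shape is easy to inspect (the field layout follows the Voice Agent Settings schema; pick either of the two model names above):

```python
import json

# Sketch of the "think" section of a Voice Agent Settings message.
# Only the provider fields relevant to this change are shown.
settings_fragment = {
    "agent": {
        "think": {
            "provider": {
                "type": "nvidia",                     # select NVIDIA as the LLM provider
                "model": "llama-nemotron-super-49B",  # or "nemotron-3-nano-30B-A3B"
                # "endpoint" is omitted: it is optional for managed providers
            }
        }
    }
}

print(json.dumps(settings_fragment, indent=2))
```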
NVIDIA is a managed provider, so the endpoint field is optional. For the full list of supported models and pricing tiers, see the Voice Agent LLM Models documentation.
Deepgram Self-Hosted April 2026 Release (260402)
Container Images (release 260402)
- `quay.io/deepgram/self-hosted-api:release-260402`
  - Equivalent image to: `quay.io/deepgram/self-hosted-api:1.181.3`
- `quay.io/deepgram/self-hosted-engine:release-260402`
  - Equivalent image to: `quay.io/deepgram/self-hosted-engine:3.114.5`
  - Minimum required NVIDIA driver version: `>=570.172.08`
- `quay.io/deepgram/self-hosted-license-proxy:release-260402`
  - Equivalent image to: `quay.io/deepgram/self-hosted-license-proxy:1.10.1`
- `quay.io/deepgram/self-hosted-billing:release-260402`
  - Equivalent image to: `quay.io/deepgram/self-hosted-billing:1.13.0`
This Release Contains The Following Changes
- Certificate Endpoint Fix — Engine now responds to `/v1/certificates` in addition to `/certificates`, consistent with the other container images. See Certificate Status for details.
- Model Name Consistency — The `/v1/models` endpoint now returns a `canonical_name` field matching the model name used in `/v1/listen` requests.
- General Improvements — Keeps our software up-to-date.
New thought_signature field for Gemini function calling
The Voice Agent API now includes an optional thought_signature field in function call messages. Some Gemini models (3.0 and 3.1 families) require this as an additional function call identifier.
This field appears in two places:
- Settings message — in `agent.context.messages[].function_calls[]` when providing function call history
- FunctionCallRequest — in `functions[]` when the server requests a function call
Example
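A sketch of where the field sits in both message shapes (neighboring fields follow the Voice Agent API reference; the function name, arguments, and signature value are placeholders):

```python
import json

# Sketch: Settings fragment carrying function-call history with a
# thought_signature. Values here are placeholders, not real signatures.
settings_fragment = {
    "agent": {
        "context": {
            "messages": [
                {
                    "role": "assistant",
                    "function_calls": [
                        {
                            "name": "get_weather",  # hypothetical function
                            "arguments": "{\"city\": \"Paris\"}",
                            "thought_signature": "PLACEHOLDER_SIGNATURE",
                        }
                    ],
                }
            ]
        }
    }
}

# Sketch: the matching field in a FunctionCallRequest from the server.
function_call_request = {
    "type": "FunctionCallRequest",
    "functions": [
        {
            "name": "get_weather",
            "arguments": "{\"city\": \"Paris\"}",
            "thought_signature": "PLACEHOLDER_SIGNATURE",
        }
    ],
}

print(json.dumps(function_call_request, indent=2))
```

When replaying function-call history, echo the `thought_signature` exactly as you received it; Gemini expects the signature to be returned unmodified.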
The thought_signature field is optional and only relevant when using Google Gemini models. This change addresses the degraded function calling performance that some users experienced with the Gemini 3.0 and 3.1 model families.
For more details, see the Function Call Request documentation, the Voice Agent API Reference, or Gemini’s Thought Signatures Documentation.
New volume parameter for Cartesia TTS
The Voice Agent API now supports an optional agent.speak.provider.volume parameter when using Cartesia as the TTS provider. Valid values range from 0.5 to 2.0.
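A sketch of the speak provider fragment with the new parameter (field layout follows the Voice Agent Settings schema; other Cartesia fields such as model and voice selection are omitted here):

```python
import json

# Sketch: speak provider fragment enabling Cartesia with the new volume knob.
speak_fragment = {
    "agent": {
        "speak": {
            "provider": {
                "type": "cartesia",
                "volume": 1.5,  # valid range: 0.5 to 2.0
            }
        }
    }
}

# Guard against out-of-range values before sending the settings.
assert 0.5 <= speak_fragment["agent"]["speak"]["provider"]["volume"] <= 2.0
print(json.dumps(speak_fragment, indent=2))
```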
For more details, see Configure the Voice Agent or the Cartesia volume, speed, and emotion documentation.
Nova-3 Model Update
🌏 Nova-3 now supports the following new languages and language codes:
- Chinese (Mandarin, Simplified): `zh`, `zh-CN`, `zh-Hans`
- Chinese (Mandarin, Traditional): `zh-TW`, `zh-Hant`
Access these models by setting model="nova-3" and the relevant language code in your request.
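For example, a streaming request URL for Simplified Mandarin can be built like this (host and path per the standard Deepgram listen endpoint; adjust parameters to your setup):

```python
from urllib.parse import urlencode

# Build a /v1/listen query string selecting Nova-3 with a Mandarin language code.
params = {"model": "nova-3", "language": "zh-CN"}
url = "wss://api.deepgram.com/v1/listen?" + urlencode(params)
print(url)
```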
Learn more about Nova-3 and supported languages on the Models and Language Overview page.
TTS speed controls & updated LLM models
TTS speak speed (Early Access)
You can now control the speaking rate of Deepgram TTS in the Voice Agent API using the agent.speak.provider.speed parameter. This parameter accepts a float value between 0.7 and 1.5, with 1.0 as the default.
This feature is in Early Access and is only available for Deepgram TTS. For more details, see TTS voice controls. To request access, contact your Account Executive or reach out to sales@deepgram.com.
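A sketch of the speak provider fragment with the speed parameter set (field layout follows the Voice Agent Settings schema; the voice model shown is a placeholder):

```python
import json

# Sketch: Deepgram TTS speak provider with the Early Access speed parameter.
speak_fragment = {
    "agent": {
        "speak": {
            "provider": {
                "type": "deepgram",
                "model": "aura-2-thalia-en",  # placeholder voice; use your own
                "speed": 1.2,                 # float between 0.7 and 1.5 (default 1.0)
            }
        }
    }
}

# Guard against out-of-range values before sending the settings.
assert 0.7 <= speak_fragment["agent"]["speak"]["provider"]["speed"] <= 1.5
print(json.dumps(speak_fragment, indent=2))
```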
Updated LLM models
New OpenAI models — Two new models are now available in the Standard pricing tier:
- `gpt-5.4-nano`
- `gpt-5.4-mini`
Gemini 2.0 Flash deprecated — The gemini-2.0-flash model is now deprecated. We recommend migrating to gemini-2.5-flash or a newer Gemini model. See the Google models table for alternatives.
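Migration is a one-line change to the think provider's model field, sketched here assuming the `google` provider type from the Voice Agent Settings schema:

```python
# Sketch: moving the think provider off the deprecated Gemini model.
think_provider = {"type": "google", "model": "gemini-2.0-flash"}  # deprecated
think_provider["model"] = "gemini-2.5-flash"                      # recommended replacement
print(think_provider)
```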
For the full list of supported models and pricing tiers, see the Voice Agent LLM Models documentation.
Deepgram Self-Hosted March 2026 Release (260319)
Container Images (release 260319)
- `quay.io/deepgram/self-hosted-api:release-260319`
  - Equivalent image to: `quay.io/deepgram/self-hosted-api:1.180.1`
- `quay.io/deepgram/self-hosted-engine:release-260319`
  - Equivalent image to: `quay.io/deepgram/self-hosted-engine:3.114.4`
  - Minimum required NVIDIA driver version: `>=570.172.08`
- `quay.io/deepgram/self-hosted-license-proxy:release-260319`
  - Equivalent image to: `quay.io/deepgram/self-hosted-license-proxy:1.10.1`
- `quay.io/deepgram/self-hosted-billing:release-260319`
  - Equivalent image to: `quay.io/deepgram/self-hosted-billing:1.13.0`
This Release Contains The Following Changes
- Flux Regression Fix — Resolves Flux support regression from the 260305 release. See Deploy Flux Model (STT) for deployment details.
- Nova-3 Language Expansion — New models: Thai (`th`, `th-TH`), Chinese Cantonese Traditional (`zh-HK`). Improved models: Bengali (`bn`), Marathi (`mr`), Tamil (`ta`), Telugu (`te`). See the full announcement for details.
- Flux Status Metrics — Self-hosted status endpoint now includes Flux stream metrics. See Status Endpoint for details.
- Certificate Status Endpoint — New `/v1/certificates` endpoint on all container images returns beginning-of-support, end-of-support, and end-of-life dates. See Certificate Status for details.
- Log Formats — New configurable log output formats: Full, Compact, Pretty, Json. See Log Formats for configuration details.
- General Improvements — Keeps our software up-to-date.
Nova-3 Model Update
🌏 Nova-3 now supports the following new languages and language codes:
- Chinese (Cantonese, Traditional): `zh-HK`
- Thai: `th`, `th-TH`
🚀 Also releasing improved Nova-3 models for the following languages:
- Bengali (`bn`)
- Marathi (`mr`)
- Tamil (`ta`)
- Telugu (`te`)
Access these models by setting model="nova-3" and the relevant language code in your request.
Learn more about Nova-3 and supported languages on the Models and Language Overview page.
🤖 New LLM Models Support & Bug Fixes
We’ve added support for new LLM models in the Voice Agent API:
- OpenAI GPT-5.3 Instant (`gpt-5.3-chat-latest`)
- OpenAI GPT-5.4 (`gpt-5.4`)
- Google Gemini 3.1 Flash Lite (`gemini-3.1-flash-lite-preview`)
Example:
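A sketch of a think provider fragment selecting one of the newly supported models (the `open_ai` provider type value is assumed from the Voice Agent Settings schema):

```python
import json

# Sketch: think provider selecting one of the newly supported models.
settings_fragment = {
    "agent": {
        "think": {
            "provider": {
                "type": "open_ai",              # assumed provider type value
                "model": "gpt-5.3-chat-latest",
            }
        }
    }
}

print(json.dumps(settings_fragment, indent=2))
```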
For the full list of supported models and pricing tiers, visit our Voice Agent LLM Models documentation.
Fixes
- Resolves an issue where the GPT-5.2 Instant model used an incorrect model ID and pricing tier. The model now uses the correct ID (`gpt-5.2-chat-latest`) and is assigned to the Advanced tier.
Nova-3 Model Update
🎯 Nova-3 Swedish and Dutch Model Enhancements
We’ve released updated Nova-3 Swedish and Nova-3 Dutch models, offering improved accuracy for both streaming and batch transcription.
Access these models by setting model: "nova-3" and the relevant language code:
- Swedish (`sv`, `sv-SE`)
- Dutch (`nl`)
Learn more about Nova-3 on the Models and Language Overview page.
Reasoning mode for OpenAI thinking models
You can now control the reasoning effort of supported OpenAI reasoning models using the new reasoning_mode parameter in the think provider configuration. This parameter maps to OpenAI’s reasoning_effort and accepts low, medium, or high.
Example:
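A sketch of a think provider fragment with the new parameter (the `open_ai` provider type value is assumed from the Voice Agent Settings schema, and the model name is a placeholder for a reasoning-capable model):

```python
import json

# Sketch: think provider with the new reasoning_mode parameter,
# which maps to OpenAI's reasoning_effort.
settings_fragment = {
    "agent": {
        "think": {
            "provider": {
                "type": "open_ai",           # assumed provider type value
                "model": "gpt-5.2",          # placeholder reasoning-capable model
                "reasoning_mode": "medium",  # one of: low, medium, high
            }
        }
    }
}

print(json.dumps(settings_fragment, indent=2))
```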
For more details, visit the Configure the Voice Agent documentation.