Voice Agent

November 12, 2025

Use Deepgram’s Managed Cartesia TTS Models

We’re excited to announce an easier way to use Cartesia’s Text-to-Speech models inside Deepgram’s voice agent – Deepgram-managed Cartesia models.

As with our managed LLMs, simply specify Cartesia as your model provider, along with the correct model name, to get started instantly. No Cartesia account creation, setup, or payments are required. This feature is included as part of Deepgram’s Standard pricing tier.

For detailed information, please refer to our TTS documentation.
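As a rough illustration of selecting the managed Cartesia provider, the Python sketch below builds a Settings message with Cartesia as the agent's TTS provider. The `speak` key, the `model_id` and `voice` field names, and the placeholder values are assumptions that this entry does not state; check the TTS documentation for the exact schema and the supported model names.

```python
import json


def build_cartesia_speak_settings(model_id: str, voice_id: str) -> dict:
    """Return a Settings message that selects Cartesia for text-to-speech.

    ASSUMPTION: the field names below ("speak", "model_id", "voice") are
    illustrative and are not confirmed by this changelog entry.
    """
    return {
        "type": "Settings",
        "agent": {
            "speak": {
                "provider": {
                    "type": "cartesia",
                    "model_id": model_id,
                    "voice": {"mode": "id", "id": voice_id},
                }
            }
        },
    }


# Placeholder values; substitute a real model name and voice ID.
settings = build_cartesia_speak_settings("YOUR_CARTESIA_MODEL", "YOUR_VOICE_ID")
print(json.dumps(settings, indent=2))
```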

🤖 Claude Haiku 4.5 LLM Support

We’ve added support for Anthropic’s new Claude Haiku 4.5 model in our Voice Agent API!

Implementation:

Configure your chosen model in your Voice Agent settings:

{
  "type": "Settings",
  "agent": {
    "think": {
      "provider": {
        "type": "anthropic",
        "model": "claude-4-5-haiku-latest"
      }
    }
  }
}
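If you assemble the Settings message in code rather than by hand, a small helper can keep the structure consistent across providers. The Python sketch below is one possible approach; `build_think_settings` is a hypothetical helper name and is not part of any Deepgram SDK.

```python
import json


def build_think_settings(provider_type: str, model: str) -> str:
    """Serialize a Settings message that selects the agent's LLM provider."""
    settings = {
        "type": "Settings",
        "agent": {
            "think": {
                "provider": {"type": provider_type, "model": model}
            }
        },
    }
    return json.dumps(settings, indent=2)


# Same provider and model as the JSON configuration above.
message = build_think_settings("anthropic", "claude-4-5-haiku-latest")
print(message)
```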

For complete information about supported LLMs, including Claude Haiku 4.5, visit our Voice Agent LLM Models documentation.

September 4, 2025

Increased Voice Agent Rate Limits for Pay-as-You-Go and Growth Plans

Deepgram is excited to announce a 3x increase in rate limits for Voice Agent services on Pay-as-You-Go and Growth plans, enabling higher concurrent usage for your applications at no additional charge.

For detailed information about all rate limits, please refer to our API Rate Limits documentation.

August 8, 2025

Voice Agent API

New Features

🤖 GPT 5.0 LLM Support

We’ve added support for OpenAI’s new GPT 5.0 models in our Voice Agent API!

Available Models:

5.0 (gpt-5)
5.0 Mini (gpt-5-mini)
5.0 Nano (gpt-5-nano)

Implementation: Configure your chosen model in your Voice Agent settings:

{
  "type": "Settings",
  "agent": {
    "think": {
      "provider": {
        "type": "open_ai",
        "model": "gpt-5"
      }
    }
  }
}
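Because three GPT 5.0 variants are listed, it can help to validate the model name before sending the Settings message. The following Python sketch checks the name against the identifiers listed in this entry; `gpt5_think_provider` is a hypothetical helper, and the set of names reflects only what this entry lists.

```python
# Model identifiers taken from the "Available Models" list in this entry.
GPT5_MODELS = {"gpt-5", "gpt-5-mini", "gpt-5-nano"}


def gpt5_think_provider(model: str = "gpt-5") -> dict:
    """Return the think.provider object for a GPT 5.0 model, or raise if unknown."""
    if model not in GPT5_MODELS:
        raise ValueError(
            f"Unknown GPT 5.0 model {model!r}; expected one of {sorted(GPT5_MODELS)}"
        )
    return {"type": "open_ai", "model": model}


settings = {
    "type": "Settings",
    "agent": {"think": {"provider": gpt5_think_provider("gpt-5-mini")}},
}
```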

For complete information about supported LLMs, including the GPT 5.0 models, visit our Voice Agent LLM Models documentation.

August 6, 2025

Voice Agent API

New Features

🤖 GPT-OSS-20B LLM Support

We’ve added support for OpenAI’s open-weight LLM, gpt-oss-20b, in our Voice Agent API!

Available Models:

GPT OSS 20B

Implementation: Configure GPT-OSS-20B in your Voice Agent settings:

{
  "type": "Settings",
  "agent": {
    "think": {
      "provider": {
        "type": "groq",
        "model": "openai/gpt-oss-20b"
      }
    }
  }
}

For complete information about supported LLMs, including GPT OSS 20B, visit our Voice Agent LLM Models documentation.

July 31, 2025

Voice Agent API

New Features

🎯 Smart Formatting for More Readable Conversations

We’ve added a new smart_format option to the listen provider settings. When enabled, it applies smart formatting to transcripts, making transcribed conversations easier to read when displayed in UI applications.

Key Features:

Enhanced transcript formatting for UI applications
Defaults to false for backward compatibility

Implementation: Configure the smart_format option in your Voice Agent listen provider settings:

{
  "agent": {
    "listen": {
      "provider": {
        "type": "deepgram",
        "model": "nova-3",
        "smart_format": true
      }
    }
  }
}
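Since smart_format defaults to false, it only takes effect when set explicitly. The Python sketch below shows one way to expose it as an opt-in parameter; `build_listen_settings` is a hypothetical helper, and it produces only the listen fragment shown above rather than a full Settings message.

```python
def build_listen_settings(model: str = "nova-3", smart_format: bool = False) -> dict:
    """Build the agent.listen fragment; smart_format stays off unless requested."""
    return {
        "agent": {
            "listen": {
                "provider": {
                    "type": "deepgram",
                    "model": model,
                    "smart_format": smart_format,
                }
            }
        }
    }


# Opt in to formatted transcripts for display in a UI.
ui_settings = build_listen_settings(smart_format=True)
```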

For complete implementation details, see our Voice Agent configuration documentation.

🔒 Model Improvement Program Opt-Out

Users can now opt out of our Model Improvement Program when using the Voice Agent API.

Implementation: Add mip_opt_out: true to your Settings message:

{
  "type": "Settings",
  "mip_opt_out": true,
  "agent": {
    "listen": {
      "provider": {
        "type": "deepgram",
        "model": "nova-3"
      }
    }
  }
}
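If your application already builds a Settings message elsewhere, the opt-out flag can be added as a final step. The Python sketch below returns a copy with mip_opt_out set at the top level, leaving the original untouched; `with_mip_opt_out` is a hypothetical helper name.

```python
import copy


def with_mip_opt_out(settings: dict) -> dict:
    """Return a copy of a Settings message with the top-level opt-out flag set."""
    updated = copy.deepcopy(settings)  # avoid mutating the caller's dict
    updated["mip_opt_out"] = True
    return updated


base = {
    "type": "Settings",
    "agent": {"listen": {"provider": {"type": "deepgram", "model": "nova-3"}}},
}
opted_out = with_mip_opt_out(base)
```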

For more information about the Model Improvement Program and opt-out options, visit our Model Improvement Partnership Program documentation.

🤖 Gemini LLM Support

We’ve added support for Google’s Gemini LLMs in our Voice Agent API! This expands our LLM options to include Google’s powerful language models.

Available Models:

Gemini 2.5 Flash
Gemini 2.0 Flash
Gemini 2.0 Flash Lite

For complete information about supported LLMs including Gemini models, visit our Voice Agent LLM Models documentation.

July 11, 2025

Voice Agent API

New Features

🤖 Expanded LLM Support

We’ve significantly expanded our LLM options across our two pricing tiers. The following LLMs are now available for use:

Standard Models

OpenAI GPT-4.1 mini
OpenAI GPT-4.1 nano
OpenAI GPT-4o mini
Anthropic Claude Haiku 3.5

Advanced Models

OpenAI GPT-4.1
OpenAI GPT-4o
Anthropic Claude Sonnet 4

For complete information about supported LLMs, visit our Voice Agent LLM Models documentation or try them out in our API Playground.

🌍 Spanish Language Support

Voice Agents now support Spanish conversations with the addition of Aura-2 Spanish TTS. Configure your agent’s language settings to enable Spanish voice interactions.

See our Voice Agent API documentation for implementation details.

💬 Conversation Context Feature

Introducing comprehensive conversation continuity with our new context feature:

Complete Context Awareness

Agents maintain conversation history across sessions
Seamless continuation of previous interactions

Enhanced User Experience

More natural conversations with historical context
Consistent interaction patterns across sessions

Flexible Implementation

Support for both conversational and function call history
Configurable history settings

Implementation

Use the agent.context object to provide conversation history when starting new sessions:

Conversational Messages (role is either "user" or "assistant"):

{
  "type": "History",
  "role": "user",
  "content": "message text"
}

Function Call Messages (client_side is true or false; arguments is a JSON-encoded string):

{
  "type": "History",
  "function_calls": [{
    "id": "unique_id",
    "name": "function_name",
    "client_side": true,
    "arguments": "json_string",
    "response": "response_text"
  }]
}

To disable function call history, set settings.flags.history to false in the Settings message.
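The two History message shapes can be generated from stored conversation data. The Python sketch below is a minimal illustration under stated assumptions: the helper names are hypothetical, and the `messages` key inside agent.context is an assumption that this entry does not spell out, so check the Configure Voice Agent Context documentation for the exact layout.

```python
import json


def conversation_message(role: str, content: str) -> dict:
    """Build a conversational History message for a user or assistant turn."""
    if role not in ("user", "assistant"):
        raise ValueError("role must be 'user' or 'assistant'")
    return {"type": "History", "role": role, "content": content}


def function_call_message(call_id: str, name: str, arguments: dict,
                          response: str, client_side: bool) -> dict:
    """Build a function-call History message; arguments are JSON-encoded."""
    return {
        "type": "History",
        "function_calls": [{
            "id": call_id,
            "name": name,
            "client_side": client_side,
            "arguments": json.dumps(arguments),
            "response": response,
        }],
    }


def settings_with_context(history: list) -> dict:
    """Attach prior history to a new session's Settings message."""
    # ASSUMPTION: history is passed under agent.context.messages.
    return {"type": "Settings", "agent": {"context": {"messages": history}}}


history = [
    conversation_message("user", "What's the weather in Paris?"),
    function_call_message("call_1", "get_weather", {"city": "Paris"},
                          "Sunny", client_side=True),
    conversation_message("assistant", "It's sunny in Paris."),
]
settings = settings_with_context(history)
```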

Documentation

Configure Voice Agent Context

🔍 Enhanced Error Visibility

We’ve improved client-side visibility of LLM and TTS errors to make debugging easier and improve the user experience.