Voice Agent API Migration Guide

This guide helps developers migrate from the early access version of the Deepgram Voice Agent API to the official V1 release.

The Deepgram API Spec and Voice Agent API Reference have more details on the new Voice Agent API.

Endpoint Changes

Early Access	V1
`wss://agent.deepgram.com/agent`	`wss://agent.deepgram.com/v1/agent/converse`

Message Type Changes

Removed Message Types

The following message types from early access have been removed in V1:

Message Type	Description
`UpdateInstructions`	Now handled through the more flexible `Settings` structure
`FunctionCalling`	Function calling status is now handled differently

New Message Types

The following message types are new in V1:

Message Type	Description
`PromptUpdated`	Confirmation that a prompt update has been applied
`SpeakUpdated`	Confirmation that a speak configuration update has been applied
`Warning`	Non-fatal errors or warnings
`AgentThinking`	Notification that the agent is thinking
`UserStartedSpeaking`	Notification that the user has started speaking
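
A client usually routes incoming JSON messages by their `type` field, so the new message types need handlers of their own. The following is a minimal Python sketch of such routing. The `dispatch` function, the handler names, and the returned strings are hypothetical and not part of the Deepgram API; only the message type names come from the table above. This guide does not show the payload fields of the new messages, so the handlers read nothing beyond `type`.

```python
import json


def on_confirmation(msg: dict) -> str:
    # PromptUpdated and SpeakUpdated confirm that an update was applied.
    return f"confirmed: {msg['type']}"


def on_notification(msg: dict) -> str:
    # AgentThinking and UserStartedSpeaking are status notifications.
    return f"event: {msg['type']}"


def on_warning(msg: dict) -> str:
    # Warning payload fields are not listed in this guide, so log the whole message.
    return f"warning: {msg}"


# Hypothetical routing table keyed by the V1 message type names.
HANDLERS = {
    "PromptUpdated": on_confirmation,
    "SpeakUpdated": on_confirmation,
    "Warning": on_warning,
    "AgentThinking": on_notification,
    "UserStartedSpeaking": on_notification,
}


def dispatch(raw: str) -> str:
    """Decode one JSON text frame and route it by its "type" field."""
    msg = json.loads(raw)
    handler = HANDLERS.get(msg.get("type"))
    if handler is None:
        return f"unhandled: {msg.get('type')}"
    return handler(msg)
```

Returning a fallback for unknown types, rather than raising, keeps the client running if the server sends a message type it does not yet handle.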

`Welcome` Message Changes

In the `Welcome` message, the `session_id` field has been renamed to `request_id` to better align with other products.

Early Access: `Welcome`

{
  "type": "Welcome",
  "session_id": "fc553ec9-5874-49ca-a47c-b670d525a4b1"
}

V1: `Welcome`

{
  "type": "Welcome",
  "request_id": "fc553ec9-5874-49ca-a47c-b670d525a4b1"
}
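
During a staged migration, client code may need to accept both shapes of the `Welcome` message. The sketch below shows one way to do that; `extract_request_id` is a hypothetical helper name, not something the API provides.

```python
def extract_request_id(welcome: dict) -> str:
    """Return the identifier from a Welcome message in either format."""
    if welcome.get("type") != "Welcome":
        raise ValueError("not a Welcome message")
    # V1 uses request_id; early access used session_id.
    for key in ("request_id", "session_id"):
        if key in welcome:
            return welcome[key]
    raise KeyError("Welcome message has neither request_id nor session_id")
```

Once the early access endpoint is no longer in use, the `session_id` fallback can be dropped.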

`SettingsConfiguration` Becomes `Settings`

The most significant change is to the configuration message:

Early Access: `SettingsConfiguration`

{
  "type": "SettingsConfiguration",
  "audio": {
    "input": { "encoding": "linear16", "sample_rate": 16000 },
    "output": { "encoding": "linear16", "sample_rate": 24000 }
  },
  "agent": {
    "instructions": "You are a helpful AI assistant. Keep responses concise.",
    "listen_model": "nova",
    "think_model": "gpt-4",
    "speak_model": "aura"
  }
}

V1: `Settings`

{
  "type": "Settings",
  "audio": {
    "input": { "encoding": "linear16", "sample_rate": 16000 },
    "output": { "encoding": "linear16", "sample_rate": 24000 }
  },
  "agent": {
    "listen": { "provider": { "model": "nova-3" } },
    "think": {
      "provider": { "model": "gpt-4o-mini" },
      "prompt": "You are a helpful AI assistant. Keep responses concise."
    },
    "speak": { "provider": { "model": "aura-2-andromeda-en" } }
  }
}

For the full set of options available in the new `Settings` message, see the Configure the Voice Agent guide.

Key differences:

- Message type changed from `SettingsConfiguration` to `Settings`
- Added fields: `mip_opt_out` and `experimental`
- Introduced a provider-based structure for the `listen`, `think`, and `speak` capabilities
- `instructions` field renamed to `prompt` and moved into the `think` configuration
- Added `container` field to the audio output configuration
- Added optional `greeting` field
- Added support for custom endpoints via the `endpoint` object for non-Deepgram providers
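
The structural part of this change can be sketched as a small conversion function. This is an illustration only: `to_v1_settings` is a hypothetical helper, and the V1 model identifiers are passed in by the caller because this guide does not define a mapping from early access model names to V1 ones. The provider objects mirror the `Settings` example above; the `UpdateSpeak` example in this guide also puts a `type` key inside `provider`, so check the API reference for whether your providers need one here too.

```python
def to_v1_settings(old: dict, listen_model: str, think_model: str, speak_model: str) -> dict:
    """Convert an early access SettingsConfiguration into the V1 Settings shape."""
    if old.get("type") != "SettingsConfiguration":
        raise ValueError("expected a SettingsConfiguration message")
    agent = old.get("agent", {})
    new = {
        "type": "Settings",
        "agent": {
            "listen": {"provider": {"model": listen_model}},
            "think": {
                "provider": {"model": think_model},
                # instructions was renamed to prompt and now lives under think
                "prompt": agent.get("instructions", ""),
            },
            "speak": {"provider": {"model": speak_model}},
        },
    }
    if "audio" in old:
        # The audio block has the same shape in both examples, so copy it as-is.
        new["audio"] = old["audio"]
    return new
```

Optional V1 fields such as `greeting`, `container`, `mip_opt_out`, and `experimental` are left out of the sketch because their shapes are not shown in this guide.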

`UpdateSpeak` Changes

The UpdateSpeak message has been restructured to use the provider pattern:

Early Access: `UpdateSpeak`

{
  "type": "UpdateSpeak",
  "model": "aura-asteria-en"
}

V1: `UpdateSpeak`

{
  "type": "UpdateSpeak",
  "speak": {
    "provider": {
      "type": "deepgram",
      "model": "aura-2-thalia-en"
    }
  }
}

`InjectAgentMessage` Changes

In the `InjectAgentMessage` message, the `message` field has been renamed to `content`:

Early Access: `InjectAgentMessage`

{
  "type": "InjectAgentMessage",
  "message": "I apologize, but I need to correct my previous statement..."
}

V1: `InjectAgentMessage`

{
  "type": "InjectAgentMessage",
  "content": "I apologize, but I need to correct my previous statement..."
}

Function Calling Changes

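The function calling messages have changed shape. `FunctionCallRequest` now carries a `functions` array whose entries hold an `id`, a `name`, a JSON-encoded `arguments` string, and a `client_side` flag. `FunctionCallResponse` now uses `id`, `name`, and `content`: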

Early Access: `FunctionCallRequest`

{
  "type": "FunctionCallRequest",
  "function_name": "get_weather",
  "function_call_id": "fc_12345678-90ab-cdef-1234-567890abcdef",
  "input": {
    "location": "Fremont, CA 94539"
  }
}

V1: `FunctionCallRequest`

{
  "type": "FunctionCallRequest",
  "functions": [
    {
      "id": "fc_12345678-90ab-cdef-1234-567890abcdef",
      "name": "get_weather",
      "arguments": "{\"location\": \"Fremont, CA 94539\"}",
      "client_side": true
    }
  ]
}

Early Access: `FunctionCallResponse`

{
  "type": "FunctionCallResponse",
  "function_call_id": "fc_12345678-90ab-cdef-1234-567890abcdef",
  "output": "{\"location\": \"Fremont, CA 94539\", \"temperature_c\": 21, \"condition\": \"Sunny\", \"humidity\": 40, \"wind_kph\": 14}"
}

V1: `FunctionCallResponse`

{
  "type": "FunctionCallResponse",
  "id": "fc_12345678-90ab-cdef-1234-567890abcdef",
  "name": "get_weather",
  "content": "{\"location\": \"Fremont, CA 94539\", \"temperature_c\": 21, \"condition\": \"Sunny\", \"humidity\": 40, \"wind_kph\": 14}"
}
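
Comparing the examples above, the early access request passed arguments as a JSON object under `input`, while the V1 request passes them as a JSON-encoded string under `arguments` and can list several calls in one message. The sketch below reads both request formats into one internal shape and upgrades an early access style response. Both function names are hypothetical, and the `None` used for a missing `client_side` flag is a choice made for this sketch, not API behavior.

```python
import json
from typing import List


def normalize_function_calls(msg: dict) -> List[dict]:
    """Return the function calls from either request format as a uniform list."""
    if "functions" in msg:  # V1 shape
        return [
            {
                "id": f["id"],
                "name": f["name"],
                "arguments": json.loads(f["arguments"]),  # JSON string in V1
                "client_side": f["client_side"],
            }
            for f in msg["functions"]
        ]
    # Early access shape: one call per message, arguments already an object.
    return [
        {
            "id": msg["function_call_id"],
            "name": msg["function_name"],
            "arguments": msg["input"],
            "client_side": None,  # early access requests carry no such flag
        }
    ]


def upgrade_function_call_response(old: dict, name: str) -> dict:
    """Rename early access response fields and add the function name V1 expects."""
    return {
        "type": "FunctionCallResponse",
        "id": old["function_call_id"],
        "name": name,
        "content": old["output"],
    }
```

The function name has to be supplied separately when upgrading a response because the early access response did not carry it.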

`Error` Response Changes

The `Error` message now uses `description` in place of `message` and adds a `code` field:

Early Access: `Error`

{
  "type": "Error",
  "message": "Failed to process audio input: Invalid audio format"
}

V1: `Error`

{
  "type": "Error",
  "description": "Failed to process audio input: Invalid audio format",
  "code": "INVALID_AUDIO_FORMAT"
}

Function Call Handling in V1

The function calling system in V1 draws an explicit line between client-side execution and internal server-side execution.

FunctionCallRequest

In V1, the FunctionCallRequest message includes a client_side flag that explicitly indicates where the function should be executed:

{
  "type": "FunctionCallRequest",
  "functions": [
    {
      "id": "fc_12345678-90ab-cdef-1234-567890abcdef",
      "name": "get_weather",
      "arguments": "{\"location\": \"Fremont, CA 94539\"}",
      "client_side": true
    }
  ]
}

When handling a FunctionCallRequest:

- Check the `client_side` flag in each function
- If `client_side` is `true`, your client code must:
  - Execute the specified function with the provided arguments
  - Send a `FunctionCallResponse` back to the server
- If `client_side` is `false`, no client action is needed because the server handles the call internally
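
The steps above can be sketched as a single handler. Everything named here is hypothetical: the `LOCAL_FUNCTIONS` registry, the stub `get_weather` implementation with its placeholder return value, and the handler name. The sketch builds the response messages and leaves sending them over the WebSocket to the caller.

```python
import json
from typing import Callable, Dict, List


def get_weather(location: str) -> dict:
    # Stub implementation with placeholder data; replace with a real lookup.
    return {"location": location, "condition": "Sunny"}


# Hypothetical registry of functions the client can run, keyed by name.
LOCAL_FUNCTIONS: Dict[str, Callable[..., dict]] = {"get_weather": get_weather}


def handle_function_call_request(msg: dict) -> List[dict]:
    """Build FunctionCallResponse messages for the client-side calls in msg."""
    responses = []
    for call in msg["functions"]:
        if not call["client_side"]:
            continue  # the server executes this call internally
        args = json.loads(call["arguments"])  # arguments is a JSON string
        result = LOCAL_FUNCTIONS[call["name"]](**args)
        responses.append(
            {
                "type": "FunctionCallResponse",
                "id": call["id"],  # links the response to the request
                "name": call["name"],
                "content": json.dumps(result),
            }
        )
    return responses
```

Each returned dictionary would then be serialized with `json.dumps` and sent as a text frame. The sketch raises `KeyError` for a function name missing from the registry; a production client would need to decide how to report that case.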

FunctionCallResponse

The FunctionCallResponse message has been updated to include the function name and uses clearer field names:

{
  "type": "FunctionCallResponse",
  "id": "fc_12345678-90ab-cdef-1234-567890abcdef",
  "name": "get_weather",
  "content": "{\"location\": \"Fremont, CA 94539\", \"temperature_c\": 21, \"condition\": \"Sunny\", \"humidity\": 40, \"wind_kph\": 14}"
}

Key points about FunctionCallResponse:

- It can be sent by either the client or the server, depending on where the function was executed
- The `id` field links the response to the original request
- The `name` field identifies which function was called
- The `content` field contains the function result, often in JSON format

Implementation Tips

When migrating your function calling implementation:

- Update your client code to check the `client_side` flag
- Only respond to functions where `client_side` is `true`
- Use the `id` field to track which request you’re responding to
- Include both the function `name` and `content` in your response
- Expect the server to send you `FunctionCallResponse` messages for functions with `client_side: false`
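
Because a `FunctionCallResponse` can arrive from the server as well as leave from the client, it can help to track calls by `id`. The `PendingCalls` class below is a hypothetical sketch of that bookkeeping and not part of any SDK.

```python
from typing import Dict, List, Optional


class PendingCalls:
    """Tracks function calls by id until a matching FunctionCallResponse is seen."""

    def __init__(self) -> None:
        self._pending: Dict[str, dict] = {}

    def on_request(self, msg: dict) -> None:
        # Record every call in the request, client-side or server-side.
        for call in msg["functions"]:
            self._pending[call["id"]] = call

    def on_response(self, msg: dict) -> Optional[dict]:
        """Match a response (sent by the client or received from the server)."""
        return self._pending.pop(msg["id"], None)

    def outstanding(self) -> List[str]:
        """Return the ids of calls that have no response yet."""
        return sorted(self._pending)
```

Calling `on_response` for both outgoing and incoming responses keeps `outstanding()` accurate regardless of where each function ran.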

Migration Checklist

✅ Update WebSocket endpoint URL
✅ Update configuration message format
- Rename SettingsConfiguration to Settings
- Add provider objects for listen, think, and speak
- Change instructions to prompt
- Use specific model identifiers
✅ Update function calling implementation
- Adapt to new request/response formats
- Implement client_side flag handling
✅ Handle error messages with new format
- Use description instead of message
- Process error codes
✅ Implement support for new message types
- Handle PromptUpdated and SpeakUpdated confirmations
- Process Warning messages
✅ Update the InjectAgentMessage format
- Change message field to content
✅ Handle welcome messages with request_id instead of session_id

Common Migration Issues

- Configuration not accepted: Make sure you’ve updated to the provider-based structure for the `listen`, `think`, and `speak` capabilities
- Function calls not working: Update both the request and response formats to match the V1 specification
- Error handling failures: Update error handling to use `description` instead of `message`
- Cannot inject messages: Use `content` instead of `message` in `InjectAgentMessage`
- Missing confirmation messages: Implement handlers for the new confirmation message types

New Capabilities in V1

- Multi-provider support: Configure different providers for the `listen`, `think`, and `speak` capabilities
- Greeting messages: Set an initial greeting via the `greeting` field
- Improved error handling: Structured errors with codes for better diagnostics
- Client-side function execution: Control whether functions run client-side or server-side
- Warnings: Non-fatal issues are now reported via `Warning` messages
