Server Messages
Learn about the different messages you will receive from the Agent server.
This section defines the types of JSON messages that the server can send over the websocket. Every message will contain a type
field identifying the type of message.
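Because every message carries a type field, a client can route incoming messages with a simple lookup. The following is a minimal sketch, not an official client: the `dispatch` function and the handler registry are hypothetical names, and it only covers JSON text frames.

```python
import json
from typing import Callable, Dict

# Hypothetical handler registry: maps a message "type" to a callback.
Handler = Callable[[dict], None]


def dispatch(raw: str, handlers: Dict[str, Handler]) -> str:
    """Parse one JSON text frame and route it by its "type" field.

    Returns the message type so callers can log or test it.
    """
    message = json.loads(raw)
    msg_type = message["type"]  # every server message carries this field
    handler = handlers.get(msg_type)
    if handler is not None:
        handler(message)
    return msg_type


# Usage: record the session ID from the Welcome message.
seen = []
handlers = {"Welcome": lambda m: seen.append(m["session_id"])}
dispatch(
    '{"type": "Welcome", "session_id": "fc553ec9-5874-49ca-a47c-b670d525a4b1"}',
    handlers,
)
dispatch('{"type": "UserStartedSpeaking"}', handlers)  # no handler registered, ignored
```

Message types without a registered handler are skipped here; a real client may prefer to log them.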
Welcome
The server will send a Welcome message immediately after the websocket opens.
Example Message
{
"type": "Welcome",
"session_id": "fc553ec9-5874-49ca-a47c-b670d525a4b1"
}
SettingsApplied
The server will send a SettingsApplied
message as confirmation that the server received the SettingsConfiguration
message.
{
"type": "SettingsApplied"
}
Conversation Context and Welcome Messages
Upon connection, any previous conversation context and any "welcome message" (i.e. when the assistant/agent speaks first) are supplied via the same mechanism: a context object that lists the messages.
"context": {
"messages": [
{"role": "user", "content": "Hello?"},
{"role": "assistant", "content": "Hello, how may I help you today?"}
],
"replay": true
}
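As a sketch of how a client might assemble this object, the hypothetical helper below builds the context from prior turns and an optional welcome message. The function name and its parameters are assumptions for illustration; the replay flag is passed through unchanged because its semantics are not described in this section.

```python
from typing import List, Optional, Tuple


def build_context(
    history: List[Tuple[str, str]],
    welcome: Optional[str] = None,
    replay: bool = True,
) -> dict:
    """Build a context object in the shape shown above.

    history is a list of (role, content) pairs, where role is
    "user" or "assistant". A welcome message, if given, is appended
    as an assistant turn, since both use the same mechanism.
    """
    messages = [{"role": role, "content": content} for role, content in history]
    if welcome is not None:
        messages.append({"role": "assistant", "content": welcome})
    return {"messages": messages, "replay": replay}


# Usage: one previous user turn plus a welcome message.
context = build_context(
    [("user", "Hello?")],
    welcome="Hello, how may I help you today?",
)
```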
ConversationText
The server will send a ConversationText
message every time the agent hears the user say something, and every time the agent speaks something. These can be used on the client side to display the conversation messages as they happen in real-time.
Example Message
{
"type": "ConversationText",
"role": "", // The speaker of this statement, either "user" or "assistant"
"content": "" // The statement that was spoken
}
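A client-side transcript can be kept by appending each ConversationText message as it arrives. This is a minimal sketch; `append_to_transcript` and the "User"/"Agent" display labels are hypothetical choices, not part of the protocol.

```python
import json
from typing import List


def append_to_transcript(transcript: List[str], raw: str) -> None:
    """Append a display line for a ConversationText message; ignore other types."""
    message = json.loads(raw)
    if message.get("type") != "ConversationText":
        return
    # "role" is either "user" or "assistant".
    speaker = "User" if message["role"] == "user" else "Agent"
    transcript.append(f"{speaker}: {message['content']}")


# Usage: two turns arriving in order.
transcript: List[str] = []
append_to_transcript(
    transcript,
    '{"type": "ConversationText", "role": "user", "content": "Hello?"}',
)
append_to_transcript(
    transcript,
    '{"type": "ConversationText", "role": "assistant", '
    '"content": "Hello, how may I help you today?"}',
)
```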
UserStartedSpeaking
The server will send a UserStartedSpeaking
message every time the user begins a new utterance. If the client is playing agent audio when this message is received, it should stop playback immediately and discard all of its buffered agent audio.
Example Message
{
"type": "UserStartedSpeaking"
}
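The interruption behavior described above can be sketched as follows. `PlaybackBuffer` is a hypothetical stand-in for whatever audio player the client uses; only the stop-and-discard logic is the point here.

```python
from collections import deque


class PlaybackBuffer:
    """Hypothetical client-side queue of agent audio awaiting playback."""

    def __init__(self) -> None:
        self._chunks: deque = deque()
        self.playing = False

    def enqueue(self, chunk: bytes) -> None:
        self._chunks.append(chunk)
        self.playing = True

    def stop_and_clear(self) -> None:
        # Stop playback and drop every buffered chunk.
        self.playing = False
        self._chunks.clear()

    def __len__(self) -> int:
        return len(self._chunks)


def on_server_message(message: dict, playback: PlaybackBuffer) -> None:
    """Stop agent audio as soon as the user starts a new utterance."""
    if message.get("type") == "UserStartedSpeaking":
        playback.stop_and_clear()


# Usage: buffered audio is discarded when the user interrupts.
playback = PlaybackBuffer()
playback.enqueue(b"\x00\x01")
playback.enqueue(b"\x02\x03")
on_server_message({"type": "UserStartedSpeaking"}, playback)
```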
AgentThinking
The server will send an AgentThinking
message to inform the client of a non-verbalized agent thought. When functions are available, some LLMs use these thoughts to decide which functions to call.
Example Message
{
"type": "AgentThinking",
"content": "" // The text of the agent's thought
}
FunctionCallRequest
If a function is client-side and no URL is provided for that function in the SettingsConfiguration message, the server will request the call by sending the client a FunctionCallRequest message. Upon receiving this message, the client should perform the requested function call and reply with a FunctionCallResponse message containing the function's output.
{
"type": "FunctionCallRequest",
"function_name": "", // The `name` you gave in the function definition
"function_call_id": "", // ID to be passed back in the `FunctionCallResponse`
"input": {...} // A JSON value containing the `parameters` you defined for this function
}
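A client-side handler for this request might look like the sketch below. The `get_weather` function and the `FUNCTIONS` registry are hypothetical. The shape of the reply is also an assumption: this section does not define FunctionCallResponse, so the `output` field name and its string encoding should be checked against the client message reference.

```python
import json
from typing import Callable, Dict


def get_weather(location: str) -> dict:
    """Hypothetical client-side function, for illustration only."""
    return {"location": location, "forecast": "sunny"}


# Hypothetical registry keyed by the `name` given in the function definition.
FUNCTIONS: Dict[str, Callable[..., dict]] = {"get_weather": get_weather}


def handle_function_call_request(raw: str) -> str:
    """Run the requested function and build a reply for the server."""
    request = json.loads(raw)
    func = FUNCTIONS[request["function_name"]]
    result = func(**request["input"])  # "input" holds the defined parameters
    response = {
        "type": "FunctionCallResponse",
        # Echo the ID so the server can match the reply to its request.
        "function_call_id": request["function_call_id"],
        # ASSUMPTION: field name and string encoding are not given in this section.
        "output": json.dumps(result),
    }
    return json.dumps(response)


# Usage with a made-up request.
reply = handle_function_call_request(
    '{"type": "FunctionCallRequest", "function_name": "get_weather", '
    '"function_call_id": "call-1", "input": {"location": "Paris"}}'
)
```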
FunctionCalling
The server will sometimes send FunctionCalling
messages when making function calls to help the client developer debug function calling workflows. The format of this message, and whether it is sent at all, depends on the LLM provider being used.
Example Message
{
"type": "FunctionCalling",
...
}
AgentStartedSpeaking
The server will send an AgentStartedSpeaking
message when it begins streaming an agent audio response to the client for playback.
Example Message
{
"type": "AgentStartedSpeaking",
"total_latency": 0.0, // Seconds from receiving the user's utterance to producing the agent's reply
"tts_latency": 0.0, // The portion of total latency attributable to text-to-speech
"ttt_latency": 0.0 // The portion of total latency attributable to text-to-text (usually an LLM)
}
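Since the two component latencies are portions of the total, a client can derive how much time is not accounted for by either stage. The helper below is a sketch; the `latency_breakdown` name and the "other" key are hypothetical.

```python
def latency_breakdown(message: dict) -> dict:
    """Split an AgentStartedSpeaking message into its latency components.

    "other" is whatever remains of the total after subtracting the
    text-to-speech and text-to-text portions.
    """
    total = message["total_latency"]
    tts = message["tts_latency"]
    ttt = message["ttt_latency"]
    return {"total": total, "tts": tts, "ttt": ttt, "other": total - tts - ttt}


# Usage with made-up values (in seconds).
breakdown = latency_breakdown(
    {
        "type": "AgentStartedSpeaking",
        "total_latency": 1.0,
        "tts_latency": 0.25,
        "ttt_latency": 0.5,
    }
)
```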
AgentAudioDone
The server will send an AgentAudioDone
message immediately after it sends the last audio message. The agent may still be speaking after this message arrives, because previously sent audio may still be queued for playback on the client.
Example Message
{
"type": "AgentAudioDone"
}
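To tell when the agent has actually finished speaking, a client can combine this message with the state of its own playback queue. The tracker below is a sketch under that assumption; the class and method names are hypothetical, and how chunks are counted depends on the client's audio pipeline.

```python
class AgentSpeechTracker:
    """Track whether the agent has really finished speaking on the client."""

    def __init__(self) -> None:
        self.queued_chunks = 0
        self.server_done = False

    def on_audio_chunk(self) -> None:
        # New audio arrived, so a response is in progress.
        self.server_done = False
        self.queued_chunks += 1

    def on_chunk_played(self) -> None:
        if self.queued_chunks > 0:
            self.queued_chunks -= 1

    def on_message(self, message: dict) -> None:
        if message.get("type") == "AgentAudioDone":
            self.server_done = True

    @property
    def agent_finished(self) -> bool:
        # Finished only when the server is done AND nothing is left to play.
        return self.server_done and self.queued_chunks == 0


# Usage: AgentAudioDone arrives while one chunk is still queued.
tracker = AgentSpeechTracker()
tracker.on_audio_chunk()
tracker.on_message({"type": "AgentAudioDone"})
still_speaking = not tracker.agent_finished
tracker.on_chunk_played()
```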
Error
The server will send an Error
message to notify the client that something has gone wrong.
Example Message
{
"type": "Error",
"message": "" // A description of what went wrong
}
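One way to surface these messages in client code is to convert them into exceptions. This is a minimal sketch; `AgentServerError` and `raise_on_error` are hypothetical names, and whether an error should end the session is left to the application.

```python
class AgentServerError(Exception):
    """Hypothetical exception carrying the server's error description."""


def raise_on_error(message: dict) -> None:
    """Raise if the server reported a problem; do nothing for other types."""
    if message.get("type") == "Error":
        raise AgentServerError(message.get("message", ""))


# Usage: capture the description from a made-up Error message.
try:
    raise_on_error({"type": "Error", "message": "something went wrong"})
    caught = None
except AgentServerError as exc:
    caught = str(exc)
```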