Audio Keep Alive

Learn how to send messages while streaming audio, ensuring uninterrupted communication.

KeepAlive serves as a crucial mechanism for maintaining an uninterrupted connection with Deepgram's servers, allowing you to optimize your audio streaming experience while minimizing costs.

What is theKeepAlive message?

A common situation is needing to keep a connection open without constantly sending audio. Normally, you'd have to send data all the time, even silence, which wastes resources and increases costs since Deepgram charges for all audio, whether it’s speech or silence. KeepAlive solves this by allowing you to pause the connection and resume later, avoiding extra costs for transcribing silence.

Benefits

  • Cost Efficiency: KeepAlive enables you to optimize costs by pausing the connection during periods of silence, eliminating the need to transcribe unnecessary audio data.
  • Connection Maintenance: By sending periodic KeepAlive messages, you can ensure that the WebSocket connection remains open, preventing timeouts and maintaining communication with Deepgram's servers.
  • Flexibility: KeepAlive offers flexibility in managing your streaming sessions. You can temporarily halt the transmission of audio data while keeping the connection active, resuming data streaming when needed without re-establishing the connection.

Sending KeepAlive

To send the KeepAlive message, you need to send the following JSON message to the server:

{
  "type": "KeepAlive" 
}

Because Deepgram's streaming connection is set to time out after 10 seconds of inactivity, it's essential to periodically send KeepAlive messages to maintain the connection and prevent it from closing prematurely.

If no audio data is sent within a 10-second window, the connection will close, triggering a NET-0001 error. Using KeepAlive extends the connection by another 10 seconds. To avoid this error and keep the connection open, continue sending KeepAlive messages 3-5 seconds before the 10 second timeout window expires until you are ready to resume sending audio.

📘

Be sure to send the KeepAlive message as a text WebSocket message. Sending it as a binary message may result in incorrect handling and could lead to connection issues.

KeepAlive Confirmation

You will not receive a response back from the server.

Language Specific Implementations

Following are code examples to help you get started.

Sending a KeepAlive message in JSON Format

These snippets demonstrate how to construct a JSON message containing the KeepAlive type and send it over the WebSocket connection in each respective language.

const WebSocket = require("ws");

// Assuming 'headers' is already defined for authorization
const ws = new WebSocket("wss://api.deepgram.com/v1/listen", { headers });

// Assuming 'ws' is the WebSocket connection object
const keepAliveMsg = JSON.stringify({ type: "KeepAlive" });
ws.send(keepAliveMsg);

import json
import websocket

# Assuming 'headers' is already defined for authorization
ws = websocket.create_connection("wss://api.deepgram.com/v1/listen", header=headers)

# Assuming 'ws' is the WebSocket connection object
keep_alive_msg = json.dumps({"type": "KeepAlive"})
ws.send(keep_alive_msg)

package main

import (
    "encoding/json"
    "log"
    "net/http"
    "github.com/gorilla/websocket"
)

func main() {
    // Define headers for authorization
    headers := http.Header{}
    
  	// Assuming headers are set here for authorization
    conn, _, err := websocket.DefaultDialer.Dial("wss://api.deepgram.com/v1/listen", headers)
    if err != nil {
        log.Fatal("Error connecting to WebSocket:", err)
    }
    defer conn.Close()

    // Construct KeepAlive message
    keepAliveMsg := map[string]string{"type": "KeepAlive"}
    jsonMsg, err := json.Marshal(keepAliveMsg)
    if err != nil {
        log.Fatal("Error encoding JSON:", err)
    }

    // Send KeepAlive message
    err = conn.WriteMessage(websocket.TextMessage, jsonMsg)
    if err != nil {
        log.Fatal("Error sending KeepAlive message:", err)
    }
}

using System;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static async Task Main(string[] args)
    {
        // Set up the WebSocket URL and headers
        Uri uri = new Uri("wss://api.deepgram.com/v1/listen");
        
        string apiKey = "DEEPGRAM_API_KEY";

        // Create a new client WebSocket instance
        using (ClientWebSocket ws = new ClientWebSocket())
        {
            // Set the authorization header
            ws.Options.SetRequestHeader("Authorization", "Token " + apiKey);

            // Connect to the WebSocket server
            await ws.ConnectAsync(uri, CancellationToken.None);

            // Construct the KeepAlive message
            string keepAliveMsg = "{\"type\": \"KeepAlive\"}";

            // Convert the KeepAlive message to a byte array
            byte[] keepAliveBytes = Encoding.UTF8.GetBytes(keepAliveMsg);

            // Send the KeepAlive message asynchronously
            await ws.SendAsync(new ArraySegment<byte>(keepAliveBytes), WebSocketMessageType.Text, true, CancellationToken.None);
        }
    }
}

Streaming Examples

Here are more complete examples that make a streaming request and use KeepAlive. Try running these examples to see how KeepAlive is sent periodically.

const WebSocket = require("ws");

const authToken = "DEEPGRAM_API_KEY"; // Replace 'DEEPGRAM_API_KEY' with your actual authorization token
const headers = {
  Authorization: `Token ${authToken}`,
};

// Initialize WebSocket connection
const ws = new WebSocket("wss://api.deepgram.com/v1/listen", { headers });

// Handle WebSocket connection open event
ws.on("open", function open() {
  console.log("WebSocket connection established.");

  // Send audio data (replace this with your audio streaming logic)
  // Example: Read audio from a microphone and send it over the WebSocket
  // For demonstration purposes, we're just sending a KeepAlive message

  setInterval(() => {
    const keepAliveMsg = JSON.stringify({ type: "KeepAlive" });
    ws.send(keepAliveMsg);
    console.log("Sent KeepAlive message");
  }, 3000); // Sending KeepAlive messages every 3 seconds
});

// Handle WebSocket message event
ws.on("message", function incoming(data) {
  console.log("Received:", data);
  // Handle received data (transcription results, errors, etc.)
});

// Handle WebSocket close event
ws.on("close", function close() {
  console.log("WebSocket connection closed.");
});

// Handle WebSocket error event
ws.on("error", function error(err) {
  console.error("WebSocket error:", err.message);
});

// Gracefully close the WebSocket connection when done
function closeWebSocket() {
  const closeMsg = JSON.stringify({ type: "CloseStream" });
  ws.send(closeMsg);
}

// Call closeWebSocket function when you're finished streaming audio
// For example, when user stops recording or when the application exits
// closeWebSocket();

import websocket
import json
import time
import threading

auth_token = "DEEPGRAM_API_KEY"  # Replace 'DEEPGRAM_API_KEY' with your actual authorization token
headers = {
    "Authorization": f"Token {auth_token}"
}

# WebSocket URL
ws_url = "wss://api.deepgram.com/v1/listen"

# Define the WebSocket on_open function
def on_open(ws):
    print("WebSocket connection established.")
    # Send KeepAlive messages every 3 seconds
    def keep_alive():
        while True:
            keep_alive_msg = json.dumps({"type": "KeepAlive"})
            ws.send(keep_alive_msg)
            print("Sent KeepAlive message")
            time.sleep(3)
    # Start a thread for sending KeepAlive messages
    keep_alive_thread = threading.Thread(target=keep_alive)
    keep_alive_thread.daemon = True
    keep_alive_thread.start()

# Define the WebSocket on_message function
def on_message(ws, message):
    print("Received:", message)
    # Handle received data (transcription results, errors, etc.)

# Define the WebSocket on_close function
def on_close(ws):
    print("WebSocket connection closed.")

# Define the WebSocket on_error function
def on_error(ws, error):
    print("WebSocket error:", error)

# Create WebSocket connection
ws = websocket.WebSocketApp(ws_url,
                            on_open=on_open,
                            on_message=on_message,
                            on_close=on_close,
                            on_error=on_error,
                            header=headers)

# Run the WebSocket
ws.run_forever()

Deepgram SDKs

Deepgram's SDKs make it easier to build with Deepgram in your preferred language. To learn more about getting started with the SDKs, visit Deepgram's SDKs documentation.

const { createClient, LiveTranscriptionEvents } = require("@deepgram/sdk");

const live = async () => {
  const deepgram = createClient("DEEPGRAM_API_KEY");
  let connection;
  let keepAlive;

  const setupDeepgram = () => {
    connection = deepgram.listen.live({
      model: "nova-2",
      utterance_end_ms: 1500,
      interim_results: true,
    });

    if (keepAlive) clearInterval(keepAlive);
    keepAlive = setInterval(() => {
      console.log("KeepAlive sent.");
      connection.keepAlive();
    }, 3000); // Sending KeepAlive messages every 3 seconds

    connection.on(LiveTranscriptionEvents.Open, () => {
      console.log("Connection opened.");
    });

    connection.on(LiveTranscriptionEvents.Close, () => {
      console.log("Connection closed.");
      clearInterval(keepAlive);
    });

    connection.on(LiveTranscriptionEvents.Metadata, (data) => {
      console.log(data);
    });

    connection.on(LiveTranscriptionEvents.Transcript, (data) => {
      console.log(data.channel);
    });

    connection.on(LiveTranscriptionEvents.UtteranceEnd, (data) => {
      console.log(data);
    });

    connection.on(LiveTranscriptionEvents.SpeechStarted, (data) => {
      console.log(data);
    });

    connection.on(LiveTranscriptionEvents.Error, (err) => {
      console.error(err);
    });
  };

  setupDeepgram();
};

live();

from deepgram import DeepgramClient, DeepgramClientOptions, LiveTranscriptionEvents, LiveOptions

API_KEY = "DEEPGRAM_API_KEY"

def main():
    try:
        config = DeepgramClientOptions(
            options={"keepalive": "true"} # Comment this out to see the effect of not using keepalive
        )
        
        deepgram = DeepgramClient(API_KEY, config)

        dg_connection = deepgram.listen.live.v("1")

        def on_message(self, result, **kwargs):
            sentence = result.channel.alternatives[0].transcript
            if len(sentence) == 0:
                return
            print(f"speaker: {sentence}")

        def on_metadata(self, result, **kwargs):
            print(f"\n\n{result}\n\n")

        def on_error(self, error, **kwargs):
            print(f"\n\n{error}\n\n")

        dg_connection.on(LiveTranscriptionEvents.Transcript, on_message)
        dg_connection.on(LiveTranscriptionEvents.Metadata, on_metadata)
        dg_connection.on(LiveTranscriptionEvents.Error, on_error)

        options = LiveOptions(
            model="nova-2", 
            language="en-US", 
            smart_format=True,
        )
        
        dg_connection.start(options)

    except Exception as e:
        print(f"Could not open socket: {e}")

if __name__ == "__main__":
    main()

package main

import (
	"bufio"
	"context"
	"fmt"
	"os"

	interfaces "github.com/deepgram/deepgram-go-sdk/pkg/client/interfaces"
	client "github.com/deepgram/deepgram-go-sdk/pkg/client/live"
)

func main() {
	// init library
	client.InitWithDefault()

	// Go context
	ctx := context.Background()

	// set the Transcription options
	tOptions := interfaces.LiveTranscriptionOptions{
		Language:  "en-US",
		Punctuate: true,
	}

	// create a Deepgram client
	cOptions := interfaces.ClientOptions{
		EnableKeepAlive: true, // Comment this out to see the effect of not using keepalive
	}

	// use the default callback handler which just dumps all messages to the screen
	dgClient, err := client.New(ctx, "", cOptions, tOptions, nil)
	if err != nil {
		fmt.Println("ERROR creating LiveClient connection:", err)
		return
	}

	// connect the websocket to Deepgram
	wsconn := dgClient.Connect()
	if wsconn == nil {
		fmt.Println("Client.Connect failed")
		os.Exit(1)
	}

	// wait for user input to exit
	fmt.Printf("This demonstrates using KeepAlives. Press ENTER to exit...\n")
	input := bufio.NewScanner(os.Stdin)
	input.Scan()

	// close client
	dgClient.Stop()

	fmt.Printf("Program exiting...\n")
}

Word Timings

Word timings for transcription results returned from a stream are based on the audio sent, not the lifetime of the websocket. If you send KeepAlive messages without sending audio payloads for a period of time, then resume sending audio payloads, the timestamps for transcription results will pick up where they left off when you paused sending audio payloads.

Here is an example timeline demonstrating the behavior:

EventWall TimeWord Timing Range on Results Response
Websocket opened, begin sending audio payloads0 seconds0 seconds
Results received5 seconds0-5 seconds
Results received10 seconds5-10 seconds
Pause sending audio payloads, while sending KeepAlive messages10 secondsn/a
Resume sending audio payloads30 secondsn/a
Results received35 seconds10-15 seconds