Live Streaming Audio Transcription

The Deepgram Live Client provides a WebSocket API package that lets you request transcripts for real-time streaming audio. To request a transcript for a live audio stream, create a client with one of the constructor functions described on this page (NewWebSocketWithDefaults or NewWebSocket) and then connect it.

Live Transcription Parameters

The input parameters for Live/Streaming transcription consist of two parts:

  • The options that control how the live audio stream is transcribed
  • A callback for receiving notifications about transcription events

Parameter | Type | Description
ctx | Context | Go Context
options | Object | Parameters to filter requests. See below.
callback | Object | Provides asynchronous event notifications from the Deepgram Platform

Options are supplied through the LiveTranscriptionOptions struct, which you pass to the NewWebSocket function. Each field maps to a feature in the Deepgram API. Refer to the features documentation to choose the appropriate features for your request.
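
To make that mapping concrete, here is a minimal sketch of how a set of options could be encoded as query parameters on the streaming endpoint. It is illustrative only: the liveOptions struct and buildListenURL helper are hypothetical stand-ins rather than SDK types, and the endpoint wss://api.deepgram.com/v1/listen and the parameter names are assumptions based on the public streaming API. The SDK performs this encoding for you.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// liveOptions is a hypothetical, trimmed-down stand-in for the SDK's
// LiveTranscriptionOptions struct. It is not the SDK type.
type liveOptions struct {
	Language   string
	Punctuate  bool
	Encoding   string
	Channels   int
	SampleRate int
}

// buildListenURL (hypothetical helper) encodes every option that is set
// as a query parameter on the base endpoint URL.
func buildListenURL(base string, o liveOptions) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := url.Values{}
	if o.Language != "" {
		q.Set("language", o.Language)
	}
	if o.Punctuate {
		q.Set("punctuate", "true")
	}
	if o.Encoding != "" {
		q.Set("encoding", o.Encoding)
	}
	if o.Channels > 0 {
		q.Set("channels", strconv.Itoa(o.Channels))
	}
	if o.SampleRate > 0 {
		q.Set("sample_rate", strconv.Itoa(o.SampleRate))
	}
	u.RawQuery = q.Encode() // Encode sorts the keys alphabetically
	return u.String(), nil
}

func main() {
	u, err := buildListenURL("wss://api.deepgram.com/v1/listen", liveOptions{
		Language:   "en-US",
		Punctuate:  true,
		Encoding:   "linear16",
		Channels:   1,
		SampleRate: 16000,
	})
	if err != nil {
		fmt.Println("ERROR building URL:", err)
		return
	}
	fmt.Println(u)
}
```

Options left at their zero value are omitted from the URL in this sketch, so the service defaults would apply to them.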

The callback interface defines what you want to do with the transcription as it arrives in real time. You receive notifications for events such as Message (transcriptions) and Metadata.

Initiating a Connection

You can create a Live Client with the following code:

// options
transcriptOptions := interfaces.LiveTranscriptionOptions{
	Language:   "en-US",
	Punctuate:  true,
	Encoding:   "linear16",
	Channels:   1,
	SampleRate: 16000,
}

// create a callback for transcription messages
// for example, you can take a look at this example project:
// https://github.com/deepgram/deepgram-go-sdk/blob/main/examples/streaming/microphone/main.go

// create the client
dgClient, err := client.NewWebSocketWithDefaults(ctx, transcriptOptions, callback)
if err != nil {
	log.Println("ERROR creating LiveTranscription connection:", err)
	return
}

// call connect!
connected := dgClient.Connect()
if !connected {
	log.Println("Client.Connect failed")
	os.Exit(1)
}

Define Options for the Client

ClientOptions defines any options for the client. When creating a new Deepgram LiveTranscription client, pass in the optional config options.

A common choice is to enable KeepAlive by setting EnableKeepAlive:

ctx := context.Background()
apiKey := "DEEPGRAM_API_KEY"
clientOptions := interfaces.ClientOptions{
	EnableKeepAlive: true, // Enable KeepAlive option
}

transcriptOptions := interfaces.LiveTranscriptionOptions{
	Language:    "en-US",
	Model:       "nova-2",
	SmartFormat: true,
}

// Implement your own callback
callback := MyCallback{}

// Create a new Deepgram LiveTranscription client with config options
dgClient, err := client.NewWebSocket(ctx, apiKey, clientOptions, transcriptOptions, callback)
if err != nil {
	fmt.Println("ERROR creating LiveTranscription connection:", err)
	return
}

Read more about KeepAlive in this comprehensive guide.
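
As a rough picture of what a keep-alive mechanism does, the sketch below sends a small JSON text message at a fixed interval until its context is cancelled. The runKeepAlive and demoKeepAlive functions and the send hook are hypothetical names for illustration, not SDK functions, and the intervals are arbitrary demo values. With EnableKeepAlive set, the SDK is expected to handle this for you.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// keepAliveMsg is the JSON text message used to keep an idle stream open.
const keepAliveMsg = `{"type":"KeepAlive"}`

// runKeepAlive calls send(keepAliveMsg) once per interval until ctx is done
// or send fails. send is a hypothetical hook standing in for "write a text
// frame to the WebSocket"; it is not an SDK function.
func runKeepAlive(ctx context.Context, interval time.Duration, send func(string) error) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			if err := send(keepAliveMsg); err != nil {
				return err
			}
		}
	}
}

// demoKeepAlive runs the loop for totalMs milliseconds, ticking every
// intervalMs milliseconds, and reports how many messages were sent.
func demoKeepAlive(totalMs, intervalMs int) (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(totalMs)*time.Millisecond)
	defer cancel()

	sent := 0
	err := runKeepAlive(ctx, time.Duration(intervalMs)*time.Millisecond, func(msg string) error {
		sent++ // a real client would write msg to the connection here
		return nil
	})
	return sent, err
}

func main() {
	sent, err := demoKeepAlive(200, 50)
	fmt.Printf("sent %d keep-alive messages, stopped with: %v\n", sent, err)
}
```

Tying the loop to a context means it stops when the connection's context is cancelled, without a separate shutdown signal.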

Events

The live transcription client fires the following events:

Event | Description | Data
Message | Transcription event - contains transcribed audio | MessageResponse
Metadata | Metadata event - these are usually information describing the connection | MetadataResponse

Implementing an Event Callback Listener

To receive transcription events, the LiveMessageCallback interface needs to be implemented as defined below:

// LiveMessageCallback is a callback used to receive notifications for platform messages
type LiveMessageCallback interface {
	Open(or *OpenResponse) error
	Message(mr *MessageResponse) error
	Metadata(md *MetadataResponse) error
	SpeechStarted(ssr *SpeechStartedResponse) error
	UtteranceEnd(ur *UtteranceEndResponse) error
	Close(cr *CloseResponse) error
	Error(er *ErrorResponse) error
	UnhandledEvent(byData []byte) error
}
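
A minimal sketch of the kind of logic a Message handler might contain is shown below. To stay self-contained it uses simplified stand-in types (messageResponse, metadataResponse and their fields) that are assumed to mirror the shape of the SDK's responses; they are not the SDK's structs. A real implementation must provide all eight methods of the interface, and the ones you do not need can simply return nil.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Simplified stand-ins for the SDK's response types. The field names are
// assumptions made for this sketch.
type alternative struct{ Transcript string }
type channel struct{ Alternatives []alternative }
type messageResponse struct {
	Channel channel
	IsFinal bool
}
type metadataResponse struct{ RequestID string }

// transcriptCollector keeps only finalized, non-empty transcripts.
type transcriptCollector struct {
	mu        sync.Mutex
	finals    []string
	requestID string
}

// Message stores the top alternative of every final result and ignores
// interim or empty results.
func (c *transcriptCollector) Message(mr *messageResponse) error {
	if mr == nil || !mr.IsFinal || len(mr.Channel.Alternatives) == 0 {
		return nil
	}
	text := strings.TrimSpace(mr.Channel.Alternatives[0].Transcript)
	if text == "" {
		return nil
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.finals = append(c.finals, text)
	return nil
}

// Metadata remembers the request ID reported for the connection.
func (c *transcriptCollector) Metadata(md *metadataResponse) error {
	if md == nil {
		return nil
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.requestID = md.RequestID
	return nil
}

// Text returns all finalized transcripts joined by single spaces.
func (c *transcriptCollector) Text() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return strings.Join(c.finals, " ")
}

// msg builds a one-alternative message for the demo below.
func msg(text string, final bool) *messageResponse {
	return &messageResponse{Channel: channel{Alternatives: []alternative{{Transcript: text}}}, IsFinal: final}
}

func main() {
	c := &transcriptCollector{}
	_ = c.Message(msg("hello", false))      // interim result: ignored
	_ = c.Message(msg("hello world", true)) // final result: kept
	_ = c.Message(msg("", true))            // empty result: ignored
	_ = c.Message(msg("how are you", true))
	fmt.Println(c.Text())
}
```

The mutex is there because callbacks may be invoked from a different goroutine than the one that later reads the collected text.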

Sending Data/Streams

Go's standard library is strongly stream-oriented. Interfaces such as io.Reader and io.Writer are provided natively and, used well, can dramatically simplify the implementation of your client. The Live Client implements the io.Writer interface, so you can read data from your audio source and write it directly to the Deepgram client.

On the client side, this looks like:

go func() {
	// this is a blocking call, so run it in its own goroutine
	mic.Stream(dgClient)
}()
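
Because the client is an io.Writer, any io.Reader can act as the audio source, not only a microphone. The sketch below copies a source into a writer in fixed-size chunks. The streamAudio and demoStream helpers and the countingWriter type are hypothetical names for illustration; countingWriter stands in for the live client so the example runs without a connection.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// streamAudio reads src in pieces of at most chunkSize bytes and writes each
// piece to dst. It returns the number of bytes written.
func streamAudio(dst io.Writer, src io.Reader, chunkSize int) (int64, error) {
	if chunkSize <= 0 {
		return 0, errors.New("chunkSize must be positive")
	}
	buf := make([]byte, chunkSize)
	var total int64
	for {
		n, rerr := src.Read(buf)
		if n > 0 {
			w, werr := dst.Write(buf[:n])
			total += int64(w)
			if werr != nil {
				return total, werr
			}
			if w < n {
				return total, io.ErrShortWrite
			}
		}
		if rerr == io.EOF {
			return total, nil // end of the audio source
		}
		if rerr != nil {
			return total, rerr
		}
	}
}

// countingWriter is a stand-in for the live client: it records how many
// writes and bytes it received instead of sending them anywhere.
type countingWriter struct {
	writes int
	bytes  int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.writes++
	c.bytes += len(p)
	return len(p), nil
}

// demoStream streams totalBytes of silent audio through streamAudio and
// reports the bytes written and the number of Write calls.
func demoStream(totalBytes, chunkSize int) (int64, int, error) {
	sink := &countingWriter{}
	n, err := streamAudio(sink, bytes.NewReader(make([]byte, totalBytes)), chunkSize)
	return n, sink.writes, err
}

func main() {
	n, writes, err := demoStream(10000, 4096)
	fmt.Println(n, writes, err) // prints "10000 3 <nil>"
}
```

In a real program dst would be the connected client and src could be an opened audio file or an HTTP response body.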

Closing the Connection

When you are finished with the live stream, you can safely close the connection by calling Stop on the client:

dgClient.Stop()

Where To Find Additional Examples

The repository has a good collection of live audio transcription examples, and you can find links to them in the README. Each example attempts to show a different option for transcribing a live-streaming source.