Live Streaming Audio Transcription
The Deepgram Live Client package provides a WebSocket API that allows you to request transcripts for real-time streaming audio. To request a transcript for live streaming audio, you'll use one of the functions described below.
Live Transcription Parameters
The input parameters for live/streaming transcription consist of two parts:
- The options that control how the live audio stream is transcribed
- A callback for receiving notifications of transcription events
Parameter | Type | Description |
---|---|---|
ctx | Context | Go Context |
options | Object | Parameters to filter requests. See below. |
callback | Object | Provides asynchronous event notifications from the Deepgram Platform |
Options are provided via the LiveTranscriptionOptions struct, which is passed to the NewWebSocket function. Each of these parameters maps to a feature in the Deepgram API. Reference the features documentation to learn the appropriate features for your request.
The callback interface defines what you want to do with the transcription as it happens in real time. The two main events you can receive notifications for are Message (transcriptions) and Metadata.
Initiating a Connection
Creating a Live Client can be done by using the following code:
// options
transcriptOptions := interfaces.LiveTranscriptionOptions{
	Language:   "en-US",
	Punctuate:  true,
	Encoding:   "linear16",
	Channels:   1,
	SampleRate: 16000,
}

// create a callback for transcription messages
// for example, you can take a look at this example project:
// https://github.com/deepgram/deepgram-go-sdk/blob/main/examples/streaming/microphone/main.go

// create the client
dgClient, err := client.NewWebSocketWithDefaults(ctx, transcriptOptions, callback)
if err != nil {
	log.Println("ERROR creating LiveTranscription connection:", err)
	return
}

// call connect!
bConnected := dgClient.Connect()
if !bConnected {
	log.Println("Client.Connect failed")
	os.Exit(1)
}
Define Options for the Client
ClientOptions defines any options for the client. When creating a new Deepgram LiveTranscription client, pass in the optional config options.
A common config option is EnableKeepAlive, which turns on the KeepAlive feature:
ctx := context.Background()
apiKey := "DEEPGRAM_API_KEY"

clientOptions := interfaces.ClientOptions{
	EnableKeepAlive: true, // Enable KeepAlive option
}

transcriptOptions := interfaces.LiveTranscriptionOptions{
	Language:    "en-US",
	Model:       "nova-2",
	SmartFormat: true,
}

// Implement your own callback
callback := MyCallback{}

// Create a new Deepgram LiveTranscription client with config options
dgClient, err := client.NewWebSocket(ctx, apiKey, clientOptions, transcriptOptions, callback)
if err != nil {
	fmt.Println("ERROR creating LiveTranscription connection:", err)
	return
}
Read more about KeepAlive in this comprehensive guide.
Events
The live transcription client fires the following events:
Event | Description | Data |
---|---|---|
Message | Transcription event - contains transcribed audio | MessageResponse |
Metadata | Metadata event - these are usually information describing the connection | MetadataResponse |
Implementing an Event Callback Listener
To receive transcription events, the LiveMessageCallback interface needs to be implemented as defined below:
// LiveMessageCallback is a callback used to receive notifications for platform messages
type LiveMessageCallback interface {
	Open(or *OpenResponse) error
	Message(mr *MessageResponse) error
	Metadata(md *MetadataResponse) error
	SpeechStarted(ssr *SpeechStartedResponse) error
	UtteranceEnd(ur *UtteranceEndResponse) error
	Close(cr *CloseResponse) error
	Error(er *ErrorResponse) error
	UnhandledEvent(byData []byte) error
}
Sending Data/Streams
Go is a very stream-oriented language. Many helpful interfaces are provided natively and, if used correctly, dramatically simplify the implementation of your client. The Live Client implements the io.Writer interface, which means you can read data from your audio source and write it directly to the Deepgram client.
What this ends up looking like on the client side is:
go func() {
	// this is a blocking call
	mic.Stream(dgClient)
}()
Closing the Connection
When finished with the live stream, you can safely close the connection by calling the following function on the client.
dgClient.Stop()
Where To Find Additional Examples
The repository has a good collection of live audio transcription examples, and you can find links to them in the README. Each example below shows a different way to transcribe a live streaming source.
- From a Microphone - examples/streaming/microphone
- From a HTTP Endpoint - examples/streaming/http