Live Streaming Audio Transcription
An overview of the Deepgram Python SDK and Deepgram speech-to-text live streaming.
Declaring a Deepgram Websocket
The listen.live class creates a websocket connection to the Deepgram API.
```python
# Configure the DeepgramClientOptions to enable KeepAlive for maintaining
# the WebSocket connection (only if necessary for your scenario)
config = DeepgramClientOptions(
    options={"keepalive": "true"}
)

# Create a client using the DEEPGRAM_API_KEY from environment variables
deepgram = DeepgramClient(API_KEY, config)

# Use the listen.live class to create the websocket connection
dg_connection = deepgram.listen.live.v("1")
```
If your scenario requires the connection to stay open even while no data is being sent to Deepgram, you can send periodic KeepAlive messages to essentially "pause" the connection without closing it. Setting "keepalive": "true" in the DeepgramClientOptions object enables KeepAlive to maintain the WebSocket connection, ensuring a more stable and persistent connection with Deepgram's servers.
Read more about KeepAlive in this comprehensive guide
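If you manage the idle period yourself rather than relying on the SDK, the KeepAlive message is, per Deepgram's KeepAlive guide, a small JSON text frame. A minimal sketch of building one (the commented-out send call is an assumption about how you might wire it up):

```python
import json

def build_keepalive() -> str:
    """Build the JSON text message used to keep the websocket open."""
    return json.dumps({"type": "KeepAlive"})

# Hypothetical usage while the audio source is idle:
# dg_connection.send(build_keepalive())
```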
Parameters
Additional options can be provided for streaming transcriptions when the websocket start() function is called. They are provided by declaring a LiveOptions object. Each of these parameters maps to a feature in the Deepgram API. Reference the features documentation to learn which features may be appropriate for your request.
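Because each option maps to an API feature, the key/value pairs you declare ultimately become query parameters on the streaming endpoint. Purely as an illustration (this helper is not part of the SDK), the mapping could be sketched like this:

```python
from urllib.parse import urlencode

def options_to_query(options: dict) -> str:
    """Illustrative only: serialize option key/value pairs into a query string."""
    # Booleans are sent as lowercase strings, matching the API's convention.
    normalized = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in options.items()
    }
    return urlencode(normalized)

print(options_to_query({"punctuate": True, "language": "en-GB"}))
# → punctuate=true&language=en-GB
```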
Events and Callbacks
The following events are fired by the live transcription object:
| Event | Description | Data |
|---|---|---|
| Metadata | Metadata (or information) regarding the websocket connection | Metadata object |
| Error | An error occurred with the websocket connection | Error object |
| Results | Deepgram has responded with a transcription | Transcription Response |
Listening to Events
Use the on function to listen for events fired by the websocket object.
Listen for any transcripts to be received and receive a callback to your declared function called on_message.
```python
def on_message(self, result, **kwargs):
    sentence = result.channel.alternatives[0].transcript
    if len(sentence) == 0:
        return
    print(f"Transcription: {sentence}")

dg_connection.on(LiveTranscriptionEvents.Transcript, on_message)
```
Listen for any errors/exceptions and receive a callback to your declared function called on_error.
```python
def on_error(self, error, **kwargs):
    print(f"Error: {error}")

dg_connection.on(LiveTranscriptionEvents.Error, on_error)
```
Connecting Your Websocket
After you have declared your callbacks, declare a LiveOptions object with the transcription parameters you want to use for your websocket connection. These options are passed into the start() function, which subsequently connects the websocket to the Deepgram API.
Please note that options such as interim_results, language, etc. are declared in LiveOptions and passed in as key/value pairs.
```python
options = LiveOptions(
    punctuate=True,
    interim_results=False,
    language='en-GB'
)

dg_connection.start(options)
```
Functions
The Deepgram websocket class provides several functions to make using the Deepgram API easier. The most notable ones are send and finish.
Sending Audio Stream Bytes
The send function sends raw audio data to the Deepgram API.

```python
dg_connection.send(SOME_STREAMING_DATA)
```
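In practice, the audio source usually arrives as a stream that you forward in small pieces. A minimal sketch of the chunking side (the chunk size and file name are arbitrary assumptions; only dg_connection.send is SDK API):

```python
def iter_chunks(data: bytes, chunk_size: int = 4096):
    """Yield successive fixed-size chunks of raw audio bytes."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]

# Hypothetical usage with an already-started connection:
# with open("audio.raw", "rb") as f:
#     for chunk in iter_chunks(f.read()):
#         dg_connection.send(chunk)
```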
When transcription results are available, you will receive them via the callback function registered earlier (shown again below).

```python
def on_message(self, result, **kwargs):
    sentence = result.channel.alternatives[0].transcript
    if len(sentence) == 0:
        return
    print(f"Transcription: {sentence}")

dg_connection.on(LiveTranscriptionEvents.Transcript, on_message)
```
Closing the Connection
The finish function closes the websocket connection to Deepgram.

```python
dg_connection.finish()
```
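Because finish() should run even when your audio loop raises, one option is a small context-manager wrapper around the connection. This wrapper is illustrative, not part of the SDK; it assumes only the start() and finish() calls documented above:

```python
from contextlib import contextmanager

@contextmanager
def live_session(connection, options):
    """Start the connection, yield it, and always close it on exit."""
    connection.start(options)
    try:
        yield connection
    finally:
        # Runs even if the body of the with-block raises.
        connection.finish()

# Hypothetical usage:
# with live_session(dg_connection, options) as conn:
#     conn.send(SOME_STREAMING_DATA)
```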
Where To Find Additional Examples
The repository has a good collection of live audio transcription examples. You can find links to them in the README. Each example below demonstrates a different way to transcribe a live-streaming source.
- From a Microphone - examples/streaming/microphone
- From an HTTP Endpoint - examples/streaming/http