Live Streaming Audio Transcription
The transcription.live
function encapsulates a websocket connection to the Deepgram API and returns a Node.js event emitter. Events received from the Deepgram API are emitted for consumption by your application.
Parameters
Additional options can be provided for streaming transcriptions. They are provided as an object of the transcription.live
function. Each of these parameters map to a feature in the Deepgram API. Reference the
features documentation to learn what features may be appropriate for your request.
Initiating a Connection
The transcription.live
function will return a Node.js event emitter that your application can subscribe to events on.
const deepgramLive = deepgram.transcription.live({
punctuate: true,
// additional options
});
Events
The following events are fired by the live transcription object:
Event | Description | Data |
---|---|---|
open | The websocket connection to Deepgram has been opened. | The DG live transcription object |
close | The websocket connection to Deepgram has been closed. | WebSocket.CloseEvent |
error | An error occurred with the websocket connection | Error object |
transcriptReceived | Deepgram has responded with a transcription | Transcription response |
Listening to Events
Use the addListener
function to listen for events fired on the object returned from the transcription.live
function.
deepgramLive.addListener("transcriptReceived", (transcription) => {
console.log(transcription.data);
});
Functions
The object returned by the transcription.live
function provides several functions to make using the Deepgram API easier. They are send
, getReadyState
, toggleNumerals
, and finish
.
Sending Data
The send
function sends raw audio data to the Deepgram API.
deepgramLive.send(AUDIO_STREAM_DATA);
Get Ready State
The getReadyState
function returns the ready state of the websocket connection to Deepgram.
const websocketReadyState = deepgramLive.getReadyState();
Toggle Features
The configure
function gives the ability to toggle different features. Currently, only the numerals
feature is supported with plans to add more in the future. The function takes in an object with features to toggle to determine whether or not to enable the feature.
deepgramLive.configure({numerals: true}); // enable numerals
deepgramLive.configure({numerals: false}); // disable numerals
This function is a shorthand way of sending the following data to the websocket connection:
deepgramLive.send(
JSON.stringify({
type: "Configure",
processors: {
numerals: true, // or false
},
})
);
Closing the Connection
The finish
function closes the Websocket connection to Deepgram.
deepgramLive.finish();
Browser
Using the browser compatible SDK is slightly different. You still make the call to .live()
and pass in the options you want applied to the transcription, but some of the events are different. We are abstracting the native WebSocket API.
import { Deepgram } from "@deepgram/sdk/browser";
const deepgram = new Deepgram("YOUR_API_KEY");
navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
const mediaRecorder = new MediaRecorder(stream, {
mimeType: "audio/webm",
});
const deepgramSocket = deepgram.transcription.live({
punctuate: true,
});
deepgramSocket.addEventListener("open", () => {
mediaRecorder.addEventListener("dataavailable", async (event) => {
if (event.data.size > 0 && deepgramSocket.readyState == 1) {
deepgramSocket.send(event.data);
}
});
mediaRecorder.start(1000);
});
deepgramSocket.addEventListener("message", (message) => {
const received = JSON.parse(message.data);
const transcript = received.channel.alternatives[0].transcript;
if (transcript && received.is_final) {
console.log(transcript);
}
});
});
Updated 13 days ago