Live Streaming Audio Quickstart
A quick introduction to getting transcription data from live streaming audio in real time using a web application, Deepgram's API, and Deepgram SDKs.
Prefer Spanish? See the API endpoint tutorial for streaming audio in Spanish.
Notebook
Prefer the workflow of a Python notebook? Download our Python starter notebook and be up and running quickly without having to copy or paste any code.
In this guide, you'll learn how to automatically transcribe live streaming audio in real time using Deepgram's SDKs, which are supported for use with the Deepgram API.
Before You Begin
Before you run the code, you'll need to do a few things.
Create a Deepgram Account
Before you can use Deepgram products, you'll need to create a Deepgram account. Signup is free and includes:
- $200 in credit, which gives you access to:
  - all base models
  - pre-recorded and live streaming functionality
  - all features
Create a Deepgram API Key
To access Deepgram’s API, you'll need to create a Deepgram API Key. Make note of your API Key; you will need it later.
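Hardcoding the key is fine for a quick test, but reading it from an environment variable keeps it out of your source code. Below is a minimal sketch; the variable name DEEPGRAM_API_KEY is a convention we assume here, not something the SDK requires.
# A minimal sketch: read the API key from an environment variable
# The variable name DEEPGRAM_API_KEY is an assumed convention, not an SDK requirement
import os

DEEPGRAM_API_KEY = os.environ.get('DEEPGRAM_API_KEY')
if not DEEPGRAM_API_KEY:
    raise RuntimeError('Set the DEEPGRAM_API_KEY environment variable before running.')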
Configure Environment
We provide sample scripts in Python and Node.js and assume you have already configured either a Python or Node development environment. System requirements will vary depending on the programming language you use:
- Node.js: node >= 14.14.37, cross-fetch >= 3.1.5
- Python: python >= 3.7, aiohttp >= 3.8.1
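If you want to confirm the Python side of these requirements before running the sample, a quick check like the sketch below works; for Node, running node --version from your terminal serves the same purpose.
# A minimal sketch: verify the Python-side requirements listed above
import sys
import aiohttp

assert sys.version_info >= (3, 7), 'Python 3.7 or newer is required'
print(f'python {sys.version.split()[0]}, aiohttp {aiohttp.__version__}')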
If you get stuck at any point, help is just a click away. Contact us.
Transcribe Audio
Once you have your API Key, it's time to transcribe audio!
If you want a quick way to get up and running, our Python streaming test suite provides sample code to help you get started. Before beginning to build your own integration, we recommend running the test suite code at least once to ensure you can successfully stream sample audio to Deepgram.
If you'd like to follow a step-by-step tutorial, our instructions below will guide you through the process of creating a sample application, installing the Deepgram SDK, configuring code with your own Deepgram API Key and streaming audio to transcribe, and finally, building and running the application.
Install the SDK
Open your terminal, navigate to the location on your drive where you want to create your project, and install the Deepgram SDK:
# Install the Deepgram Python SDK
# https://github.com/deepgram/deepgram-python-sdk
pip install deepgram-sdk
# Initialize a new application
npm init
# Install the Deepgram Node.js SDK
# https://github.com/deepgram/node-sdk
npm install @deepgram/sdk
Add Dependencies
Add necessary external dependencies to your project.
# Install aiohttp: HTTP client/server for asyncio that allows you to write asynchronous clients and servers, and supports WebSockets.
pip install aiohttp
# Install cross-fetch: Platform-agnostic Fetch API with typescript support, a simple interface, and optional polyfill.
npm install cross-fetch
Write the Code
In your terminal, create a new file in your project directory and populate it with the following code.
The following example includes the parameter model=nova, which tells the API to use Deepgram's most powerful and affordable model. Removing this parameter will result in the API using the default model, which is currently model=general.
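Under the hood, the SDK turns these options into query parameters on Deepgram's streaming websocket endpoint, wss://api.deepgram.com/v1/listen. The sketch below illustrates that mapping; the SDK handles this serialization for you, so it's shown only to make the parameters concrete.
# An illustrative sketch: the SDK serializes live options into the
# query string of the streaming endpoint, roughly like this
from urllib.parse import urlencode

options = {'punctuate': 'true', 'interim_results': 'false', 'language': 'en-GB', 'model': 'nova'}
print(f"wss://api.deepgram.com/v1/listen?{urlencode(options)}")
# wss://api.deepgram.com/v1/listen?punctuate=true&interim_results=false&language=en-GB&model=nova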
# Example filename: deepgram_test.py
from deepgram import Deepgram
import asyncio
import aiohttp

# Your Deepgram API Key
DEEPGRAM_API_KEY = 'YOUR_DEEPGRAM_API_KEY'

# URL for the realtime streaming audio you would like to transcribe
# The URL for the sample resource changes depending on whether the user is outside or inside the UK
# Outside the UK
URL = 'http://stream.live.vc.bbcmedia.co.uk/bbc_radio_fourlw_online_nonuk'
# Inside the UK
# URL = 'http://stream.live.vc.bbcmedia.co.uk/bbc_radio_fourfm'

async def main():
    # Initialize the Deepgram SDK
    deepgram = Deepgram(DEEPGRAM_API_KEY)

    # Create a websocket connection to Deepgram
    # In this example, punctuation is turned on, interim results are turned off,
    # and language is set to UK English.
    try:
        deepgramLive = await deepgram.transcription.live({
            'punctuate': True,
            'interim_results': False,
            'language': 'en-GB',
            'model': 'nova',
        })
    except Exception as e:
        print(f'Could not open socket: {e}')
        return

    # Listen for the connection to close
    deepgramLive.registerHandler(deepgramLive.event.CLOSE, lambda c: print(f'Connection closed with code {c}.'))

    # Listen for any transcripts received from Deepgram and write them to the console
    deepgramLive.registerHandler(deepgramLive.event.TRANSCRIPT_RECEIVED, print)

    # Listen for the connection to open and send streaming audio from the URL to Deepgram
    async with aiohttp.ClientSession() as session:
        async with session.get(URL) as audio:
            while True:
                data = await audio.content.readany()
                # If no data is coming from the live stream, break out of the loop
                # before sending an empty payload
                if not data:
                    break
                deepgramLive.send(data)

    # Indicate that we've finished sending data by sending the customary zero-byte
    # message to the Deepgram streaming endpoint, and wait until we get back the
    # final summary metadata object
    await deepgramLive.finish()

# If running in a Jupyter notebook, Jupyter is already running an event loop,
# so run main with this line instead:
# await main()
asyncio.run(main())
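The Python example registers print as its transcript handler, which writes each full response to the console. If you only want the transcript text, you can register a small handler of your own instead; a minimal sketch, assuming the handler receives the parsed response as a dict in Deepgram's documented channel.alternatives shape:
# A minimal sketch: print only the transcript text
# Assumes the handler receives the parsed response as a dict; some messages
# (such as the final metadata object) carry no transcript and are skipped
def print_transcript(response):
    channel = response.get('channel')
    if not channel:
        return
    transcript = channel['alternatives'][0]['transcript']
    if transcript:
        print(transcript)

deepgramLive.registerHandler(deepgramLive.event.TRANSCRIPT_RECEIVED, print_transcript)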
// Example filename: index.js
const { Deepgram } = require("@deepgram/sdk");
const fetch = require("cross-fetch");

// Your Deepgram API Key
const deepgramApiKey = "YOUR_DEEPGRAM_API_KEY";

// URL for the audio you would like to stream
// The URL for the sample resource changes depending on whether the user is outside or inside the UK
// Outside the UK
const url = "http://stream.live.vc.bbcmedia.co.uk/bbc_radio_fourlw_online_nonuk";
// Inside the UK
// const url = "http://stream.live.vc.bbcmedia.co.uk/bbc_radio_fourfm";

// Initialize the Deepgram SDK
const deepgram = new Deepgram(deepgramApiKey);

// Create a websocket connection to Deepgram
// In this example, punctuation is turned on, interim results are turned off,
// and language is set to UK English.
const deepgramLive = deepgram.transcription.live({
  punctuate: true,
  interim_results: false,
  language: "en-GB",
  model: "nova",
});

// Listen for the connection to open and send streaming audio from the URL to Deepgram
fetch(url)
  .then((r) => r.body)
  .then((res) => {
    res.on("readable", () => {
      if (deepgramLive.getReadyState() === 1) {
        deepgramLive.send(res.read());
      }
    });
  });

// Listen for the connection to close
deepgramLive.addListener("close", () => {
  console.log("Connection closed.");
});

// Listen for any transcripts received from Deepgram and write them to the console
deepgramLive.addListener("transcriptReceived", (message) => {
  const data = JSON.parse(message);
  // Write the entire channel response to the console
  console.dir(data.channel, { depth: null });
  // Write only the transcript to the console
  // console.dir(data.channel.alternatives[0].transcript, { depth: null });
});
Be sure to replace YOUR_DEEPGRAM_API_KEY with your Deepgram API Key.
Start the Application
Run the application from the terminal.
# Run your application using the file you created in the previous step
# Example: python deepgram_test.py
python YOUR_PROJECT_NAME.py
# Run your application using the file you created in the previous step
# Example: node index.js
node YOUR_PROJECT_NAME.js
Be sure to replace YOUR_PROJECT_NAME with the name of the file to which you saved the code in the previous step.
See Results
Your transcripts will appear in your terminal.
Deepgram does not store transcriptions. Make sure to save output or return transcriptions to a callback URL for custom processing.
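One way to do that is to persist transcripts as they arrive. A minimal sketch, reusing the Python handler pattern from above; the file name and line-per-transcript format are our choices, not anything Deepgram prescribes:
# A minimal sketch: append each transcript to a local file,
# since Deepgram does not store transcriptions for you
def save_transcript(response):
    channel = response.get('channel')
    if not channel:
        return
    transcript = channel['alternatives'][0]['transcript']
    if transcript:
        with open('transcripts.txt', 'a') as f:
            f.write(transcript + '\n')

deepgramLive.registerHandler(deepgramLive.event.TRANSCRIPT_RECEIVED, save_transcript)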
By default, Deepgram live streaming looks for any deviation in the natural flow of speech and returns a finalized response at these places. To learn more about this feature, see Endpointing.
Deepgram live streaming can also return a series of interim transcripts followed by a final transcript. To learn more, see Interim Results.
Endpointing can be used with Deepgram's Interim Results feature. To compare and contrast these features, and to explore best practices for using them together, see Using Endpointing and Interim Results with Live Streaming Audio.
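When interim results are enabled, each streaming response includes an is_final flag that distinguishes interim guesses from finalized text. A minimal sketch of a handler that uses it, assuming the documented response shape:
# A minimal sketch: with 'interim_results': True in the live options,
# use the is_final flag to separate interim guesses from finalized text
def handle_result(response):
    channel = response.get('channel')
    if not channel:
        return
    transcript = channel['alternatives'][0]['transcript']
    label = 'final' if response.get('is_final') else 'interim'
    print(f'[{label}] {transcript}')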
What's Next?
Now that you've gotten transcripts for streaming audio, enhance your knowledge by exploring the following areas.
Customize Transcripts
To customize the transcripts you receive, you can send a variety of parameters to the Deepgram API.
For example, if your audio is in Spanish rather than UK English, you can pass the language parameter with the es option to the transcription.live method in the previous examples.
Not all languages work with all available models. Be sure to check out the Languages page to see which models are compatible with which languages.
deepgramLive = await deepgram.transcription.live({
    'punctuate': True,
    'interim_results': False,
    'language': 'es',
})
const deepgramLive = deepgram.transcription.live({
  punctuate: true,
  interim_results: false,
  language: "es",
});
To learn more about the languages available with Deepgram, see the Language feature guide. To learn more about the many ways you can customize your results with Deepgram's API, check out the Deepgram API Reference.
Add Your Audio
Ready to connect Deepgram to your own audio source? Start by reviewing how to determine your audio format and format your API request accordingly.
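Containerized formats such as the MP3 stream in this guide are detected automatically, but raw (headerless) audio requires you to state the format explicitly with the encoding and sample_rate parameters. A minimal sketch, with values that assume 16-bit little-endian PCM at 16 kHz, mono:
# A minimal sketch: raw audio needs explicit format parameters
# The values below are an assumption for 16-bit PCM at 16 kHz, mono
deepgramLive = await deepgram.transcription.live({
    'punctuate': True,
    'model': 'nova',
    'encoding': 'linear16',
    'sample_rate': 16000,
    'channels': 1,
})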
Then, you'll want to check out our streaming test suite. The streaming test suite is the perfect "102" introduction to integrating your own audio.
Explore Use Cases
Time to learn about the different ways you can use Deepgram products to help you meet your business objectives. Explore Deepgram's use cases.
Transcribe Pre-recorded Audio
Now that you know how to transcribe streaming audio, check out how you can use Deepgram to transcribe pre-recorded audio. To learn more, see Getting Started with Pre-recorded Audio.