Stream Audio to Deepgram

Last updated 08/03/2021

In this guide, you'll learn how to automatically transcribe streaming audio using Deepgram's off-the-shelf, general-purpose AI model. To learn more about the other available models, see Deepgram API Reference: Query Parameters - model.

The examples in this guide use Python and Node scripts rather than Deepgram SDKs. To learn how to stream real-time audio to Deepgram using our SDKs, see Quickstart: Get Started with Streaming Audio.

Before You Begin

Before you can use Deepgram products, you'll need to create a Deepgram account. Signup is free and includes:

  • $150 in credit, which gives you access to:
    • all base models
    • pre-recorded and streaming functionality
    • all features

To access Deepgram’s API, you'll need to create a Deepgram API Key. Make note of your API Key; you will need it later.

In this guide, we use an audio recording of an interview with Scott Stephenson, Deepgram’s CEO. If you would like to follow along with the examples using this audio file, you can download it.

We provide sample scripts in Node.js and Python and assume you have already configured a Node.js or Python (3.6 or greater) development environment.

Real-time streaming uses WebSockets, a communications protocol that enables full-duplex communication: you can stream new audio to Deepgram while the latest transcription results stream back to you. A wide variety of third-party client libraries make WebSockets straightforward to use across a range of languages and production environments.
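To make the idea of full-duplex communication concrete, here is a minimal sketch that uses no network at all. Two in-memory asyncio queues stand in for the two directions of a connection, and `fake_server` is a hypothetical stand-in for the remote end, not part of Deepgram's API. It only illustrates that sending and receiving can proceed concurrently. It uses `asyncio.run`, which assumes Python 3.7 or greater.

```python
import asyncio


async def fake_server(uplink, downlink):
    """Hypothetical remote end: reply to each chunk as soon as it arrives."""
    received = 0
    while True:
        chunk = await uplink.get()
        if chunk == b"":  # an empty message marks the end of the audio
            break
        received += len(chunk)
        await downlink.put(f"received {received} bytes so far")
    await downlink.put(None)  # tell the receiver that no more results follow


async def sender(uplink, audio, chunk_size):
    """Push audio upstream chunk by chunk, yielding control between sends."""
    for start in range(0, len(audio), chunk_size):
        await uplink.put(audio[start:start + chunk_size])
        await asyncio.sleep(0)  # give the other tasks a chance to run
    await uplink.put(b"")


async def receiver(downlink, results):
    """Collect results while the sender may still be sending."""
    while True:
        msg = await downlink.get()
        if msg is None:
            break
        results.append(msg)


async def demo():
    uplink, downlink = asyncio.Queue(), asyncio.Queue()
    results = []
    # All three coroutines run concurrently on the same event loop.
    await asyncio.gather(
        fake_server(uplink, downlink),
        sender(uplink, b"\x00" * 25, chunk_size=10),
        receiver(downlink, results),
    )
    return results


if __name__ == "__main__":
    for line in asyncio.run(demo()):
        print(line)
```

The sender and receiver never wait on each other directly, which is the same shape the streaming scripts in this guide follow with a real WebSocket.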

For Node.js, we use ws.

For Python, we use websockets. Additional dependencies for Python include scipy.

Configure the Script

In this guide, we stream an audio file to Deepgram’s API. To recreate the transcripts we use in this guide, you can download either a Node.js or Python version of the script.

  1. Download the Python version of our streaming script, and open it in your favorite editor. You will see the following:
import asyncio
import sys

import websockets
import scipy.io.wavfile  # Listed as a dependency; not used by this minimal example.


async def run():
    # Make sure your audio file is in the same directory as this script.
    with open('interview_speech-analytics.wav', 'rb') as fh:
        data = fh.read()
    # Replace with your Deepgram API key.
    extra_headers = {
        'Authorization': 'Token YOUR_DEEPGRAM_API_KEY'
    }
    # Include the real-time streaming endpoint for the Deepgram API.
    # Note: newer websockets releases may expect additional_headers instead of extra_headers.
    async with websockets.connect('wss://api.deepgram.com/v1/listen?endpointing=false', extra_headers=extra_headers) as ws:
        async def sender(ws):
            # Send the file in 10,000-byte chunks, then an empty message
            # to signal that no more audio will follow.
            try:
                for start in range(0, len(data), 10000):
                    await ws.send(data[start:start + 10000])
                await ws.send(b'')
            except Exception as e:
                print('Error while sending:')
                print(e)
                raise

        async def receiver(ws):
            # Print each message as it arrives.
            async for msg in ws:
                print(msg)

        # Run the sender and receiver concurrently over the same connection.
        await asyncio.wait([
            asyncio.ensure_future(sender(ws)),
            asyncio.ensure_future(receiver(ws))
        ])


def main():
    loop = asyncio.get_event_loop()
    loop.run_until_complete(run())


if __name__ == '__main__':
    sys.exit(main() or 0)

Be sure to replace the placeholder YOUR_DEEPGRAM_API_KEY with the API Key you created earlier in this tutorial.

  2. Ensure the specified audio file is in the same directory as the example script.
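Rather than pasting the API Key into the script, you may prefer to read it from the environment and build the endpoint URL from its parts. The following is a minimal sketch of that idea. The environment variable name `DEEPGRAM_API_KEY` and the helper names `build_url` and `build_headers` are assumptions made for illustration; only the endpoint, the `endpointing=false` parameter, and the `Token` authorization scheme come from the script in this guide.

```python
import os
from urllib.parse import urlencode

# Endpoint taken from the streaming script in this guide.
DEEPGRAM_LISTEN_URL = "wss://api.deepgram.com/v1/listen"


def build_url(**params):
    """Append query parameters (if any) to the streaming endpoint."""
    query = urlencode(params)
    return f"{DEEPGRAM_LISTEN_URL}?{query}" if query else DEEPGRAM_LISTEN_URL


def build_headers(env_var="DEEPGRAM_API_KEY"):
    """Build the Authorization header from an environment variable.

    The variable name is a hypothetical convention, not a Deepgram requirement.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return {"Authorization": f"Token {api_key}"}


if __name__ == "__main__":
    print(build_url(endpointing="false"))
```

The values returned by `build_url(endpointing="false")` and `build_headers()` could then be passed to `websockets.connect` in place of the hard-coded URL and header dictionary.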
  1. Download the Node.js version of our streaming script, and open it in your favorite editor. You will see the following:
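const fs = require('fs');
const WebSocket = require('ws');

// Include the real-time streaming endpoint for the Deepgram API.
const ws = new WebSocket('wss://api.deepgram.com/v1/listen?endpointing=false', {
  // Replace with your Deepgram project's API Key.
  headers: {
    Authorization: 'Token YOUR_DEEPGRAM_API_KEY',
  },
});

ws.on('open', function open() {
  console.log('Will send audio file');
  // Audio file to stream. Make sure it is in the same directory as this script.
  const reader = fs.createReadStream('interview_speech-analytics.wav');
  // Forward each chunk to Deepgram as it is read from disk.
  reader.on('data', function (chunk) {
    ws.send(chunk);
  });
  // Once the file is exhausted, send an empty message, as the Python script does.
  reader.on('end', function () {
    ws.send(Buffer.alloc(0));
  });
});

ws.on('message', function incoming(data) {
  // Depending on the ws version, data may be a Buffer; convert it to text.
  console.log(data.toString());
});

ws.on('error', function (err) {
  console.error('WebSocket error:', err.message);
});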

Be sure to replace the placeholder YOUR_DEEPGRAM_API_KEY with the API Key you created earlier in this tutorial.

  2. Ensure the specified audio file is in the same directory as the example script.

Run the Script

To run the script, use one of the following commands:

$ python3 deepgram-streaming-example.py
$ node deepgram-streaming-example.js

When run, the script sends the audio to Deepgram's real-time streaming endpoint and prints the output to the screen:

{"channel_index":[0,1],"duration":1.039875,"start":0.0,"is_final":false,"channel":{"alternatives":[{"transcript":"another big","confidence":0.9600255,"words":[{"word":"another","start":0.2971154,"end":0.7971154,"confidence":0.9588303},{"word":"big","start":0.85173076,"end":1.039875,"confidence":0.9600255}]}]},"metadata":{"request_id":"NkhdNwwDJWBlOvhcFlutX1g8iqmZSLz6"}}
{"channel_index":[0,1],"duration":2.039875,"start":0.0,"is_final":false,"channel":{"alternatives":[{"transcript":"another big problem","confidence":0.9939844,"words":[{"word":"another","start":0.29852942,"end":0.7985294,"confidence":0.9939844},{"word":"big","start":0.8557843,"end":1.3557843,"confidence":0.98220366},{"word":"problem","start":1.5722549,"end":2.039875,"confidence":0.9953441}]}]},"metadata":{"request_id":"NkhdNwwDJWBlOvhcFlutX1g8iqmZSLz6"}}
...
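
Each line of output is a JSON message, so a script can pull out just the fields it needs instead of printing the raw text. The sketch below reads the `is_final`, `transcript`, and `confidence` fields that appear in the output above. The `summarize` helper is hypothetical, the sample is a trimmed copy of the second message (the `words` and `metadata` fields are omitted for brevity), and taking the first entry in `alternatives` is an assumption about which alternative you want.

```python
import json


def summarize(message):
    """Return (is_final, transcript, confidence) for one streaming message."""
    result = json.loads(message)
    is_final = result.get("is_final", False)
    alternatives = result.get("channel", {}).get("alternatives", [])
    if not alternatives:
        # No transcript in this message; report an empty result.
        return is_final, "", 0.0
    first = alternatives[0]
    return is_final, first["transcript"], first["confidence"]


# Trimmed copy of the second message shown in the output above.
sample = (
    '{"channel_index":[0,1],"duration":2.039875,"start":0.0,"is_final":false,'
    '"channel":{"alternatives":[{"transcript":"another big problem",'
    '"confidence":0.9939844}]}}'
)

if __name__ == "__main__":
    is_final, transcript, confidence = summarize(sample)
    status = "final" if is_final else "interim"
    print(f"[{status}] {transcript} ({confidence:.2f})")
```

In the Python streaming script, a call such as `summarize(msg)` could replace `print(msg)` inside the receiver.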