Stream Audio to Deepgram

Last updated 06/18/2021

In this guide, you'll learn how to automatically transcribe streaming audio using Deepgram's off-the-shelf, general-purpose AI model. To learn about the other available models, see Speech Engine API Reference: Query Parameters - model.

Before You Begin

Before you can use Deepgram products, you'll need to create a Deepgram account.

Your account comes preloaded with:

  • 20 audio hours per month of Automatic Speech Recognition
  • Access to 3 of Deepgram’s off-the-shelf Beginner models

In this guide, we use an audio recording of an interview with Scott Stephenson, Deepgram’s CEO. If you would like to follow along with the examples using this audio file, you can download it.

We provide sample scripts in Node.js and Python, and assume you have already configured a Node.js or Python (3.6 or greater) development environment.

Real-time streaming uses WebSockets, a communications protocol that enables full-duplex communication: you can stream new audio to Deepgram while the latest transcription results stream back to you. Third-party WebSocket client libraries are available for a wide range of languages and production environments.

For Node.js, we use ws.

For Python, we use websockets. Additional dependencies for Python include scipy.
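
The full-duplex pattern described above can be sketched without a network connection. The following is a minimal illustration only: `fake_service`, `run_demo`, and the two in-memory queues are hypothetical stand-ins for a WebSocket connection and are not part of Deepgram's API. It shows a sender and a receiver running concurrently, so replies can arrive while audio is still being sent.

```python
import asyncio


async def fake_service(audio_in, results_out):
    """Hypothetical stand-in for a remote endpoint: one reply per chunk."""
    received = 0
    while True:
        chunk = await audio_in.get()
        if chunk == b'':
            # An empty chunk means the client has finished sending.
            await results_out.put(None)
            return
        received += len(chunk)
        await results_out.put('processed {} bytes so far'.format(received))


async def sender(audio_in, data, chunk_size, log):
    """Send the data in fixed-size chunks, then an empty chunk."""
    for offset in range(0, len(data), chunk_size):
        await audio_in.put(data[offset:offset + chunk_size])
        log.append('sent')
        # Yield control so the other coroutines can run between chunks.
        await asyncio.sleep(0)
    await audio_in.put(b'')


async def receiver(results_out, log):
    """Record replies until the service signals that it is done."""
    while True:
        message = await results_out.get()
        if message is None:
            return
        log.append('received: ' + message)


async def demo(data, chunk_size):
    audio_in, results_out = asyncio.Queue(), asyncio.Queue()
    log = []
    # All three coroutines share one event loop and make progress together.
    await asyncio.gather(
        fake_service(audio_in, results_out),
        sender(audio_in, data, chunk_size, log),
        receiver(results_out, log),
    )
    return log


def run_demo(data=b'\x00' * 25000, chunk_size=10000):
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(demo(data, chunk_size))
    finally:
        loop.close()
```

Calling `run_demo()` returns a log of send and receive events. The sample scripts in this guide use the same sender and receiver structure over a real connection.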

Configure the Script

In this guide, we stream an audio file to Deepgram’s API. To recreate the transcripts we use in this guide, you can download either a Node.js or Python version of the script.

  1. Download the Python version of our streaming script, and open it in your favorite editor. You will see the following:
import asyncio
import base64
import json
import sys
import websockets
import scipy.io.wavfile
import time
 
async def run():
  # Make sure your audio file is in the same directory as this script.
  with open('interview_speech-analytics.wav', 'rb') as fh:
      data = fh.read()
  # Replace with your Deepgram account username (or email address you used to sign up) and password.
  auth = ('DEEPGRAM_USERNAME', 'DEEPGRAM_PASSWORD')
  extra_headers = {
      'Authorization': 'Basic {}'.format(
          base64.b64encode('{}:{}'.format(*auth).encode('utf-8')).decode('utf-8')
      )
  }
  # Include the real-time streaming endpoint for the Deepgram API
  async with websockets.connect('wss://brain.deepgram.com/v2/listen/stream?endpointing=false', extra_headers=extra_headers) as ws:
      async def sender(ws):
          try:
              nonlocal data
              # Send the file in 10,000-byte chunks.
              while data:
                  chunk, data = data[:10000], data[10000:]
                  await ws.send(chunk)
              # Send an empty message after the last chunk to mark the end of the audio.
              await ws.send(b'')
          except Exception as e:
              print('Error while sending:')
              print(e)
              raise
      async def receiver(ws):
          async for msg in ws:
              print(msg)
      await asyncio.wait([
          asyncio.ensure_future(sender(ws)),
          asyncio.ensure_future(receiver(ws))
      ])
def main():
  # Run the streaming coroutine until both the sender and the receiver finish.
  loop = asyncio.get_event_loop()
  loop.run_until_complete(run())
if __name__ == '__main__':
  sys.exit(main() or 0)

  2. Replace the authentication information with your Deepgram username and password.

  3. Ensure the specified audio file is in the same directory as the example script.
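
To check the authentication value before connecting, the header construction from the script can be pulled into a small function. This is a sketch only: `basic_auth_header` is a hypothetical helper name, and the credentials shown are placeholders.

```python
import base64


def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header from a username and password."""
    credentials = '{}:{}'.format(username, password).encode('utf-8')
    token = base64.b64encode(credentials).decode('utf-8')
    return {'Authorization': 'Basic {}'.format(token)}


# Placeholder credentials, for illustration only.
print(basic_auth_header('user', 'pass'))
```

The returned dictionary has the same shape as the `extra_headers` value that the script passes to `websockets.connect`.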

  1. Download the Node.js version of our streaming script, and open it in your favorite editor. You will see the following:
const WebSocket = require('ws');
// Include the real-time streaming endpoint for the Deepgram API.
const ws = new WebSocket('wss://brain.deepgram.com/v2/listen/stream?endpointing=false',
// Replace with your Deepgram account’s base64-encoded username:password.
// Your Deepgram username is the email address you used to sign up.
// Remove any appended padding characters (=).
 ['Basic', 'YOUR_BASE64ENCODED_DEEPGRAM_USERNAME:PASSWORD']
);
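ws.on('open', function open() {
 console.log('Will send audio file');
 const fs = require('fs');
 // Audio file to stream. Make sure it is in the same directory as this script.
 const reader = fs.createReadStream('interview_speech-analytics.wav');
 // Each chunk is already a Buffer, so it can be sent as a binary message as-is.
 reader.on('data', function (chunk) {
   ws.send(chunk);
 });
 // As in the Python script, send an empty message once the whole file has been read.
 reader.on('end', function () {
   ws.send(Buffer.alloc(0));
 });
});
ws.on('message', function incoming(data) {
 // Convert to a string so the JSON prints as text even if it arrives as a Buffer.
 console.log(data.toString());
});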
  2. Replace the authentication information with your base64-encoded Deepgram username and password.

  3. Ensure the specified audio file is in the same directory as the example script.
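
One way to produce the base64-encoded value for the Node.js script is a few lines of Python. This is a sketch only: `encode_credentials` is a hypothetical helper name, the credentials are placeholders, and the padding is stripped because the comments in the script ask for it.

```python
import base64


def encode_credentials(username, password):
    """Base64-encode 'username:password' and drop trailing '=' padding."""
    raw = '{}:{}'.format(username, password).encode('utf-8')
    return base64.b64encode(raw).decode('utf-8').rstrip('=')


# Placeholder credentials, for illustration only.
print(encode_credentials('user', 'pw'))
```

Paste the printed value in place of `YOUR_BASE64ENCODED_DEEPGRAM_USERNAME:PASSWORD` in the script.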

Run the Script

To run the script, use one of the following commands:

$ python3 deepgram-streaming-example.py
$ node deepgram-streaming-example.js

When run, the script sends the audio to Deepgram's real-time streaming endpoint and prints the output to the screen:

{"channel_index":[0,1],"duration":1.039875,"start":0.0,"is_final":false,"channel":{"alternatives":[{"transcript":"another big","confidence":0.9600255,"words":[{"word":"another","start":0.2971154,"end":0.7971154,"confidence":0.9588303},{"word":"big","start":0.85173076,"end":1.039875,"confidence":0.9600255}]}]},"metadata":{"request_id":"NkhdNwwDJWBlOvhcFlutX1g8iqmZSLz6"}}
{"channel_index":[0,1],"duration":2.039875,"start":0.0,"is_final":false,"channel":{"alternatives":[{"transcript":"another big problem","confidence":0.9939844,"words":[{"word":"another","start":0.29852942,"end":0.7985294,"confidence":0.9939844},{"word":"big","start":0.8557843,"end":1.3557843,"confidence":0.98220366},{"word":"problem","start":1.5722549,"end":2.039875,"confidence":0.9953441}]}]},"metadata":{"request_id":"NkhdNwwDJWBlOvhcFlutX1g8iqmZSLz6"}}
...
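
Each line of output is a JSON document, so it can be parsed instead of printed raw. The sketch below uses only the fields visible in the sample output above; `summarize` is a hypothetical helper name. In the sample, both messages have `is_final` set to `false` and the second extends the first, which suggests they are interim results that are refined as more audio arrives. That interpretation is an assumption here, not something this guide states.

```python
import json

# First message from the sample output above, copied verbatim.
SAMPLE_MESSAGE = (
    '{"channel_index":[0,1],"duration":1.039875,"start":0.0,"is_final":false,'
    '"channel":{"alternatives":[{"transcript":"another big","confidence":0.9600255,'
    '"words":[{"word":"another","start":0.2971154,"end":0.7971154,"confidence":0.9588303},'
    '{"word":"big","start":0.85173076,"end":1.039875,"confidence":0.9600255}]}]},'
    '"metadata":{"request_id":"NkhdNwwDJWBlOvhcFlutX1g8iqmZSLz6"}}'
)


def summarize(message):
    """Pull the main fields out of one streaming response message."""
    result = json.loads(message)
    # Use the first entry in the alternatives list.
    first = result['channel']['alternatives'][0]
    return {
        'start': result['start'],
        'duration': result['duration'],
        'is_final': result['is_final'],
        'transcript': first['transcript'],
        'confidence': first['confidence'],
        'words': [entry['word'] for entry in first['words']],
    }


summary = summarize(SAMPLE_MESSAGE)
print(summary['transcript'], summary['is_final'])
```

In the Python script, the same function could be called on each `msg` inside `receiver` in place of `print(msg)`.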