Quickstart: Get Started with Streaming Audio

Last updated 10/20/2021

In this quickstart, you'll learn how to automatically transcribe streaming audio in real time using Deepgram's SDKs, which are supported for use with the Beta Deepgram API.

Already spoken with our sales team? Your account team should have reached out with a dedicated signup link. If not, talk to sales.

Go to the enterprise docs

Not a developer? Check out the Deepgram Console for a no-code way to get started with Deepgram's API.

The examples in this guide use Deepgram SDKs. To learn how to stream real-time audio to Deepgram using example Python or Node scripts, see Stream Audio to Deepgram.

Before You Begin

Before you run the code, you'll need a Deepgram account, a Deepgram API Key, and a working development environment.

To use Deepgram products, create a Deepgram account. Signup is free and includes:

  • $150 in credit, which gives you access to:
    • all base models
    • pre-recorded and streaming functionality
    • all features

To access Deepgram’s API, you'll need to create a Deepgram API Key. Make note of your API Key; you will need it later.
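
Rather than pasting the key directly into source files, one common approach is to read it from an environment variable. The sketch below assumes a variable named DEEPGRAM_API_KEY and a helper called load_api_key; both names are chosen here for illustration and are not something the SDK requires.

```python
import os


def load_api_key(env_var: str = "DEEPGRAM_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Deepgram API Key."
        )
    return key
```

In the Python sample later in this guide, the hardcoded key could then be replaced with a call such as load_api_key().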

We provide sample scripts in Python and Node.js and assume you have already configured either a Python or Node development environment.

If you get stuck at any point, help is just a click away! Contact Support.

Transcribe Audio

Once you have your API Key, it's time to transcribe audio! The instructions below will guide you through the process of creating a sample application, installing the Deepgram SDK, configuring code with your own Deepgram API Key and streaming audio to transcribe, and finally, building and running the application.

  1. Create a Sample Application

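    Open your terminal, navigate to the location on your drive where you want to create your project, and initialize a new application. The npm init command applies to Node.js projects only; for Python, creating a project directory is enough:

    # Initialize a new application (Node.js only)
    npm init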
    
  2. Choose an Audio File

    Download our sample audio file, or record your own using your device’s microphone. Make sure downloaded files are in your project directory.

  3. Install the SDK

    In your terminal, install the Deepgram SDK:

    # Install the Deepgram Python SDK
    # https://github.com/deepgram/python-sdk
    pip install deepgram-sdk
    
    # Install the Deepgram Node.js SDK
    # https://github.com/deepgram/node-sdk
    npm install @deepgram/sdk
    
  4. Write the code

    Create a new file in your project's location and populate it with code.

    For Python, create a new file called deepgram_test.py. Populate this file:

    from deepgram import Deepgram
    import asyncio, json
    
    # Your Deepgram API Key
    DEEPGRAM_API_KEY = 'YOUR_DEEPGRAM_API_KEY'
    
    # Name and extension of the file you downloaded (e.g., sample.wav)
    PATH_TO_FILE = 'FILENAME_TO_TRANSCRIBE'
    
    async def main():
      # Initialize the Deepgram SDK
      dg_client = Deepgram(DEEPGRAM_API_KEY)
    
      # Create a websocket connection to Deepgram
      try:
        socket = await dg_client.transcription.live({'punctuate': True})
      except Exception as e:
        print(f'Could not open socket: {e}')
        return
    
      # Handle sending audio to the socket
      async def process_audio(connection):
        # Open the file
        with open(PATH_TO_FILE, 'rb') as audio:
          # Chunk up the audio to send
          CHUNK_SIZE_BYTES = 8192
          CHUNK_RATE_SEC = 0.001
          chunk = audio.read(CHUNK_SIZE_BYTES)
          while chunk:
            connection.send(chunk)
            await asyncio.sleep(CHUNK_RATE_SEC)
            chunk = audio.read(CHUNK_SIZE_BYTES)
    
        # Indicate that we've finished sending data
        await connection.finish()
    
      # Listen for the connection to close
      socket.register_handler(socket.event.CLOSE, lambda c: print(f'Connection closed with code {c}.'))

      # Print incoming transcription objects
      socket.register_handler(socket.event.TRANSCRIPT_RECEIVED, print)
    
      # Send the audio to the socket
      await process_audio(socket)
    
    asyncio.run(main())
    

    For Node.js, create a new file called index.js in your project's location. Populate this file:

    const { Deepgram } = require('@deepgram/sdk');
    
    /** Your Deepgram API Key */
    const deepgramApiKey = 'YOUR_DEEPGRAM_API_KEY';
    
    /** Name and extension of the file you downloaded (e.g., sample.wav) */
    const pathToFile = 'FILENAME_TO_TRANSCRIBE';
    
    /** Initialize the Deepgram SDK */
    const deepgram = new Deepgram(deepgramApiKey);
    
    /** Create a websocket connection to Deepgram */
    const deepgramSocket = deepgram.transcription.live({ punctuate: true });
    
    /** Listen for the connection to open and begin sending */
    deepgramSocket.addListener('open', () => {
      console.log("Connection opened!");
    
      /** Grab your audio file */
      const fs = require('fs');
      const contents = fs.readFileSync(pathToFile);
    
      /** Send the audio to the Deepgram API in chunks of 1000 bytes */
      const chunk_size = 1000;
      for (let i = 0; i < contents.length; i += chunk_size) {
        const slice = contents.slice(i, i + chunk_size);
        deepgramSocket.send(slice);
      }
    
      /** Close the websocket connection */
      deepgramSocket.close();
    });
    
    /** Listen for the connection to close */
    deepgramSocket.addListener('close', () => {
      console.log('Connection closed.');
    });

    /**
     * Receive transcripts based on sent streams and
     * write them to the console
     */
    deepgramSocket.addListener("transcriptReceived", (transcription) => {
      console.log(transcription.data);
    });
    

    Be sure to replace YOUR_DEEPGRAM_API_KEY and FILENAME_TO_TRANSCRIBE with your Deepgram API Key and the name of the file you downloaded.

  5. Start the Application

    Run the application from the terminal:

    # Python
    python deepgram_test.py

    # Node.js
    node index.js
    
  6. See Results

    Your transcripts will appear in your terminal.

    When analyzing results, understand that real-time streaming returns a series of interim transcripts followed by a final transcript. To learn more about interim and final transcripts, see Understand Interim Transcripts.
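
To make the interim/final distinction concrete, here is a minimal sketch of a handler that keeps only final transcripts. It assumes each streaming message is a dictionary with an is_final flag and the text under channel.alternatives[0].transcript; the helper name final_transcript and the sample messages are hypothetical and written by hand for illustration.

```python
from typing import Optional


def final_transcript(message: dict) -> Optional[str]:
    """Return the transcript text if this message is a final result, else None."""
    # Interim results, and messages without the flag, are skipped
    if not message.get("is_final"):
        return None
    alternatives = message.get("channel", {}).get("alternatives", [])
    if not alternatives:
        return None
    return alternatives[0].get("transcript", "")


# Hypothetical messages showing the assumed shape (not real API output)
messages = [
    {"is_final": False, "channel": {"alternatives": [{"transcript": "hello wor"}]}},
    {"is_final": True, "channel": {"alternatives": [{"transcript": "Hello, world."}]}},
]

finals = [t for m in messages if (t := final_transcript(m)) is not None]
print(finals)  # ['Hello, world.']
```

A function along these lines could stand in for print as the transcript handler in the Python sample, so that only settled text reaches your application.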

What's Next?

Now that you've gotten transcripts for streaming audio, enhance your knowledge by exploring the following areas.

Customize Transcripts

To customize the transcripts you receive, you can send a variety of parameters to the Deepgram API.

For example, if you would like to use the phonecall model rather than the general model, you can pass the model: phonecall option to the transcription.live method in the previous examples:

    # Python
    socket = await dg_client.transcription.live({'punctuate': True, 'model': 'phonecall'})

    // Node.js
    const deepgramSocket = deepgram.transcription.live({ punctuate: true, model: 'phonecall' });

To learn more about the many ways you can customize your results with Deepgram's API, check out the Deepgram API Reference.
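
Options like these travel to the API as query parameters on the streaming URL. As a rough illustration of that mapping, the sketch below builds such a URL by hand. The endpoint shown and the helper name build_listen_url are assumptions for illustration only; the SDK handles this for you and may differ in detail.

```python
from urllib.parse import urlencode


def build_listen_url(options: dict, base: str = "wss://api.deepgram.com/v1/listen") -> str:
    """Append options to the base URL as a query string.

    Booleans are lowercased ("true"/"false") because query strings carry text.
    The default base URL is an assumption made for this example.
    """
    params = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in options.items()
    }
    query = urlencode(params)
    return f"{base}?{query}" if query else base


print(build_listen_url({"punctuate": True, "model": "phonecall"}))
# wss://api.deepgram.com/v1/listen?punctuate=true&model=phonecall
```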

Explore Use Cases

Time to learn about the different ways you can use Deepgram products to help you meet your business objectives. Explore Deepgram's use cases.

Transcribe Pre-recorded Audio

Now that you know how to transcribe streaming audio, check out how you can use Deepgram to transcribe pre-recorded audio. To learn more, see Quickstart: Get Started with Pre-recorded Audio.