Live Streaming Audio Quickstart

A quick introduction to getting transcription data from live streaming audio in real time using a web application, Deepgram's API, and Deepgram SDKs.

Notebook

Prefer the workflow of a Python notebook? Download our Python starter notebook and be up and running quickly without having to copy or paste any code.

Streaming audio notebook

In this guide, you'll learn how to automatically transcribe live streaming audio in real time using Deepgram's SDKs, which are supported for use with the Deepgram API.

Before You Begin

Before you run the code, you'll need to do a few things.

Create a Deepgram Account

Before you can use Deepgram products, you'll need to create a Deepgram account. Signup is free and includes:

Create a Deepgram API Key

To access Deepgram's API, you'll need to create a Deepgram API Key. Make note of your API Key; you will need it later.
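
One way to keep the key out of your source files is to read it from an environment variable. The sketch below is only an illustration: the variable name DEEPGRAM_API_KEY and the helper load_api_key are assumptions made for this example, not something the Deepgram SDK requires.

```python
import os

# Hypothetical variable name: any name works as long as your shell and code agree.
ENV_VAR = "DEEPGRAM_API_KEY"


def load_api_key(environ=None):
    """Return the API key from the environment, or raise if it is missing."""
    environ = os.environ if environ is None else environ
    key = environ.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set the {ENV_VAR} environment variable first.")
    return key
```

With the variable exported in your shell, the sample code later in this guide could use load_api_key() in place of the hardcoded 'YOUR_DEEPGRAM_API_KEY' string.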

Configure Environment

We provide sample scripts in Python and Node.js and assume you have already configured either a Python or Node development environment. System requirements will vary depending on the programming language you use:

  • Node.js: node >= 14.14.37, cross-fetch >= 3.1.5
  • Python: python >= 3.7, aiohttp >= 3.8.1
  • .NET: dotnet >= 6.0
  • Go: Go >= 1.18
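
If you are using Python, a quick check along the following lines can confirm that the minimum versions listed above are met. This is a rough sketch: the helper names parse_version and check_environment are made up for this example, and the version parsing is deliberately simple.

```python
import sys

# Minimum versions taken from the requirements list above.
MIN_PYTHON = (3, 7)
MIN_AIOHTTP = (3, 8, 1)


def parse_version(text):
    """Turn '3.8.1' into (3, 8, 1), ignoring suffixes such as 'b1'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_environment():
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.7 or newer is required")
    try:
        import aiohttp
    except ImportError:
        problems.append("aiohttp is not installed")
    else:
        if parse_version(aiohttp.__version__) < MIN_AIOHTTP:
            problems.append("aiohttp 3.8.1 or newer is required")
    return problems
```

Calling check_environment() before running the sample returns an empty list when everything is in place, or a list of messages describing what to install or upgrade.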

:wave: If you get stuck at any point, help is just a click away. Contact.

Transcribe Audio

Once you have your API Key, it's time to transcribe audio!

If you want a quick way to get up and running, our Python streaming test suite provides sample code to help you get started. Before beginning to build your own integration, we recommend running the test suite code at least once to ensure you can successfully stream sample audio to Deepgram.

If you'd like to follow a step-by-step tutorial, our instructions below will guide you through the process of creating a sample application, installing the Deepgram SDK, configuring code with your own Deepgram API Key and streaming audio to transcribe, and finally, building and running the application.

Install the SDK

Open your terminal, navigate to the location on your drive where you want to create your project, and install the Deepgram SDK:

# Install the Deepgram Python SDK
# https://github.com/deepgram/deepgram-python-sdk

pip install deepgram-sdk

# Initialize a new application

npm init

# Install the Deepgram Node.js SDK
# https://github.com/deepgram/node-sdk

npm install @deepgram/sdk

# Install the Deepgram .NET SDK
# https://github.com/deepgram/deepgram-dotnet-sdk

dotnet add package Deepgram

# Install the Deepgram Go SDK

go get github.com/deepgram-devs/deepgram-go-sdk

Add Dependencies

Add necessary external dependencies to your project.

# Install aiohttp: HTTP client/server for asyncio that allows you to write asynchronous clients and servers, and supports WebSockets.

pip install aiohttp

# Install cross-fetch: Platform-agnostic Fetch API with TypeScript support, a simple interface, and optional polyfill.

npm install cross-fetch

// In your .csproj file, add the Package Reference:

<ItemGroup>
    <PackageReference Include="Deepgram" Version="3.4.0" />
</ItemGroup>

# Install the Gorilla WebSocket package

go get github.com/gorilla/websocket

Write the Code

In your terminal, create a new file in your project's location, and populate it with code.

The following example includes the parameter model=nova, which tells the API to use Deepgram's most powerful and affordable model. Removing this parameter will result in the API using the default model, which is currently model=general.

It also includes Deepgram's Smart Formatting feature, smart_format=true. This will format currency amounts, phone numbers, email addresses, and more for enhanced transcript readability.
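
To make the parameters above more concrete, the sketch below shows how options such as model and smart_format could be written as a query string on Deepgram's streaming endpoint. It is an illustration only: the helper build_listen_url is hypothetical, and the SDKs handle this step for you, so you would not normally write it yourself.

```python
from urllib.parse import urlencode

# Deepgram's hosted streaming endpoint.
BASE_URL = "wss://api.deepgram.com/v1/listen"


def build_listen_url(options):
    """Encode an options dict as a query string on the streaming endpoint."""
    params = {}
    for name, value in options.items():
        # Booleans are written as lowercase true/false in the query string.
        params[name] = str(value).lower() if isinstance(value, bool) else value
    return f"{BASE_URL}?{urlencode(params)}"
```

For example, passing {'model': 'nova', 'smart_format': True} produces a URL ending in ?model=nova&smart_format=true, which matches the parameter notation used in this guide.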

:sparkles: We recently released Nova-2 in Early Access. Read our Quickstart to learn more.

 # Example filename: deepgram_test.py

from deepgram import Deepgram
import asyncio
import aiohttp

# Your Deepgram API Key
DEEPGRAM_API_KEY = 'YOUR_DEEPGRAM_API_KEY'

# URL for the realtime streaming audio you would like to transcribe
URL = 'http://stream.live.vc.bbcmedia.co.uk/bbc_world_service'

async def main():
  # Initialize the Deepgram SDK
  deepgram = Deepgram(DEEPGRAM_API_KEY)

  # Create a websocket connection to Deepgram
  # In this example, Smart Formatting is turned on, interim results are turned off, the language is set to US English, and the nova model is selected.
  try:
    deepgramLive = await deepgram.transcription.live({
      'smart_format': True,
      'interim_results': False,
      'language': 'en-US',
      'model': 'nova',
    })
  except Exception as e:
    print(f'Could not open socket: {e}')
    return

  # Listen for the connection to close
  deepgramLive.registerHandler(deepgramLive.event.CLOSE, lambda c: print(f'Connection closed with code {c}.'))

  # Listen for any transcripts received from Deepgram and write them to the console
  deepgramLive.registerHandler(deepgramLive.event.TRANSCRIPT_RECEIVED, print)

  # Listen for the connection to open and send streaming audio from the URL to Deepgram
  async with aiohttp.ClientSession() as session:
    async with session.get(URL) as audio:
      while True:
        data = await audio.content.readany()

        # If no data is being sent from the live stream, then break out of the loop.
        if not data:
          break

        deepgramLive.send(data)

  # Indicate that we've finished sending data by sending the customary zero-byte message to the Deepgram streaming endpoint, and wait until we get back the final summary metadata object
  await deepgramLive.finish()

# If running in a Jupyter notebook, Jupyter is already running an event loop, so run main with this line instead:
#await main()
asyncio.run(main())

// Example filename: index.js

const { Deepgram } = require("@deepgram/sdk");
const fetch = require("cross-fetch");

// Your Deepgram API Key
const deepgramApiKey = "YOUR_DEEPGRAM_API_KEY";

// URL for the audio you would like to stream
// URL for the example resource will change depending on whether user is outside or inside the UK
// Outside the UK
const url = "http://stream.live.vc.bbcmedia.co.uk/bbc_world_service";
// Inside the UK
// const url = 'http://stream.live.vc.bbcmedia.co.uk/bbc_radio_fourfm';

// Initialize the Deepgram SDK
const deepgram = new Deepgram(deepgramApiKey);

// Create a websocket connection to Deepgram
// In this example, Smart Formatting is turned on, interim results are turned off, the language is set to US English, and the nova model is selected.
const deepgramLive = deepgram.transcription.live({
	smart_format: true,
	interim_results: false,
	language: "en-US",
	model: "nova",
});

// Listen for the connection to open and send streaming audio from the URL to Deepgram
fetch(url)
	.then((r) => r.body)
	.then((res) => {
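		res.on("readable", () => {
			// read() returns null when no data is buffered.
			// Audio that arrives before the socket is open is dropped.
			const chunk = res.read();
			if (chunk !== null && deepgramLive.getReadyState() === 1) {
				deepgramLive.send(chunk);
			}
		});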
	});

// Listen for the connection to close
deepgramLive.addListener("close", () => {
	console.log("Connection closed.");
});

// Listen for any transcripts received from Deepgram and write them to the console
deepgramLive.addListener("transcriptReceived", (message) => {
	const data = JSON.parse(message);

	// Write the entire response to the console
	console.dir(data.channel, { depth: null });

	// Write only the transcript to the console
	//console.dir(data.channel.alternatives[0].transcript, { depth: null });
});

// Example filename: Program.cs

using Deepgram.CustomEventArgs;
using Deepgram.Models;
using System.Net.WebSockets;

const string DEEPGRAM_API_KEY = "YOUR_DEEPGRAM_API_KEY";
var credentials = new Credentials(DEEPGRAM_API_KEY);

var deepgramClient = new DeepgramClient(credentials);

using (var deepgramLive = deepgramClient.CreateLiveTranscriptionClient())
{
    deepgramLive.ConnectionOpened += HandleConnectionOpened;
    deepgramLive.ConnectionClosed += HandleConnectionClosed;
    deepgramLive.ConnectionError += HandleConnectionError;
    deepgramLive.TranscriptReceived += HandleTranscriptReceived;

    // Connection opened so start sending audio.
    async void HandleConnectionOpened(object? sender, ConnectionOpenEventArgs e)
    {
        byte[] buffer;

        // Be sure to
        using (FileStream fs = File.OpenRead("YOUR_LOCAL_FILE"))
        {
            buffer = new byte[fs.Length];
            fs.Read(buffer, 0, (int)fs.Length);
        }

        var chunks = buffer.Chunk(1000);

        foreach (var chunk in chunks)
        {
            deepgramLive.SendData(chunk);
            await Task.Delay(50);
        }

        await deepgramLive.FinishAsync();
    }

    void HandleTranscriptReceived(object? sender, TranscriptReceivedEventArgs e)
    {
        if (e.Transcript.IsFinal && e.Transcript.Channel.Alternatives.First().Transcript.Length > 0) {
            var transcript = e.Transcript;
            Console.WriteLine($"[Speaker: {transcript.Channel.Alternatives.First().Words.First().Speaker}] {transcript.Channel.Alternatives.First().Transcript}");
        }
    }

    void HandleConnectionClosed(object? sender, ConnectionClosedEventArgs e)
    {
        Console.Write("Connection Closed");
    }

    void HandleConnectionError(object? sender, ConnectionErrorEventArgs e)
    {
        Console.WriteLine(e.Exception.Message);
    }

    var options = new LiveTranscriptionOptions() { Punctuate = true, Diarize = true, Encoding = Deepgram.Common.AudioEncoding.Linear16 };
    await deepgramLive.StartConnectionAsync(options);

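    // Keep the program alive while the connection is open, without spinning the CPU.
    while (deepgramLive.State() == WebSocketState.Open)
    {
        await Task.Delay(100);
    }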
}

// Example filename: main.go

package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
	"reflect"
	"time"

	"github.com/Jeffail/gabs/v2"
	"github.com/deepgram-devs/deepgram-go-sdk/deepgram"
	"github.com/gorilla/websocket"
)

const (
	DEEPGRAM_API_KEY       = "YOUR_DEEPGRAM_API_KEY"
	STREAM_URL             = "http://stream.live.vc.bbcmedia.co.uk/bbc_world_service"
	CHUNK_SIZE             = 1024 * 2
	TEN_MILLISECONDS_SLEEP = 10 * time.Millisecond
)

func main() {
	client := new(http.Client)

	dg := *deepgram.NewClient(DEEPGRAM_API_KEY)

	res, err := client.Get(STREAM_URL)
	if err != nil {
		log.Println("ERROR getting stream", err)
		return
	}
	defer res.Body.Close()

	fmt.Println("Stream is up and running ", reflect.TypeOf(res))

	reader := bufio.NewReader(res.Body)

	liveTranscriptionOptions := deepgram.LiveTranscriptionOptions{
		Language:  "en-US",
		Punctuate: true,
	}

	dgConn, _, err := dg.LiveTranscription(liveTranscriptionOptions)
	if err != nil {
		log.Println("ERROR creating LiveTranscription connection:", err)
		return
	}
	defer dgConn.Close()

	chunk := make([]byte, CHUNK_SIZE)

	go func() {
		for {
			_, message, err := dgConn.ReadMessage()
			if err != nil {
				log.Println("ERROR reading message:", err)
				return
			}

			jsonParsed, jsonErr := gabs.ParseJSON(message)
			if jsonErr != nil {
				log.Println("ERROR parsing JSON message:", jsonErr)
				return
			}
			log.Printf("recv: %s", jsonParsed.Path("channel.alternatives.0.transcript").String())
		}
	}()

	for {
		bytesRead, err := reader.Read(chunk)

		if err != nil {
			log.Println("ERROR reading chunk:", err)
			return
		}
		err = dgConn.WriteMessage(websocket.BinaryMessage, chunk[:bytesRead])
		if err != nil {
			log.Println("ERROR writing message:", err)
			return
		}
		time.Sleep(TEN_MILLISECONDS_SLEEP)
	}
}

:eyes: Replace YOUR_DEEPGRAM_API_KEY with your Deepgram API Key.

Start the Application

Run the application from the terminal.

# Run your application using the file you created in the previous step
# Example: python deepgram_test.py

python YOUR_PROJECT_NAME.py

# Run your application using the file you created in the previous step
# Example: node index.js

node YOUR_PROJECT_NAME.js

# Run your application using the file you created in the previous step
# Example: dotnet run Program.cs

dotnet run YOUR_PROJECT_NAME.cs

# Run your application using the file you created in the previous step
# Example: go run main.go

go run YOUR_PROJECT_NAME.go

:eyes: Replace YOUR_PROJECT_NAME with the name of the file to which you saved the code in the previous step.

See Results

Your transcripts will appear in your terminal.

Deepgram does not store transcriptions. Make sure to save output or return transcriptions to a callback URL for custom processing.
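
Since transcripts are not stored for you, one simple approach is to append each one to a file as it arrives. The sketch below assumes the response shape used in the examples above, where the text lives at channel.alternatives[0].transcript. The helper names extract_transcript and save_transcript are hypothetical.

```python
import json


def extract_transcript(message):
    """Return the top transcript from one streaming response, or '' if absent."""
    if isinstance(message, (str, bytes)):
        message = json.loads(message)
    alternatives = message.get("channel", {}).get("alternatives", [])
    if not alternatives:
        return ""
    return alternatives[0].get("transcript", "")


def save_transcript(message, handle):
    """Write a non-empty transcript as one line to an open text file."""
    text = extract_transcript(message)
    if text:
        handle.write(text + "\n")
    return text
```

In the Python example above, a handler such as lambda message: save_transcript(message, output_file) could be registered for the TRANSCRIPT_RECEIVED event in place of print, where output_file is a file you opened in append mode.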

By default, Deepgram live streaming looks for any deviation in the natural flow of speech and returns a finalized response at these places. To learn more about this feature, see Endpointing.

Deepgram live streaming can also return a series of interim transcripts followed by a final transcript. To learn more, see Interim Results.

Endpointing can be used with Deepgram's Interim Results feature. To compare and contrast these features, and to explore best practices for using them together, see Using Endpointing and Interim Results with Live Streaming Audio.
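
When interim results are turned on, your code needs to tell provisional text apart from finalized text. The sketch below shows one possible way to do that. It assumes each response carries an is_final flag alongside the channel.alternatives[0].transcript field used earlier, and the class name TranscriptAssembler is made up for this example.

```python
class TranscriptAssembler:
    """Collect finalized segments and track the latest interim guess."""

    def __init__(self):
        self.final_segments = []
        self.pending = ""

    def handle(self, message):
        """Process one streaming response that has been parsed into a dict."""
        alternatives = message.get("channel", {}).get("alternatives", [])
        text = alternatives[0].get("transcript", "") if alternatives else ""
        if message.get("is_final"):
            # A final result replaces whatever interim text was pending.
            if text:
                self.final_segments.append(text)
            self.pending = ""
        else:
            # Each interim result supersedes the previous one.
            self.pending = text

    def current_text(self):
        """Return finalized text followed by the latest interim text, if any."""
        parts = self.final_segments + ([self.pending] if self.pending else [])
        return " ".join(parts)
```

The design choice here is that interim text is never appended, only overwritten, so the displayed transcript does not accumulate duplicate phrases while a segment is still being refined.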

What's Next?

Now that you've gotten transcripts for streaming audio, enhance your knowledge by exploring the following areas.

You can also check out our Live Streaming API Reference for a list of all possible parameters.

Customize Transcripts

To customize the transcripts you receive, you can send a variety of parameters to the Deepgram API.

For example, if your audio is in Spanish rather than US English, you can pass the language parameter with the value es to the transcription.live method used in the previous examples.

Not all languages work with all available models. Be sure to check out the Languages page to see which models are compatible with which languages.

deepgramLive = await deepgram.transcription.live({
  'punctuate': True,
  'interim_results': False,
  'language': 'es'
})

const deepgramLive = deepgram.transcription.live({
	punctuate: true,
	interim_results: false,
	language: "es",
});

var options = new LiveTranscriptionOptions() { Punctuate = true, Language = "es" };

liveTranscriptionOptions := deepgram.LiveTranscriptionOptions{
	Language:        "es",
	Punctuate:       true,
	Interim_results: false,
}

To learn more about the languages available with Deepgram, see the Language feature guide. To learn more about the many ways you can customize your results with Deepgram's API, check out the Deepgram API Reference.

Add Your Audio

Ready to connect Deepgram to your own audio source? Start by reviewing how to determine your audio format and format your API request accordingly.
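
As a starting point for determining your audio format, the sketch below reads the header of a WAV file with Python's standard wave module and returns values you could pass as the encoding, sample_rate, and channels parameters. It only handles 16-bit PCM WAV files, and the helper name describe_wav is an assumption made for this example.

```python
import wave


def describe_wav(source):
    """Read a WAV header and return matching streaming parameters.

    source may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        # A sample width of 2 bytes means 16-bit PCM, i.e. linear16.
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        return {
            "encoding": "linear16",
            "sample_rate": wav.getframerate(),
            "channels": wav.getnchannels(),
        }
```

These values matter most when you stream raw samples without a header, because the format then has to be stated explicitly in the request.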

Then, you'll want to check out our streaming test suite. The streaming test suite is the perfect "102" introduction to integrating your own audio.

Explore Use Cases

Time to learn about the different ways you can use Deepgram products to help you meet your business objectives. Explore Deepgram's use cases.

Transcribe Pre-recorded Audio

Now that you know how to transcribe streaming audio, check out how you can use Deepgram to transcribe pre-recorded audio. To learn more, see Getting Started with Pre-recorded Audio.