Getting started

An introduction to getting transcription data from live streaming audio in real time using Deepgram's SDKs.

In this guide, you'll learn how to automatically transcribe live streaming audio in real time using one of Deepgram's officially supported SDKs for the Deepgram API.

📘

Before you run the code, you'll need to follow the steps in this guide to create a Deepgram account, get a Deepgram API key, configure your environment, and install the SDK of your choice.

Transcribe Audio

Follow these steps to transcribe audio from a remote audio stream. If you would like to learn how to stream audio from a microphone, check out our Live Audio Starter Apps or the specific examples in the README of each Deepgram SDK.

🌈

For those who prefer to work from a Jupyter notebook, check out our Python Starter Notebooks.

Install the SDK

Open your terminal, navigate to the location on your drive where you want to create your project, and install the Deepgram SDK:

# Install the Deepgram Python SDK
# https://github.com/deepgram/deepgram-python-sdk

pip install deepgram-sdk==3.*
# Install the Deepgram JavaScript SDK
# https://github.com/deepgram/deepgram-node-sdk

npm install @deepgram/sdk
# Install the Deepgram .NET SDK
# https://github.com/deepgram/deepgram-dotnet-sdk

dotnet add package Deepgram
# Install the Deepgram Go SDK
# https://github.com/deepgram/deepgram-go-sdk

go get github.com/deepgram/deepgram-go-sdk

Add Dependencies

Add necessary external dependencies to your project.

# Install python-dotenv to protect your API key

pip install python-dotenv
# Install cross-fetch: a platform-agnostic Fetch API with TypeScript support, a simple interface, and an optional polyfill
# Install dotenv to protect your API key

npm install cross-fetch dotenv
// In your .csproj file, add the Package Reference:

<ItemGroup>
    <PackageReference Include="Deepgram" Version="3.4.0" />
</ItemGroup>
# Importing the Deepgram Go SDK should pull in all dependencies required
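
The Python and JavaScript examples in the next step read the API key from an environment variable (DG_API_KEY and DEEPGRAM_API_KEY, respectively). Assuming you keep the key in a .env file next to your code, the file would look like this:

# Example filename: .env
# Use whichever variable name your code sample expects
DG_API_KEY="YOUR_DEEPGRAM_API_KEY"
DEEPGRAM_API_KEY="YOUR_DEEPGRAM_API_KEY"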

Write the Code

In your terminal, create a new file in your project directory and populate it with the following code.

# Example filename: main.py
import os
import httpx
from dotenv import load_dotenv
import threading

from deepgram import (
    DeepgramClient,
    LiveTranscriptionEvents,
    LiveOptions,
)

load_dotenv()

# URL for the realtime streaming audio you would like to transcribe
URL = "http://stream.live.vc.bbcmedia.co.uk/bbc_world_service"

API_KEY = os.getenv("DG_API_KEY")


def main():
    try:
        # STEP 1: Create a Deepgram client using the API key
        deepgram = DeepgramClient(API_KEY)

        # STEP 2: Create a websocket connection to Deepgram
        dg_connection = deepgram.listen.live.v("1")

        # STEP 3: Define the event handlers for the connection
        def on_message(self, result, **kwargs):
            sentence = result.channel.alternatives[0].transcript
            if len(sentence) == 0:
                return
            print(f"speaker: {sentence}")

        def on_metadata(self, metadata, **kwargs):
            print(f"\n\n{metadata}\n\n")

        def on_error(self, error, **kwargs):
            print(f"\n\n{error}\n\n")

        # STEP 4: Register the event handlers
        dg_connection.on(LiveTranscriptionEvents.Transcript, on_message)
        dg_connection.on(LiveTranscriptionEvents.Metadata, on_metadata)
        dg_connection.on(LiveTranscriptionEvents.Error, on_error)

        # STEP 5: Configure Deepgram options for live transcription
        options = LiveOptions(
            model="nova-2",
            language="en-US",
            smart_format=True,
        )
        
        # STEP 6: Start the connection
        dg_connection.start(options)

        # STEP 7: Create a lock and a flag for thread synchronization
        lock_exit = threading.Lock()
        exit = False

        # STEP 8: Define a thread that streams the audio and sends it to Deepgram
        def myThread():
            with httpx.stream("GET", URL) as r:
                for data in r.iter_bytes():
                    lock_exit.acquire()
                    if exit:
                        lock_exit.release()
                        break
                    lock_exit.release()

                    dg_connection.send(data)

        # STEP 9: Start the thread
        myHttp = threading.Thread(target=myThread)
        myHttp.start()

        # STEP 10: Wait for user input to stop recording
        input("Press Enter to stop recording...\n\n")

        # STEP 11: Set the exit flag to True to stop the thread
        lock_exit.acquire()
        exit = True
        lock_exit.release()

        # STEP 12: Wait for the thread to finish
        myHttp.join()

        # STEP 13: Close the connection to Deepgram
        dg_connection.finish()

        print("Finished")

    except Exception as e:
        print(f"Could not open socket: {e}")
        return


if __name__ == "__main__":
    main()
// Example filename: index.js

const { createClient, LiveTranscriptionEvents } = require("@deepgram/sdk");
const fetch = require("cross-fetch");
const dotenv = require("dotenv");
dotenv.config();

// URL for the realtime streaming audio you would like to transcribe
const url = "http://stream.live.vc.bbcmedia.co.uk/bbc_world_service";

const live = async () => {
  // STEP 1: Create a Deepgram client using the API key
  const deepgram = createClient(process.env.DEEPGRAM_API_KEY);

  // STEP 2: Create a live transcription connection
  const connection = deepgram.listen.live({
    model: "nova-2",
    language: "en-US",
    smart_format: true,
  });

  // STEP 3: Listen for events from the live transcription connection
  connection.on(LiveTranscriptionEvents.Open, () => {
    connection.on(LiveTranscriptionEvents.Close, () => {
      console.log("Connection closed.");
    });

    connection.on(LiveTranscriptionEvents.Transcript, (data) => {
      console.log(data.channel.alternatives[0].transcript);
    });

    connection.on(LiveTranscriptionEvents.Metadata, (data) => {
      console.log(data);
    });

    connection.on(LiveTranscriptionEvents.Error, (err) => {
      console.error(err);
    });

    // STEP 4: Fetch the audio stream and send it to the live transcription connection
    fetch(url)
      .then((r) => r.body)
      .then((res) => {
        res.on("readable", () => {
          connection.send(res.read());
        });
      });
  });
};

live();

// Example filename: Program.cs

using Deepgram.CustomEventArgs;
using Deepgram.Models;
using System.Net.WebSockets;

const string DEEPGRAM_API_KEY = "YOUR_DEEPGRAM_API_KEY";
var credentials = new Credentials(DEEPGRAM_API_KEY);

var deepgramClient = new DeepgramClient(credentials);

using (var deepgramLive = deepgramClient.CreateLiveTranscriptionClient())
{
    deepgramLive.ConnectionOpened += HandleConnectionOpened;
    deepgramLive.ConnectionClosed += HandleConnectionClosed;
    deepgramLive.ConnectionError += HandleConnectionError;
    deepgramLive.TranscriptReceived += HandleTranscriptReceived;

    // Connection opened so start sending audio.
    async void HandleConnectionOpened(object? sender, ConnectionOpenEventArgs e)
    {
        byte[] buffer;

        using (FileStream fs = File.OpenRead("YOUR_LOCAL_FILE"))
        {
            buffer = new byte[fs.Length];
            fs.Read(buffer, 0, (int)fs.Length);
        }

        var chunks = buffer.Chunk(1000);

        foreach (var chunk in chunks)
        {
            deepgramLive.SendData(chunk);
            await Task.Delay(50);
        }

        await deepgramLive.FinishAsync();
    }

    void HandleTranscriptReceived(object? sender, TranscriptReceivedEventArgs e)
    {
        if (e.Transcript.IsFinal && e.Transcript.Channel.Alternatives.First().Transcript.Length > 0) {
            var transcript = e.Transcript;
            Console.WriteLine($"[Speaker: {transcript.Channel.Alternatives.First().Words.First().Speaker}] {transcript.Channel.Alternatives.First().Transcript}");
        }
    }

    void HandleConnectionClosed(object? sender, ConnectionClosedEventArgs e)
    {
        Console.Write("Connection Closed");
    }

    void HandleConnectionError(object? sender, ConnectionErrorEventArgs e)
    {
        Console.WriteLine(e.Exception.Message);
    }

    var options = new LiveTranscriptionOptions() { Punctuate = true, Diarize = true, Encoding = Deepgram.Common.AudioEncoding.Linear16 };
    await deepgramLive.StartConnectionAsync(options);

    // Busy-wait until the websocket closes; fine for a demo, though a production app would await a completion event instead
    while (deepgramLive.State() == WebSocketState.Open) { }
}
// Example filename: main.go
package main

import (
	"bufio"
	"context"
	"fmt"
	"net/http"
	"os"
	"reflect"

	interfaces "github.com/deepgram/deepgram-go-sdk/pkg/client/interfaces"
	client "github.com/deepgram/deepgram-go-sdk/pkg/client/live"
)

const (
	STREAM_URL = "http://stream.live.vc.bbcmedia.co.uk/bbc_world_service"
)

func main() {
	// STEP 1: init Deepgram client library
	client.InitWithDefault()

	// STEP 2: define context to manage the lifecycle of the request
	ctx := context.Background()

	// STEP 3: define options for the request
	transcriptOptions := interfaces.LiveTranscriptionOptions{
		Model:       "nova-2",
		Language:    "en-US",
		SmartFormat: true,
	}

	// STEP 4: create a Deepgram client using default settings
	// NOTE: you can set your API KEY in your bash profile by typing the following line in your shell:
	// export DEEPGRAM_API_KEY="YOUR_DEEPGRAM_API_KEY"
	dgClient, err := client.NewForDemo(ctx, transcriptOptions)
	if err != nil {
		fmt.Println("ERROR creating LiveTranscription connection:", err)
		return
	}

	// STEP 5: connect to the Deepgram service
	wsconn := dgClient.Connect()
	if wsconn == nil {
		fmt.Println("Client.Connect failed")
		os.Exit(1)
	}

	// STEP 6: create an HTTP client to stream audio data
	httpClient := new(http.Client)

	// STEP 7: create an HTTP stream
	res, err := httpClient.Get(STREAM_URL)
	if err != nil {
		fmt.Printf("httpClient.Get failed. Err: %v\n", err)
		return
	}

	fmt.Printf("Stream is up and running %s\n", reflect.TypeOf(res))

	go func() {
		// STEP 8: feed the HTTP stream to the Deepgram client (this is a blocking call)
		dgClient.Stream(bufio.NewReader(res.Body))
	}()

	// STEP 9: wait for user to exit
	fmt.Print("Press ENTER to exit!\n\n")
	input := bufio.NewScanner(os.Stdin)
	input.Scan()

	// STEP 10: close HTTP stream
	res.Body.Close()

	// STEP 11: close the Deepgram client
	dgClient.Stop()

	fmt.Printf("Program exiting...\n")
}

ℹ️

The above example includes the parameter model=nova-2, which tells the API to use Deepgram's most powerful and affordable model. Removing this parameter will result in the API using the default model, which is currently model=base.

It also includes Deepgram's Smart Formatting feature, smart_format=true. This will format currency amounts, phone numbers, email addresses, and more for enhanced transcript readability.
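
For reference, the SDKs pass these options as query parameters on Deepgram's streaming endpoint, so the equivalent raw websocket URL looks something like this:

wss://api.deepgram.com/v1/listen?model=nova-2&language=en-US&smart_format=true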

Start the Application

Run the application from the terminal.

# Run your application using the file you created in the previous step
# Example: python main.py

python YOUR_PROJECT_NAME.py
# Run your application using the file you created in the previous step
# Example: node index.js

node YOUR_PROJECT_NAME.js
# Run your application from your project directory
# (dotnet run builds and runs the project; it does not take a .cs file as an argument)

dotnet run
# Run your application using the file you created in the previous step
# Example: go run main.go

go run YOUR_PROJECT_NAME.go

See Results

Your transcripts will appear in your terminal. Keep in mind that Deepgram does not store transcriptions. Make sure to save the output or return transcriptions to a callback URL for custom processing.

By default, Deepgram live streaming looks for any deviation in the natural flow of speech and returns a finalized response at these places. To learn more about this feature, see Endpointing.
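
If you want to tune how much trailing silence triggers finalization, you can pass an endpointing value (in milliseconds) with your options. A minimal sketch using the Python options from above; treat the exact value as illustrative:

# Sketch: wait 500 ms of silence before finalizing (the default is much shorter)
options = LiveOptions(
    model="nova-2",
    smart_format=True,
    endpointing=500,
)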

Deepgram live streaming can also return a series of interim transcripts followed by a final transcript. To learn more, see Interim Results.
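
As a sketch of how you might handle both cases in the Python example above — assuming you add interim_results=True to your LiveOptions — you can branch on the result's is_final flag, and persist finalized text since Deepgram does not store transcriptions (the transcript.txt filename is illustrative):

# Sketch: an on_message handler that separates interim and final results
def on_message(self, result, **kwargs):
    sentence = result.channel.alternatives[0].transcript
    if len(sentence) == 0:
        return
    if result.is_final:
        # Finalized text for this audio segment; save it for later processing
        print(f"[final] {sentence}")
        with open("transcript.txt", "a") as f:  # illustrative output file
            f.write(sentence + "\n")
    else:
        # Interim hypothesis; a later message may revise this text
        print(f"[interim] {sentence}")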

ℹ️

Endpointing can be used with Deepgram's Interim Results feature. To compare and contrast these features, and to explore best practices for using them together, see Using Endpointing and Interim Results with Live Streaming Audio.

What's Next?

Now that you've gotten transcripts for streaming audio, enhance your knowledge by exploring the following areas. You can also check out our Live Streaming API Reference for a list of all possible parameters.

Try the Starter Apps

Clone and run one of our Live Audio Starter App repositories to see a full application with a frontend UI and a backend server streaming audio to Deepgram.

Read the Feature Guides

Deepgram's features help you to customize your transcripts. Do you want to transcribe audio in other languages? Check out the Language feature guide. Do you want to remove profanity from the transcript or redact personal information such as credit card numbers? Check out Profanity Filtering or Redaction.

Take a glance at our Feature Overview for streaming speech-to-text to see the list of all the features available. Then read more about each feature in its individual guide.
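
For instance, enabling a few of these features in the Python example above is just a matter of extra options. The values below are illustrative, not recommendations; check each feature guide for supported models and languages:

# Sketch: illustrative feature options for the Python example above
options = LiveOptions(
    model="nova-2",
    language="es",          # transcribe Spanish audio
    profanity_filter=True,  # filter profanity from the transcript
    redact=["pci"],         # redact credit card numbers
)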

Add Your Audio

Ready to connect Deepgram to your own audio source? Start by reviewing how to determine your audio format and format your API request accordingly.
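
For example, if your source emits raw, uncontainerized audio, you must declare its encoding and sample rate yourself. A minimal sketch for 16-bit, 16 kHz mono PCM using the Python options from above (adjust the values to match your audio):

# Sketch: options for a raw linear16 PCM source
options = LiveOptions(
    model="nova-2",
    smart_format=True,
    encoding="linear16",  # raw 16-bit, little-endian PCM
    sample_rate=16000,    # samples per second of your source
    channels=1,           # mono audio
)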

Then, you'll want to check out our Live Streaming Starter Kit. It's the perfect "102" introduction to integrating your own audio.

Explore Use Cases

Learn about the different ways you can use Deepgram products to help you meet your business objectives. Explore Deepgram's use cases.

Transcribe Pre-recorded Audio

Now that you know how to transcribe streaming audio, check out how you can use Deepgram to transcribe pre-recorded audio. To learn more, see Getting Started with Pre-recorded Audio.
