Migrating from AssemblyAI Speech-to-Text to Deepgram

A step-by-step guide for developers to migrate from AssemblyAI to Deepgram Speech-to-Text.

This guide walks developers through moving from AssemblyAI speech-to-text (STT) to Deepgram STT using the Deepgram SDKs. It highlights the differences between the two platforms and shows equivalent functionality side by side so the migration goes smoothly.

Getting Started

Before you can use Deepgram, you'll need to create a Deepgram account. Signup is free and includes $200 in free credit and access to all of Deepgram's features!

Before you start, follow the steps in the Make Your First API Request guide to obtain a Deepgram API key and, if you plan to use a Deepgram SDK, to configure your environment.

Prerequisites

Before proceeding with the migration, ensure you meet the following prerequisites:

Required Tools

  • A code editor (e.g., Visual Studio Code)
  • Terminal or command prompt access
  • Node.js or Python installed

API Keys

  • AssemblyAI API Key: Obtain from your AssemblyAI dashboard.
  • Deepgram API Key: Sign up on Deepgram's platform and get your API key in the Deepgram Console.

Overview of AssemblyAI and Deepgram APIs

Both AssemblyAI and Deepgram provide robust speech-to-text APIs, but they have different endpoints, request parameters, and response structures. This guide will map AssemblyAI functionalities to their Deepgram equivalents.

Step-by-Step Migration Instructions

1. Setting Up the Environment

Node: Ensure you have Node.js installed. If not, download and install it from the Node.js website.

Python: Ensure you have Python installed. If not, download and install it from the Python website.

2. Configuring API Keys

How to configure AssemblyAI API key

Create a .env file in your project directory and add your AssemblyAI API key:

ASSEMBLYAI_API_KEY=your_assemblyai_api_key_here

How to configure Deepgram API key

Similarly, add your Deepgram API key to the .env file:

DEEPGRAM_API_KEY=your_deepgram_api_key_here
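
A missing or misspelled variable name is an easy mistake during a migration, so it can help to fail early when a key is absent. The sketch below is a hypothetical helper (require_api_key is not part of either SDK); it assumes the .env file has already been loaded into the process environment, for example with load_dotenv().

```python
import os


def require_api_key(name: str) -> str:
    """Return the environment variable `name`, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing environment variable: {name}")
    return value
```

Calling require_api_key("DEEPGRAM_API_KEY") before creating the client surfaces a configuration problem immediately instead of as an authentication error on the first request.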

3. Installing the SDK and Dependencies

For AssemblyAI:

Node:

npm install assemblyai dotenv

Python:

pip install assemblyai python-dotenv

For Deepgram:

Node:

npm install @deepgram/sdk dotenv

Python:

pip install deepgram-sdk python-dotenv

4. Making API Requests

Initialization

AssemblyAI Initialization:

Node:

import { AssemblyAI } from "assemblyai";
import dotenv from "dotenv";

dotenv.config();

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY,
});

Python:

import assemblyai as aai
import os
from dotenv import load_dotenv

load_dotenv()

aai.settings.api_key = os.getenv("ASSEMBLYAI_API_KEY")

Deepgram Initialization:

Node:

import { createClient } from "@deepgram/sdk";
import dotenv from "dotenv";

dotenv.config();

// createClient is the factory used by the full examples later in this guide
const deepgram = createClient(process.env.DEEPGRAM_API_KEY);

Python:

import os
from dotenv import load_dotenv

from deepgram import (
    DeepgramClient,
    PrerecordedOptions,
    FileSource,
)

load_dotenv()

# Must match the variable name used in the .env file
API_KEY = os.getenv("DEEPGRAM_API_KEY")

deepgram = DeepgramClient(API_KEY)

Add Request Parameters

AssemblyAI:

Node:

const data = {
  audio_url: "https://dpgr.am/spacewalk.wav", // the audio_url for the audio being transcribed is included
  speech_model: "nano",
  speaker_labels: true,
};

Python:

config = aai.TranscriptionConfig(speech_model="nano", speaker_labels=True)

Deepgram:

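Node:

const options = {
  model: "nova-2",
  smart_format: true,
  diarize: true, // counterpart of AssemblyAI's speaker_labels
  // Do not include the audio URL in this object; it is passed separately
};

Python:

options = PrerecordedOptions(
    model="nova-2",
    smart_format=True,
    diarize=True,  # counterpart of AssemblyAI's speaker_labels
)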
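
If an application builds its AssemblyAI request parameters in one place, the translation to Deepgram options can be sketched as a small lookup. The mapping below is illustrative and deliberately short: it covers only speaker_labels (shown above) and punctuate, and the helper name to_deepgram_options is hypothetical. The audio source and the AssemblyAI speech_model are skipped on purpose, because the audio is passed separately and model names do not correspond one to one.

```python
from typing import Any, Dict, List, Tuple

# Illustrative mapping only (an assumption, not an official table):
# AssemblyAI option name -> Deepgram option name.
OPTION_MAP: Dict[str, str] = {
    "speaker_labels": "diarize",
    "punctuate": "punctuate",
}

# Keys that describe the audio source or the AssemblyAI model; neither belongs
# in the Deepgram options object.
SKIPPED_KEYS = {"audio", "audio_url", "speech_model"}


def to_deepgram_options(
    aai_options: Dict[str, Any], model: str = "nova-2"
) -> Tuple[Dict[str, Any], List[str]]:
    """Translate known options and report the ones that need manual review."""
    options: Dict[str, Any] = {"model": model}
    unmapped: List[str] = []
    for key, value in aai_options.items():
        if key in SKIPPED_KEYS:
            continue
        if key in OPTION_MAP:
            options[OPTION_MAP[key]] = value
        else:
            unmapped.append(key)
    return options, unmapped
```

Returning the unmapped keys rather than silently dropping them gives you a checklist of features to look up in the Deepgram documentation.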

Example: Transcribe Audio Using a Remote URL

Here is the entire code sample that shows how to transcribe audio using a remote URL.

AssemblyAI:

Node:

import { AssemblyAI } from "assemblyai";
import dotenv from "dotenv";
dotenv.config();

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY,
});

const FILE_URL = "https://dpgr.am/spacewalk.wav";

const data = {
  audio_url: FILE_URL,
  speech_model: "nano",
  speaker_labels: true,
};

const run = async () => {
  const response = await client.transcripts.transcribe(data);
  console.log(JSON.stringify(response));
};

run();

Python:

import os
from dotenv import load_dotenv

import assemblyai as aai

load_dotenv()
aai.settings.api_key = os.getenv("ASSEMBLYAI_API_KEY")

FILE_URL = "https://dpgr.am/spacewalk.wav"

config = aai.TranscriptionConfig(speech_model="nano", speaker_labels=True)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(FILE_URL, config=config)

if transcript.status == aai.TranscriptStatus.error:
    print(transcript.error)
else:
    print(transcript.text)

Deepgram:

Node:

import { createClient } from "@deepgram/sdk";
import dotenv from "dotenv";
dotenv.config();

const data = {
  url: "https://dpgr.am/spacewalk.wav",
};

const options = {
  model: "nova-2",
  diarize: true,
};

const run = async () => {
  const deepgram = createClient(process.env.DEEPGRAM_API_KEY);

  // The SDK resolves to an object holding either the result or an error
  const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
    data,
    options
  );
  if (error) throw error;
  console.dir(result, { depth: null });
};

run();

Python:

import os
from dotenv import load_dotenv

from deepgram import (
    DeepgramClient,
    PrerecordedOptions,
)

load_dotenv()

# Must match the variable name used in the .env file
API_KEY = os.getenv("DEEPGRAM_API_KEY")

AUDIO_URL = {
    "url": "https://dpgr.am/spacewalk.wav"
}


def main():
    try:
        deepgram = DeepgramClient(API_KEY)

        options = PrerecordedOptions(
            model="nova-2",
            smart_format=True,
            diarize=True,  # counterpart of AssemblyAI's speaker_labels
        )

        response = deepgram.listen.prerecorded.v("1").transcribe_url(AUDIO_URL, options)

        print(response.to_json(indent=4))

    except Exception as e:
        print(f"Exception: {e}")


if __name__ == "__main__":
    main()

Example: Transcribe Audio Using a Local File

Here is the entire code sample that shows how to transcribe audio using a local file.

AssemblyAI:

Node:

import { AssemblyAI } from "assemblyai";
import dotenv from "dotenv";
dotenv.config();

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY,
});

const AUDIO_FILE = "sample.wav";

const data = {
  audio: AUDIO_FILE,
  speech_model: "nano",
  speaker_labels: true,
};

const run = async () => {
  const response = await client.transcripts.transcribe(data);
  console.log(JSON.stringify(response));
};

run();

Python:

import assemblyai as aai
import os
from dotenv import load_dotenv


def main():
    try:
        load_dotenv()
        aai.settings.api_key = os.getenv("ASSEMBLYAI_API_KEY")

        LOCAL_FILE = "spacewalk.wav"
        config = aai.TranscriptionConfig(speech_model="nano", speaker_labels=True)

        transcriber = aai.Transcriber()
        transcript = transcriber.transcribe(LOCAL_FILE, config=config)

        if transcript.status == aai.TranscriptStatus.error:
            print(transcript.error)
        else:
            print(transcript.text)

    except Exception as e:
        print(f"Exception: {e}")


if __name__ == "__main__":
    main()

Deepgram:

Node:

import { createClient } from "@deepgram/sdk";
import fs from "fs";
import dotenv from "dotenv";
dotenv.config();

const deepgram = createClient(process.env.DEEPGRAM_API_KEY);

// Read the whole file into a Buffer; the SDK uploads it as the request body
const data = fs.readFileSync("sample.wav");

const options = {
  model: "nova-2",
  diarize: true,
};

const run = async () => {
  // The SDK resolves to an object holding either the result or an error
  const { result, error } = await deepgram.listen.prerecorded.transcribeFile(
    data,
    options
  );
  if (error) throw error;
  console.dir(result, { depth: null });
};

run();

Python:

import os
from dotenv import load_dotenv

from deepgram import (
    DeepgramClient,
    PrerecordedOptions,
    FileSource,
)

load_dotenv()
# Must match the variable name used in the .env file
API_KEY = os.getenv("DEEPGRAM_API_KEY")

AUDIO_FILE = "spacewalk.wav"


def main():
    try:
        deepgram = DeepgramClient(API_KEY)

        with open(AUDIO_FILE, "rb") as file:
            buffer_data = file.read()

        payload: FileSource = {
            "buffer": buffer_data,
        }

        options = PrerecordedOptions(
            model="nova-2",
            smart_format=True,
            diarize=True,  # counterpart of AssemblyAI's speaker_labels
        )

        response = deepgram.listen.prerecorded.v("1").transcribe_file(payload, options)

        print(response.to_json(indent=4))

    except Exception as e:
        print(f"Exception: {e}")


if __name__ == "__main__":
    main()

5. Handling Responses

Compare the JSON responses:

AssemblyAI:

{
  "id": "some_id",
  "status": "completed",
  "audio_url": "https://dpgr.am/spacewalk.wav",
  "text": "Transcript text here...",
  "words": [
    {
      "start": 255,
      "end": 767,
      "text": "Yeah.",
      "confidence": 0.97465,
      "speaker": null
    }
  ]
}

Deepgram:

{
  "metadata": {
    "transaction_key": "deprecated",
    "request_id": "unique_request_id",
    "created": "2024-02-06T19:56:16.180Z",
    "duration": 25.933313,
    "channels": 1,
    "models": ["1abfe86b-e047-4eed-858a-35e5625b41ee"],
    "model_info": {}
  },
  "results": {
    "channels": [
      {
        "alternatives": [
          {
            "transcript": "Transcript text here...",
            "confidence": 0.99902344,
            "words": [
              {
                "word": "yeah",
                "start": 0.08,
                "end": 0.32,
                "confidence": 0.9975586,
                "punctuated_word": "Yeah."
              }
            ]
          }
        ]
      }
    ]
  }
}

6. Code Migration

Adapting code to handle Deepgram's response structure involves accessing nested fields within the JSON response. For instance, response.results.channels[0].alternatives[0].transcript will give you the transcript text.
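
The difference in nesting can be sketched with two small accessors that work on the JSON shapes shown in the previous section. The function names are hypothetical, and the sketch assumes the responses are available as plain dictionaries, for example via json.loads(response.to_json()) in the Deepgram Python SDK.

```python
from typing import Any, Dict


def assemblyai_transcript(response: Dict[str, Any]) -> str:
    """AssemblyAI keeps the transcript at the top level under "text"."""
    return response.get("text") or ""


def deepgram_transcript(response: Dict[str, Any]) -> str:
    """Deepgram nests the transcript under results -> channels -> alternatives."""
    channels = response.get("results", {}).get("channels", [])
    if not channels:
        return ""
    alternatives = channels[0].get("alternatives", [])
    if not alternatives:
        return ""
    return alternatives[0].get("transcript", "")
```

Returning an empty string when a level is missing is one possible design choice; raising an exception instead may suit applications that should never receive an empty result.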

Be sure to update your data parsing logic to correctly navigate the nested response format, and thoroughly test the new code to ensure it handles various edge cases and accurately extracts the needed information.
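
Word-level data needs more than a path change. In the sample responses above, AssemblyAI reports word timings in milliseconds (255, 767) under "text", while Deepgram reports seconds (0.08, 0.32) under "word" and "punctuated_word". If downstream code expects the AssemblyAI shape, one option is a thin adapter like the hypothetical one below; it is a sketch that passes the speaker value through unchanged, so any speaker-label conversion is left to the caller.

```python
from typing import Any, Dict, List


def deepgram_words_to_assemblyai_shape(
    words: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Convert Deepgram word objects into AssemblyAI-style word objects."""
    normalized = []
    for word in words:
        normalized.append(
            {
                # Prefer the punctuated form when it is present
                "text": word.get("punctuated_word", word["word"]),
                # Seconds -> whole milliseconds
                "start": round(word["start"] * 1000),
                "end": round(word["end"] * 1000),
                "confidence": word["confidence"],
                # Only present when diarization is enabled
                "speaker": word.get("speaker"),
            }
        )
    return normalized
```

The input would typically be response["results"]["channels"][0]["alternatives"][0]["words"] from a Deepgram response parsed into a dictionary.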

7. Testing and Validation

Steps to test the integration

  1. Run the application to generate transcriptions.
  2. Validate that the responses match expected outputs.

Validating transcription accuracy and performance

  1. Compare the transcript text from both AssemblyAI and Deepgram.
  2. Evaluate the confidence scores and accuracy of the transcriptions.
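
The comparison steps above can be sketched with the Python standard library. This is an illustrative check, not a formal accuracy metric: it measures how closely the two transcripts agree with each other after lowercasing and stripping punctuation, and it averages word confidences for one provider at a time. Confidence scores come from different models, so treating them as directly comparable across providers is an assumption to make with care.

```python
import difflib
import re
from typing import Any, Dict, List


def normalize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def transcript_similarity(a: str, b: str) -> float:
    """Return a 0..1 agreement ratio between two transcripts, word by word."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def mean_confidence(words: List[Dict[str, Any]]) -> float:
    """Average the per-word confidence values; 0.0 for an empty list."""
    if not words:
        return 0.0
    return sum(word["confidence"] for word in words) / len(words)
```

A low similarity ratio does not say which transcript is wrong; it only flags audio worth reviewing against a human-checked reference.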

Conclusion

Migrating from AssemblyAI to Deepgram involves a few changes to how you initialize and use the SDK, plus handling a differently structured response. By following this guide, you should be able to move your JavaScript or Python STT applications to Deepgram's STT services. For more advanced features, refer to Deepgram's documentation and explore additional functionality that can enhance your application's capabilities.

