Migrating from AssemblyAI Speech-to-Text to Deepgram

A step-by-step guide for developers to migrate from AssemblyAI to Deepgram Speech-to-Text.

This guide walks developers through moving from AssemblyAI's speech-to-text (STT) service to Deepgram's STT service using the Deepgram SDKs. It highlights the differences between the two platforms and shows the equivalent functionality side by side.

Getting Started

Before you can use Deepgram, you’ll need to create a Deepgram account. Signup is free and includes $200 in free credit and access to all of Deepgram’s features!

Then follow the steps in the Make Your First API Request guide to obtain a Deepgram API key and, if you plan to use a Deepgram SDK, to configure your environment.

Prerequisites

Before proceeding with the migration, ensure you meet the following prerequisites:

Required Tools

  • A code editor (e.g., Visual Studio Code)
  • Terminal or command prompt access
  • Node.js or Python installed

API Keys

  • AssemblyAI API Key: Obtain from your AssemblyAI dashboard.
  • Deepgram API Key: Sign up on Deepgram’s platform and get your API key in the Deepgram Console.

Overview of AssemblyAI and Deepgram APIs

Both AssemblyAI and Deepgram provide robust speech-to-text APIs, but they have different endpoints, request parameters, and response structures. This guide will map AssemblyAI functionalities to their Deepgram equivalents.

Step-by-Step Migration Instructions

1. Setting Up the Environment

Node: Ensure you have Node.js installed. If not, download and install it from the Node.js website.

Python: Ensure you have Python installed. If not, download and install it from the Python website.

2. Configuring API Keys

How to configure AssemblyAI API key

Create a .env file in your project directory and add your AssemblyAI API key:

.env
ASSEMBLYAI_API_KEY=your_assemblyai_api_key_here

How to configure Deepgram API key

Similarly, add your Deepgram API key to the .env file:

.env
DEEPGRAM_API_KEY=your_deepgram_api_key_here
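
A missing or misnamed key tends to surface later as an authentication error, so it can help to fail fast at startup. The sketch below assumes the variable names used above; requireEnv is a hypothetical helper and not part of either SDK.

```javascript
// Hypothetical helper (not part of either SDK): read a required
// environment variable and fail with a clear message if it is unset or blank.
function requireEnv(name) {
  const value = process.env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value.trim();
}

// Usage, once the .env file has been loaded (for example with dotenv.config()):
//   const deepgramKey = requireEnv("DEEPGRAM_API_KEY");
```

Trimming the value guards against stray whitespace that sometimes ends up in .env files.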

3. Installing the SDK and Dependencies

For AssemblyAI:

$ npm install assemblyai dotenv

For Deepgram:

$ npm install @deepgram/sdk dotenv

4. Making API Requests

Initialization

AssemblyAI Initialization:

import { AssemblyAI } from "assemblyai";
import dotenv from "dotenv";

// Load ASSEMBLYAI_API_KEY from the .env file
dotenv.config();

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY,
});

Deepgram Initialization:

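import { createClient } from "@deepgram/sdk";
import dotenv from "dotenv";

// Load DEEPGRAM_API_KEY from the .env file
dotenv.config();

// createClient is the same entry point the full examples in this guide use
const deepgram = createClient(process.env.DEEPGRAM_API_KEY);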

Add Request Parameters

AssemblyAI:

const data = {
  audio_url: "https://dpgr.am/spacewalk.wav", // the audio URL is part of the request parameters
  speech_model: "nano",
  speaker_labels: true,
};

Deepgram:

const options = {
  model: "nova-3",
  smart_format: true,
  diarize: true, // counterpart of AssemblyAI's speaker_labels
  // Do not include the audio URL in this object; it is passed separately
};
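
If existing code builds AssemblyAI-style request objects in many places, a small adapter can keep the migration local. The following is a minimal sketch under stated assumptions: toDeepgramRequest is a hypothetical name, only the two parameters shown above are mapped, and the model is chosen explicitly because this guide does not define a mapping from AssemblyAI's speech_model values to Deepgram models.

```javascript
// Sketch: translate an AssemblyAI-style request object into the two
// values the Deepgram examples in this guide use (audio source + options).
// The function name and the default model are illustrative assumptions.
function toDeepgramRequest(assemblyParams, model = "nova-3") {
  const { audio_url, speaker_labels } = assemblyParams;
  if (!audio_url) {
    throw new Error("audio_url is required for URL-based transcription");
  }
  return {
    // The audio source travels separately from the options
    source: { url: audio_url },
    options: {
      model, // no mapping from speech_model is attempted here
      diarize: Boolean(speaker_labels), // speaker_labels -> diarize
    },
  };
}
```

The returned source and options can then be passed as the first and second arguments of the URL transcription call.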

5. Example: Transcribe Audio Using a Remote URL

Here is the entire code sample that shows how to transcribe audio using a remote URL.

AssemblyAI:

import { AssemblyAI } from "assemblyai";
import dotenv from "dotenv";
dotenv.config();

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY,
});

const FILE_URL = "https://dpgr.am/spacewalk.wav";

const data = {
  audio_url: FILE_URL,
  speech_model: "nano",
  speaker_labels: true,
};

const run = async () => {
  const response = await client.transcripts.transcribe(data);
  console.log(JSON.stringify(response));
};

run();

Deepgram:

import { createClient } from "@deepgram/sdk";
import dotenv from "dotenv";
dotenv.config();

// The audio source is passed separately from the transcription options
const data = {
  url: "https://dpgr.am/spacewalk.wav",
};

const options = {
  model: "nova-3",
  diarize: true,
};

const run = async () => {
  const deepgram = createClient(process.env.DEEPGRAM_API_KEY);

  // The SDK call resolves to an object with result and error properties
  const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
    data,
    options
  );
  if (error) throw error;
  console.dir(result, { depth: null });
};

run();

6. Example: Transcribe Audio Using a Local File

Here is the entire code sample that shows how to transcribe audio using a local file.

AssemblyAI:

import { AssemblyAI } from "assemblyai";
import dotenv from "dotenv";
dotenv.config();

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY,
});

const AUDIO_FILE = "sample.wav";

const data = {
  audio: AUDIO_FILE,
  speech_model: "nano",
  speaker_labels: true,
};

const run = async () => {
  const response = await client.transcripts.transcribe(data);
  console.log(JSON.stringify(response));
};

run();

Deepgram:

import { createClient } from "@deepgram/sdk";
import fs from "fs";
import dotenv from "dotenv";
dotenv.config();

const deepgram = createClient(process.env.DEEPGRAM_API_KEY);

// Read the local audio file into a Buffer
const data = fs.readFileSync("sample.wav");

const options = {
  model: "nova-3",
  diarize: true,
};

const run = async () => {
  // The SDK call resolves to an object with result and error properties
  const { result, error } = await deepgram.listen.prerecorded.transcribeFile(
    data,
    options
  );
  if (error) throw error;
  console.dir(result, { depth: null });
};

run();

7. Handling Responses

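The two services return differently shaped JSON. AssemblyAI returns a flat object with the transcript in text and word timings in milliseconds. Deepgram nests the transcript under results.channels[].alternatives[] and reports word timings in seconds. Compare the JSON responses: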

AssemblyAI:

{
  "id": "some_id",
  "status": "completed",
  "audio_url": "https://dpgr.am/spacewalk.wav",
  "text": "Transcript text here...",
  "words": [
    {
      "start": 255,
      "end": 767,
      "text": "Yeah.",
      "confidence": 0.97465,
      "speaker": null
    }
  ]
}

Deepgram:

{
  "metadata": {
    "transaction_key": "deprecated",
    "request_id": "unique_request_id",
    "created": "2024-02-06T19:56:16.180Z",
    "duration": 25.933313,
    "channels": 1,
    "models": ["1abfe86b-e047-4eed-858a-35e5625b41ee"],
    "model_info": {}
  },
  "results": {
    "channels": [
      {
        "alternatives": [
          {
            "transcript": "Transcript text here...",
            "confidence": 0.99902344,
            "words": [
              {
                "word": "yeah",
                "start": 0.08,
                "end": 0.32,
                "confidence": 0.9975586,
                "punctuated_word": "Yeah."
              }
            ]
          }
        ]
      }
    ]
  }
}
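
The word objects in the two samples differ in field names and in time units: the AssemblyAI sample uses milliseconds (255, 767), while the Deepgram sample uses seconds (0.08, 0.32). If downstream code expects the AssemblyAI word layout, a conversion along these lines may help. This is a sketch, not an official mapping: toAssemblyStyleWords is a hypothetical name, and it assumes diarized Deepgram responses carry a per-word speaker field, which the sample above does not show.

```javascript
// Sketch: reshape Deepgram word objects into the AssemblyAI word layout
// shown above. `toAssemblyStyleWords` is a hypothetical helper name.
function toAssemblyStyleWords(deepgramWords) {
  return deepgramWords.map((w) => ({
    text: w.punctuated_word ?? w.word, // prefer the punctuated form when present
    start: Math.round(w.start * 1000), // seconds -> milliseconds
    end: Math.round(w.end * 1000),
    confidence: w.confidence,
    speaker: w.speaker ?? null, // assumed to exist only when diarization is on
  }));
}
```

Speaker identifiers are passed through unchanged, so code that compares them against AssemblyAI's labels would still need its own mapping.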

8. Code Migration

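Deepgram nests the transcript several levels deep, so code that read AssemblyAI's top-level text field needs a longer path. In the response body, results.channels[0].alternatives[0].transcript holds the transcript text. When you call the JavaScript SDK as shown above, the resolved value wraps this body in a result property alongside error, so the body is reached through that property first.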

Be sure to update your data parsing logic to correctly navigate the nested response format, and thoroughly test the new code to ensure it handles various edge cases and accurately extracts the needed information.
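
One way to keep the nested access in a single place is a small accessor that tolerates missing levels. The sketch below is an illustration rather than part of the Deepgram SDK: getDeepgramTranscript is a hypothetical name, and it accepts either the raw response body or an object that wraps the body in a result property.

```javascript
// Sketch: pull the transcript text out of a Deepgram response without
// throwing when a level of nesting is missing. Returns "" if not found.
function getDeepgramTranscript(response) {
  // Unwrap a `result` property if present, otherwise treat the input as the body
  const body = response?.result ?? response;
  const transcript =
    body?.results?.channels?.[0]?.alternatives?.[0]?.transcript;
  return typeof transcript === "string" ? transcript : "";
}
```

Returning an empty string keeps callers simple; code that needs to distinguish silence from a malformed response may prefer to throw instead.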

9. Testing and Validation

Steps to test the integration

  1. Run the application to generate transcriptions.
  2. Validate that the responses match expected outputs.

Validating transcription accuracy and performance

  1. Compare the transcript text from both AssemblyAI and Deepgram.
  2. Evaluate the confidence scores and accuracy of the transcriptions.
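
For the comparison step, a common measure is word error rate (WER): the word-level edit distance between a reference transcript and a candidate, divided by the number of reference words. The sketch below is one simple way to compute it. The normalization rules (lowercasing, stripping punctuation) are assumptions to adjust for your data, and a meaningful accuracy figure needs a human-verified reference; comparing the two services' outputs against each other only measures how much they disagree.

```javascript
// Sketch: word error rate between a reference transcript and a hypothesis,
// using word-level Levenshtein distance. Normalization is an assumption.
function normalizeWords(text) {
  return text
    .toLowerCase()
    .replace(/[^\p{L}\p{N}'\s]/gu, "") // drop punctuation, keep apostrophes
    .split(/\s+/)
    .filter(Boolean);
}

function wordErrorRate(reference, hypothesis) {
  const ref = normalizeWords(reference);
  const hyp = normalizeWords(hypothesis);
  // Convention used here: an empty reference scores 0 only if the hypothesis is empty too
  if (ref.length === 0) return hyp.length === 0 ? 0 : 1;

  // prev[j] = edit distance between the reference words processed so far
  // and the first j hypothesis words
  let prev = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const substitution = prev[j - 1] + (ref[i - 1] === hyp[j - 1] ? 0 : 1);
      curr[j] = Math.min(substitution, prev[j] + 1, curr[j - 1] + 1);
    }
    prev = curr;
  }
  return prev[hyp.length] / ref.length;
}
```

Running both services' transcripts through wordErrorRate against the same reference gives two numbers that can be compared directly.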

Conclusion

Migrating from AssemblyAI to Deepgram involves making a few changes to how you initialize and use the SDK, and handling different response structures. By following this guide, you should be able to seamlessly transition your JavaScript-based STT applications to utilize Deepgram’s STT services. For more advanced features, refer to Deepgram’s documentation and explore additional functionalities that can enhance your application’s capabilities.

