Migrating from AssemblyAI Speech-to-Text to Deepgram
A step-by-step guide for developers to migrate from AssemblyAI to Deepgram Speech-to-Text.
This guide provides a detailed step-by-step process for developers transitioning from AssemblyAI speech-to-text (STT) services to Deepgram's STT services using the Deepgram SDKs. The goal is to ensure a smooth migration by highlighting differences and demonstrating equivalent functionalities between the two platforms.
Getting Started
Before you can use Deepgram, you'll need to create a Deepgram account. Signup is free and includes $200 in free credit and access to all of Deepgram's features!
Before you start, you'll need to follow the steps in the Make Your First API Request guide to obtain a Deepgram API key, and configure your environment if you are choosing to use a Deepgram SDK.
Prerequisites
Before proceeding with the migration, ensure you meet the following prerequisites:
Required Tools
- A code editor (e.g., Visual Studio Code)
- Terminal or command prompt access
- Node.js and/or Python installed (depending on which SDK you plan to use)
API Keys
- AssemblyAI API Key: Obtain from your AssemblyAI dashboard.
- Deepgram API Key: Sign up on Deepgram's platform and get your API key in the Deepgram Console.
Overview of AssemblyAI and Deepgram APIs
Both AssemblyAI and Deepgram provide robust speech-to-text APIs, but they have different endpoints, request parameters, and response structures. This guide will map AssemblyAI functionalities to their Deepgram equivalents.
Step-by-Step Migration Instructions
1. Setting Up the Environment
Node: Ensure you have Node.js installed. If not, download and install it from the Node.js website.
Python: Ensure you have Python installed. If not, download and install it from the Python website.
2. Configuring API Keys
How to configure AssemblyAI API key
Create a .env file in your project directory and add your AssemblyAI API key:
ASSEMBLYAI_API_KEY=your_assemblyai_api_key_here
How to configure Deepgram API key
Similarly, add your Deepgram API key to the .env file:
DEEPGRAM_API_KEY=your_deepgram_api_key_here
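Once both keys are in place, a quick sanity check can confirm they load correctly before you wire up either SDK. Here is a minimal sketch (the `require_env` helper is illustrative, not part of either SDK):

```python
import os

try:
    # Optional: pip install python-dotenv to read keys from a .env file
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    pass  # fall back to variables already exported in the shell

def require_env(name: str) -> str:
    """Return an environment variable's value, failing fast if it is unset."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value

# Example: require_env("DEEPGRAM_API_KEY") raises unless the key is configured
```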
3. Installing the SDK and Dependencies
For AssemblyAI:
npm install assemblyai dotenv
pip install assemblyai python-dotenv
For Deepgram:
npm install @deepgram/sdk dotenv
pip install deepgram-sdk python-dotenv
4. Making API Requests
Initialization
AssemblyAI Initialization:
import { AssemblyAI } from "assemblyai";
import dotenv from "dotenv";
dotenv.config();
const client = new AssemblyAI({
apiKey: process.env.ASSEMBLYAI_API_KEY,
});
import assemblyai as aai
import os
from dotenv import load_dotenv
load_dotenv()
aai.settings.api_key = os.getenv("ASSEMBLYAI_API_KEY")
Deepgram Initialization:
import { createClient } from "@deepgram/sdk";
import dotenv from "dotenv";
dotenv.config();
const deepgram = createClient(process.env.DEEPGRAM_API_KEY);
import os
from dotenv import load_dotenv
from deepgram import (
DeepgramClient,
PrerecordedOptions,
FileSource,
)
load_dotenv()
API_KEY = os.getenv("DEEPGRAM_API_KEY")
deepgram = DeepgramClient(API_KEY)
Add Request Parameters
AssemblyAI:
const data = {
audio_url: "https://dpgr.am/spacewalk.wav", // AssemblyAI takes the audio URL inside the request body
speech_model: "nano",
speaker_labels: true,
};
config = aai.TranscriptionConfig(speech_model="nano", speaker_labels=True)
Deepgram:
const options = {
model: "nova-2",
smart_format: true,
// Do not include the audio_url in this object
};
options = PrerecordedOptions(
model="nova-2",
smart_format=True,
)
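Several AssemblyAI parameters have rough Deepgram counterparts, e.g. `speaker_labels` corresponds to Deepgram's `diarize`. A small helper can translate a config dict during migration; the mapping below is illustrative and not exhaustive, so verify each feature against Deepgram's feature documentation before relying on it:

```python
# Illustrative mapping of AssemblyAI request fields to Deepgram options.
PARAM_MAP = {
    "speaker_labels": "diarize",    # speaker identification
    "punctuate": "punctuate",       # punctuation
    "format_text": "smart_format",  # readability formatting
}

def to_deepgram_options(assemblyai_config: dict) -> dict:
    """Translate known AssemblyAI fields; drop anything with no direct equivalent."""
    options = {"model": "nova-2"}  # choose a Deepgram model explicitly
    for aai_key, dg_key in PARAM_MAP.items():
        if aai_key in assemblyai_config:
            options[dg_key] = assemblyai_config[aai_key]
    return options

print(to_deepgram_options({"speech_model": "nano", "speaker_labels": True}))
```

Note that `speech_model` is intentionally dropped: AssemblyAI model names have no Deepgram equivalent, so the helper pins a Deepgram model instead.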
5. Example: Transcribe Audio Using a Remote URL
Here is the entire code sample that shows how to transcribe audio using a remote URL.
AssemblyAI:
import { AssemblyAI } from "assemblyai";
import dotenv from "dotenv";
dotenv.config();
const client = new AssemblyAI({
apiKey: process.env.ASSEMBLYAI_API_KEY,
});
const FILE_URL = "https://dpgr.am/spacewalk.wav";
const data = {
audio_url: FILE_URL,
speech_model: "nano",
speaker_labels: true,
};
const run = async () => {
const response = await client.transcripts.transcribe(data);
console.log(JSON.stringify(response));
};
run();
import os
from dotenv import load_dotenv
import assemblyai as aai
load_dotenv()
aai.settings.api_key = os.getenv("ASSEMBLYAI_API_KEY")
FILE_URL = "https://dpgr.am/spacewalk.wav"
config = aai.TranscriptionConfig(speech_model="nano", speaker_labels=True)
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(FILE_URL, config=config)
if transcript.status == aai.TranscriptStatus.error:
print(transcript.error)
else:
print(transcript.text)
Deepgram:
import { createClient } from "@deepgram/sdk";
import dotenv from "dotenv";
dotenv.config();
const data = {
url: "https://dpgr.am/spacewalk.wav",
};
const options = {
model: "nova-2",
diarize: true,
};
const run = async () => {
const deepgram = createClient(process.env.DEEPGRAM_API_KEY);
const response = await deepgram.listen.prerecorded.transcribeUrl(
data,
options
);
console.dir(response, { depth: null });
};
run();
import os
from dotenv import load_dotenv
from deepgram import (
DeepgramClient,
PrerecordedOptions,
)
load_dotenv()
API_KEY = os.getenv("DEEPGRAM_API_KEY")
AUDIO_URL = {
"url": "https://dpgr.am/spacewalk.wav"
}
def main():
try:
deepgram = DeepgramClient(API_KEY)
options = PrerecordedOptions(
model="nova-2",
smart_format=True,
)
response = deepgram.listen.prerecorded.v("1").transcribe_url(AUDIO_URL, options)
print(response.to_json(indent=4))
except Exception as e:
print(f"Exception: {e}")
if __name__ == "__main__":
main()
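If you prefer not to depend on an SDK at all, the same request can go directly to Deepgram's pre-recorded REST endpoint at `https://api.deepgram.com/v1/listen`. A sketch using the `requests` library (the `build_listen_request` helper is our own, introduced here only to keep the pieces visible):

```python
import os
import requests  # pip install requests

DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"

def build_listen_request(audio_url: str, api_key: str, model: str = "nova-2") -> dict:
    """Assemble the parts of a Deepgram pre-recorded transcription request."""
    return {
        "url": DEEPGRAM_LISTEN_URL,
        "params": {"model": model, "smart_format": "true"},
        "headers": {"Authorization": f"Token {api_key}"},
        "json": {"url": audio_url},  # hosted-audio URL goes in the body
    }

if __name__ == "__main__" and os.getenv("DEEPGRAM_API_KEY"):
    req = build_listen_request("https://dpgr.am/spacewalk.wav",
                               os.environ["DEEPGRAM_API_KEY"])
    response = requests.post(req["url"], params=req["params"],
                             headers=req["headers"], json=req["json"], timeout=60)
    response.raise_for_status()
    body = response.json()
    print(body["results"]["channels"][0]["alternatives"][0]["transcript"])
```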
6. Example: Transcribe Audio Using a Local File
Here is the entire code sample that shows how to transcribe audio using a local file.
AssemblyAI:
import { AssemblyAI } from "assemblyai";
import dotenv from "dotenv";
dotenv.config();
const client = new AssemblyAI({
apiKey: process.env.ASSEMBLYAI_API_KEY,
});
const AUDIO_FILE = "sample.wav";
const data = {
audio: AUDIO_FILE,
speech_model: "nano",
speaker_labels: true,
};
const run = async () => {
const response = await client.transcripts.transcribe(data);
console.log(JSON.stringify(response));
};
run();
import assemblyai as aai
import os
from dotenv import load_dotenv
def main():
try:
load_dotenv()
aai.settings.api_key = os.getenv("ASSEMBLYAI_API_KEY")
LOCAL_FILE = "spacewalk.wav"
config = aai.TranscriptionConfig(speech_model="nano", speaker_labels=True)
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(LOCAL_FILE, config=config)
if transcript.status == aai.TranscriptStatus.error:
print(transcript.error)
else:
print(transcript.text)
except Exception as e:
print(f"Exception: {e}")
if __name__ == "__main__":
main()
Deepgram:
import { createClient } from "@deepgram/sdk";
import fs from "fs";
import dotenv from "dotenv";
dotenv.config();
const deepgram = createClient(process.env.DEEPGRAM_API_KEY);
const data = fs.readFileSync("sample.wav");
const options = {
model: "nova-2",
diarize: true,
};
const run = async () => {
const response = await deepgram.listen.prerecorded.transcribeFile(
data,
options
);
console.dir(response, { depth: null });
};
run();
import os
from dotenv import load_dotenv
from deepgram import (
DeepgramClient,
PrerecordedOptions,
FileSource,
)
load_dotenv()
API_KEY = os.getenv("DEEPGRAM_API_KEY")
AUDIO_FILE = "spacewalk.wav"
def main():
try:
deepgram = DeepgramClient(API_KEY)
with open(AUDIO_FILE, "rb") as file:
buffer_data = file.read()
payload: FileSource = {
"buffer": buffer_data,
}
options = PrerecordedOptions(
model="nova-2",
smart_format=True,
)
response = deepgram.listen.prerecorded.v("1").transcribe_file(payload, options)
print(response.to_json(indent=4))
except Exception as e:
print(f"Exception: {e}")
if __name__ == "__main__":
main()
7. Handling Responses
Compare the JSON responses:
AssemblyAI:
{
"id": "some_id",
"status": "completed",
"audio_url": "https://dpgr.am/spacewalk.wav",
"text": "Transcript text here...",
"words": [
{
"start": 255,
"end": 767,
"text": "Yeah.",
"confidence": 0.97465,
"speaker": null
}
]
}
Deepgram:
{
"metadata": {
"transaction_key": "deprecated",
"request_id": "unique_request_id",
"created": "2024-02-06T19:56:16.180Z",
"duration": 25.933313,
"channels": 1,
"models": ["1abfe86b-e047-4eed-858a-35e5625b41ee"],
"model_info": {}
},
"results": {
"channels": [
{
"alternatives": [
{
"transcript": "Transcript text here...",
"confidence": 0.99902344,
"words": [
{
"word": "yeah",
"start": 0.08,
"end": 0.32,
"confidence": 0.9975586,
"punctuated_word": "Yeah."
}
]
}
]
}
]
}
}
8. Code Migration
Adapting code to handle Deepgram's response structure involves accessing nested fields within the JSON response. For instance, response.results.channels[0].alternatives[0].transcript
will give you the transcript text.
Be sure to update your data parsing logic to correctly navigate the nested response format, and thoroughly test the new code to ensure it handles various edge cases and accurately extracts the needed information.
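As one approach, a small helper can pull the transcript and per-word details out of the nested structure defensively; the field names follow the Deepgram response shape shown above, and the helper itself is a sketch rather than part of the SDK:

```python
def extract_transcript(response: dict) -> tuple:
    """Return (transcript, words) for the first channel of a Deepgram response."""
    channels = response.get("results", {}).get("channels", [])
    if not channels or not channels[0].get("alternatives"):
        return "", []
    best = channels[0]["alternatives"][0]  # first (highest-confidence) alternative
    return best.get("transcript", ""), best.get("words", [])

# Trimmed-down response matching the structure shown earlier
sample = {
    "results": {
        "channels": [
            {"alternatives": [{"transcript": "Yeah.", "confidence": 0.999,
                               "words": [{"word": "yeah", "start": 0.08, "end": 0.32}]}]}
        ]
    }
}
text, words = extract_transcript(sample)
print(text)              # Yeah.
print(words[0]["word"])  # yeah
```

Using `.get()` with defaults keeps the parser from raising on empty or malformed responses, which is handy while the migration is still being tested.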
9. Testing and Validation
Steps to test the integration
- Run the application to generate transcriptions.
- Validate that the responses match expected outputs.
Validating transcription accuracy and performance
- Compare the transcript text from both AssemblyAI and Deepgram.
- Evaluate the confidence scores and accuracy of the transcriptions.
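One concrete way to compare the two providers' transcripts is word error rate (WER): the word-level edit distance between a reference and a hypothesis, divided by the reference length. A minimal sketch (normalization here is deliberately simple; production comparisons usually strip punctuation too):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate on lowercased, whitespace-split tokens."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Standard dynamic-programming edit distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("yeah to walk on the moon", "yeah to walk on moon"))  # one miss in six words, ~0.167
```

Running both providers' output through the same WER calculation against a hand-checked reference gives a like-for-like accuracy number alongside the confidence scores.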
Conclusion
Migrating from AssemblyAI to Deepgram involves making a few changes to how you initialize and use the SDK, as well as handling a different response structure. By following this guide, you should be able to seamlessly transition your JavaScript- or Python-based STT applications to Deepgram's STT services. For more advanced features, refer to Deepgram's documentation and explore additional functionality that can enhance your application's capabilities.