Getting Started with Pre-recorded Audio
Prefer Spanish? See the tutorial on the API endpoint for pre-recorded audio.
In this guide, you’ll learn how to automatically transcribe pre-recorded audio files using Deepgram’s SDKs, which are supported for use with the Deepgram API.
Before You Begin
Before you run the code, you’ll need to do a few things.
Create a Deepgram Account
Before you can use Deepgram products, you’ll need to create a Deepgram account. Signup is free and includes:
- $150 in credit, which gives you access to:
  - all base models
  - pre-recorded and live streaming functionality
  - all features
Create a Deepgram API Key
To access Deepgram’s API, you’ll need to create a Deepgram API Key. Make note of your API Key; you will need it later.
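One common way to keep the key on hand without hard-coding it is an environment variable. The following is a minimal sketch; the variable name DEEPGRAM_API_KEY and the helper load_api_key are illustrative assumptions, not names this guide prescribes.

```python
import os


def load_api_key(env=os.environ, name="DEEPGRAM_API_KEY"):
    """Return the API key stored under `name` in the given environment mapping.

    The variable name is an assumption made for this sketch.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable to your Deepgram API Key")
    return key
```

Calling load_api_key() with no arguments reads from the current process environment; passing a plain dictionary instead makes the helper easy to exercise without touching real credentials.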
Configure Environment
We provide sample scripts in Python and Node.js and assume you have already configured either a Python or Node development environment. System requirements will vary depending on the programming language you use:
- Node.js: node >= 14.14.37
- Python: python >= 3.7
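If you are following along in Python, a quick way to confirm that the interpreter satisfies the requirement above is to compare sys.version_info against the minimum. This is only a sketch; the helper name meets_minimum is hypothetical.

```python
import sys

MIN_PYTHON = (3, 7)  # minimum Python version listed in the requirements above


def meets_minimum(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)
```

Calling meets_minimum() with no arguments checks the interpreter that is running the script.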
If you get stuck at any point, help is just a click away! Contact Support.
Transcribe Audio
Once you have your API Key, it’s time to transcribe audio! The instructions below will guide you through the process of creating a sample application, installing the Deepgram SDK, configuring code with your own Deepgram API Key and pre-recorded audio to transcribe, and finally, building and running the application.
Choose an Audio File
Download our sample audio file, or record your own using your device’s microphone.
Install the SDK
Open your terminal, navigate to the location on your drive where you want to create your project, and install the Deepgram SDK.
Write the Code
In your terminal, create a new file in your project’s location, and populate it with code.
Be sure to replace YOUR_DEEPGRAM_API_KEY, YOUR_FILE_LOCATION, and YOUR_FILE_MIME_TYPE with your Deepgram API Key, the location of the file you want to transcribe, and the mime type of the file you want to transcribe, respectively.
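As a rough illustration of what this step does, the sketch below prepares an equivalent request against Deepgram's REST endpoint using only the Python standard library instead of the SDK. The helper name build_transcription_request is hypothetical, and the commented usage assumes network access and a valid key.

```python
import urllib.request

# REST endpoint for pre-recorded audio, with punctuation enabled.
DEEPGRAM_URL = "https://api.deepgram.com/v1/listen?punctuate=true"


def build_transcription_request(api_key, audio_bytes, mimetype):
    """Prepare (but do not send) a POST request carrying the raw audio."""
    return urllib.request.Request(
        DEEPGRAM_URL,
        data=audio_bytes,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": mimetype,
        },
    )


# Example usage (requires network access and a valid key):
#
#   import json
#   with open("YOUR_FILE_LOCATION", "rb") as audio:
#       request = build_transcription_request(
#           "YOUR_DEEPGRAM_API_KEY", audio.read(), "YOUR_FILE_MIME_TYPE")
#   with urllib.request.urlopen(request) as reply:
#       response = json.load(reply)
#   print(response["results"]["channels"][0]["alternatives"][0]["transcript"])
```

Keeping request construction separate from sending makes it possible to inspect the headers and body before any audio leaves your machine.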
Start the Application
Run your application from the terminal.
See Results
Because you run the application from the terminal, your transcripts will appear in the terminal output.
Deepgram does not store transcripts, so the Deepgram API response is the only opportunity to retrieve the transcript. Make sure to save output or return transcriptions to a callback URL for custom processing.
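Since the response is your only copy of the transcript, one simple option is to write it to disk as soon as it arrives. A minimal sketch, assuming the response has already been parsed into a Python dictionary; the helper name save_response and the file path are illustrative.

```python
import json
from pathlib import Path


def save_response(response, path):
    """Write the full API response to `path` as JSON and return the path."""
    path = Path(path)
    path.write_text(
        json.dumps(response, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return path
```

For example, save_response(response, "transcript.json") keeps the metadata alongside the transcript, which helps when you later need the request_id for support questions.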
Analyze the Response
When the file is finished processing (often after only a few seconds), you’ll receive a JSON response:
{
  "metadata": {
    "transaction_key": "Ha0aVG...",
    "request_id": "se24UY...",
    "sha256": "2d5b81...",
    "created": "2021-07-08T09:11:38.593Z",
    "duration": 19.0,
    "channels": 1
  },
  "results": {
    "channels": [
      {
        "alternatives": [
          {
            "transcript": "Yep. I said it before, and I'll say it again. Life moves pretty fast. You don't stop and look around once in a while. You could miss it. Thank.",
            "confidence": 0.9757011,
            "words": [
              {
                "word": "yep",
                "start": 5.66,
                "end": 5.94,
                "confidence": 0.994987,
                "punctuated_word": "Yep."
              },
              {
                "word": "i",
                "start": 7.2344832,
                "end": 7.434014,
                "confidence": 0.8217165,
                "punctuated_word": "I"
              },
              {
                "word": "said",
                "start": 7.434014,
                "end": 7.5537324,
                "confidence": 0.979774,
                "punctuated_word": "said"
              },
              ...
            ]
          }
        ]
      }
    ]
  }
}
In this default response, we see:
- transcript: the transcript for the audio segment being processed.
- confidence: a floating point value between 0 and 1 that indicates overall transcript reliability. Larger values indicate higher confidence.
- words: an array containing each word in the transcript, along with its start time and end time (in seconds) from the beginning of the audio stream, and a confidence value.

Because we passed the punctuate: true option to the transcription.prerecorded method, each word object also includes its punctuated_word value, which contains the transformed word after punctuation and capitalization are applied.
By default, Deepgram applies its general AI model, which is a good, general-purpose model for everyday situations.
What’s Next?
Now that you’ve gotten a transcript for pre-recorded audio, enhance your knowledge by exploring the following areas.
Customize Transcripts
To customize the transcripts you receive, you can send a variety of parameters to the Deepgram API.
For example, if your audio is in Spanish rather than English, you can pass the language parameter with the es option to the transcription.prerecorded method in the previous examples.
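When calling the REST endpoint directly rather than through the SDK, the same options travel as query parameters. The sketch below builds such a URL with the standard library; the helper name listen_url is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.deepgram.com/v1/listen"


def listen_url(**options):
    """Build the request URL, encoding booleans as lowercase strings."""
    query = {
        name: str(value).lower() if isinstance(value, bool) else value
        for name, value in options.items()
    }
    return f"{BASE_URL}?{urlencode(query)}" if query else BASE_URL


spanish_url = listen_url(punctuate=True, language="es")
# spanish_url == "https://api.deepgram.com/v1/listen?punctuate=true&language=es"
```

Booleans are lowercased because Python's str(True) would otherwise produce "True" in the query string.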
To learn more about the languages available with Deepgram, see the Language feature guide. To learn more about the many ways you can customize your results with Deepgram’s API, check out the Deepgram API Reference.
Explore Use Cases
Time to learn about the different ways you can use Deepgram products to help you meet your business objectives. Explore Deepgram’s use cases.
Transcribe Streaming Audio
Now that you know how to transcribe pre-recorded audio, check out how you can use Deepgram to transcribe streaming audio in real time. To learn more, see Getting Started with Streaming Audio.