Transcribing Pre-recorded Audio
In this guide, you’ll learn how to automatically transcribe pre-recorded audio from both local and remote files using Deepgram’s off-the-shelf, general-purpose AI model. We will also show you how to use some basic parameters to customize your transcript.
The examples in this guide use curl rather than Deepgram SDKs. To learn how to transcribe pre-recorded audio with Deepgram using our SDKs, see Getting Started with Pre-recorded Audio.
Before You Begin
Create a Deepgram Account
Before you can use Deepgram products, you'll need to create a Deepgram account. Signup is free and includes:
- $200 in credit, which gives you access to:
  - all base models
  - pre-recorded and streaming functionality
  - all features
Create a Deepgram API Key
To access Deepgram’s API, you'll need to create a Deepgram API Key. Make note of your API Key; you will need it later.
Transcribe Audio
Deepgram’s API allows you to transcribe both local files and remote files that are publicly accessible.
If you don’t have an audio file of your own, you can use our sample WAV file.
Depending on the location of the audio you would like to transcribe, run one of the following sample curl commands.
For easy-to-read output, we highly recommend that you pipe the response from these requests through jq.
Transcribe a Local File
To transcribe audio from a local file on your computer, run the following in a terminal or your favorite API client.
Be sure to replace the placeholder YOUR_DEEPGRAM_API_KEY with the Deepgram API Key you created earlier in this tutorial.
curl \
  --request POST \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'Content-Type: audio/wav' \
  --data-binary @youraudio.wav \
  --url 'https://api.deepgram.com/v1/listen'
Transcribe a Remote File
To transcribe audio from a publicly accessible remote file (e.g., one hosted in AWS S3 or on another server), run the following in a terminal or your favorite API client.
Be sure to replace the placeholder YOUR_DEEPGRAM_API_KEY with the Deepgram API Key you created earlier in this tutorial.
curl \
  --request POST \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{"url":"https://static.deepgram.com/examples/interview_speech-analytics.wav"}' \
  --url 'https://api.deepgram.com/v1/listen'
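If you run the remote-file request often, it can be convenient to wrap it in a small shell function so the API Key is read from the environment instead of being pasted into each command. The following is only a sketch under stated assumptions: the function name transcribe_remote and the environment variable name DEEPGRAM_API_KEY are hypothetical choices for this example, not part of Deepgram's tooling, and the audio URL is assumed to contain no double quotes or backslashes.

```shell
# Hypothetical helper: request a transcript for a publicly accessible audio URL.
# Assumes the API key is exported as DEEPGRAM_API_KEY (a name chosen for this sketch)
# and that the URL passed as $1 contains no double quotes or backslashes.
transcribe_remote() {
  curl \
    --silent \
    --request POST \
    --header "Authorization: Token ${DEEPGRAM_API_KEY:?export DEEPGRAM_API_KEY first}" \
    --header 'Content-Type: application/json' \
    --data "$(printf '{"url":"%s"}' "$1")" \
    --url 'https://api.deepgram.com/v1/listen'
}
```

With the key exported, running transcribe_remote 'https://static.deepgram.com/examples/interview_speech-analytics.wav' | jq . should send the same request as the command above and pretty-print the response.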
Analyze Response
When the file is finished processing (often after only a few seconds), you’ll receive a JSON response:
{
  "metadata": {
    "transaction_key": "iqIt",
    "request_id": "m9IBwpEGuLOd5fDVUcwRoojQeVDIc4wU",
    "sha256": "6b198da276e1108a87e15674ba5e68f4893f85aa584ea96c2b0b5fe32e756bd9",
    "created": "2020-05-01T18:19:17.153Z",
    "duration": 2705.3577,
    "channels": 1
  },
  "results": {
    "channels": [
      {
        "alternatives": [
          {
            "transcript": "hey natalie just joined",
            "confidence": 0.87023026,
            "words": [
              {
                "word": "hey",
                "start": 35.61904,
                "end": 35.77853,
                "confidence": 0.54808563
              },
              {
                "word": "natalie",
                "start": 35.77853,
                "end": 36.27853,
                "confidence": 0.41259128
              },
              ...
            ]
          }
        ]
      }
    ]
  }
}
In this default response, we see:
- transcript: the transcript for the audio segment being processed.
- confidence: a floating-point value between 0 and 1 that indicates overall transcript reliability. Larger values indicate higher confidence.
- words: an array of objects, one for each word in the transcript, each with its start time and end time (in seconds) from the beginning of the audio stream, and a confidence value.
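Following the jq recommendation earlier in this guide, the fields described above can be pulled out of a saved response with short jq filters. This is a minimal sketch rather than an official Deepgram recipe: the function names dg_transcript and dg_words are hypothetical, and it assumes the response was saved to a file (for example, with curl's --output option) and has the default shape shown above, with a single channel and a single alternative.

```shell
# Hypothetical helpers for reading a saved Deepgram response (requires jq).
# Both read only the first alternative of the first channel.

# Print the transcript text.
dg_transcript() {
  jq -r '.results.channels[0].alternatives[0].transcript' "$1"
}

# Print one tab-separated line per word: start, end, word, confidence.
dg_words() {
  jq -r '.results.channels[0].alternatives[0].words[]
         | [.start, .end, .word, .confidence] | @tsv' "$1"
}
```

For example, dg_transcript response.json prints only the transcript text, and dg_words response.json prints one line per word with its timing and confidence.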
By default, Deepgram applies its general AI model, which is a good, general-purpose model for everyday situations.
Customize Transcripts
To customize transcripts, you can add a variety of parameters to the query string.
For example, if you would like to use the phonecall AI model rather than the general AI model, you can add ?model=phonecall to the URL in the previous examples:
--url 'https://api.deepgram.com/v1/listen?model=phonecall'
Similarly, if you'd like to get a transcript with punctuation and capitalization, you can add ?punctuate=true to the URL in the previous examples:
--url 'https://api.deepgram.com/v1/listen?punctuate=true'
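Because these are ordinary query-string parameters, several can be combined in one request by joining them with &. The sketch below illustrates how such a URL is assembled; the helper name listen_url is hypothetical, and it assumes each key=value pair is already URL-safe (no spaces or reserved characters).

```shell
# Hypothetical helper: build a /v1/listen URL from zero or more key=value pairs.
# The first pair is prefixed with '?', every later pair with '&'.
# Values are assumed to be URL-safe already; no encoding is performed.
listen_url() {
  url='https://api.deepgram.com/v1/listen'
  sep='?'
  for param in "$@"; do
    url="${url}${sep}${param}"
    sep='&'
  done
  printf '%s\n' "$url"
}

# Prints: https://api.deepgram.com/v1/listen?model=phonecall&punctuate=true
listen_url model=phonecall punctuate=true
```

The resulting URL can be passed to --url in either of the earlier curl commands. Keep it in single or double quotes, since an unquoted & would be interpreted by the shell.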
To learn more about the customization possible with Deepgram's API, check out the Deepgram API Reference.