Pre-Recorded Audio Transcription
An overview of the Deepgram Python SDK and Deepgram pre-recorded speech-to-text.
The Deepgram Pre-Recorded Clients allow you to request transcripts for pre-recorded audio. To request a transcript for a particular audio file, use one of the following functions, depending on your audio source: transcribe_url for audio hosted at a URL, or transcribe_file for audio read into a buffer.
This SDK supports both the Threaded and Async/Await Clients, as described in the Threaded and Async IO Task Support section. Where code examples appear in pairs, the first uses the Threaded client (rest) and the second uses the Async client (asyncrest). The difference between Threaded and Async is subtle: the calls take the same arguments, but the Async calls must be awaited.
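To make the Threaded versus Async distinction concrete, here is a minimal sketch that does not use the SDK at all. The functions `transcribe_blocking` and `transcribe_async` are hypothetical stand-ins for the two client styles; only the calling pattern is the point.

```python
import asyncio
import time
from typing import List


def transcribe_blocking(url: str) -> str:
    """Hypothetical stand-in for a Threaded client call: blocks until done."""
    time.sleep(0.01)  # simulates waiting on the network
    return f"transcript of {url}"


async def transcribe_async(url: str) -> str:
    """Hypothetical stand-in for an Async client call: must be awaited."""
    await asyncio.sleep(0.01)  # yields to the event loop while waiting
    return f"transcript of {url}"


async def transcribe_many(urls: List[str]) -> List[str]:
    # Awaiting several coroutines together lets their waits overlap.
    return await asyncio.gather(*(transcribe_async(u) for u in urls))


if __name__ == "__main__":
    print(transcribe_blocking("https://example.com/a.wav"))
    print(asyncio.run(transcribe_many(["https://example.com/a.wav",
                                       "https://example.com/b.wav"])))
```

The blocking style can be called from ordinary code, while the awaited style has to run inside a coroutine started with something like `asyncio.run`.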
Pre-recorded Transcription Parameters
Parameter | Type | Description |
---|---|---|
source | Buffer, Url | Provides the source of audio to transcribe |
options | Object | Parameters to filter requests. See below. |
You can pass a Buffer or URL to a file to transcribe. Here's how to construct each:
Sending a URL
source = {'url': URL_TO_AUDIO_FILE}
Sending a Buffer
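Open a file in binary mode, read its contents, and send the resulting buffer.
# Threaded: read the file contents into a buffer
with open(PATH_TO_FILE, 'rb') as audio:
    source = {'buffer': audio.read()}

# Async: the same using aiofiles, inside a coroutine
async with aiofiles.open(filename, "rb") as audio:
    source = {'buffer': await audio.read()}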
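The two source shapes can be produced by one small helper. This is a sketch only: `build_source` is a hypothetical name, not part of the SDK, and it assumes that anything starting with `http://` or `https://` is a remote file and anything else is a local path.

```python
from pathlib import Path
from typing import Dict, Union


def build_source(location: str) -> Dict[str, Union[str, bytes]]:
    """Return a URL source or a buffer source for the given location.

    Hypothetical helper for illustration; not provided by the SDK.
    """
    if location.startswith(("http://", "https://")):
        return {"url": location}
    # Local path: read the whole file into memory as raw bytes.
    return {"buffer": Path(location).read_bytes()}
```

Reading the whole file into memory is simple but may be unsuitable for very large recordings.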
Pre-recorded Transcription Options
Additional transcription options can be provided for pre-recorded transcriptions. They are passed as the second argument to the transcription function, for example as a PrerecordedOptions object. Each of these parameters maps to a feature in the Deepgram API. Reference the features documentation to learn the appropriate features for your request.
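To illustrate how options map to API features, the sketch below encodes a plain dictionary of options as query parameters on the pre-recorded endpoint `https://api.deepgram.com/v1/listen`. The helper `build_listen_url` is hypothetical and is not how the SDK is implemented internally; it only shows the shape of the resulting request URL.

```python
from urllib.parse import urlencode

LISTEN_ENDPOINT = "https://api.deepgram.com/v1/listen"


def build_listen_url(options: dict) -> str:
    """Encode options as query parameters (illustrative helper, not SDK code)."""
    params = {}
    for name, value in options.items():
        if value is None:
            continue  # unset options are left out of the request
        if isinstance(value, bool):
            value = str(value).lower()  # True -> "true", False -> "false"
        params[name] = value
    if not params:
        return LISTEN_ENDPOINT
    return f"{LISTEN_ENDPOINT}?{urlencode(params)}"


print(build_listen_url({"model": "nova-2", "smart_format": True, "summarize": "v2"}))
# prints https://api.deepgram.com/v1/listen?model=nova-2&smart_format=true&summarize=v2
```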
Pre-recorded Transcription Example Request
With the source you chose above, call the transcription function and provide any additional options as an object.
# Threaded
from deepgram import DeepgramClient, PrerecordedOptions

# AUDIO_URL is the URL source built above, e.g. {'url': URL_TO_AUDIO_FILE}
try:
    # STEP 1 Create a Deepgram client using the DEEPGRAM_API_KEY from environment variables
    deepgram = DeepgramClient()
    # STEP 2 Call the transcribe_url method on the prerecorded class
    options = PrerecordedOptions(
        model="nova-2",
        smart_format=True,
        summarize="v2",
    )
    url_response = deepgram.listen.rest.v("1").transcribe_url(
        AUDIO_URL, options
    )
    print(url_response)
except Exception as e:
    print(f"Exception: {e}")
# Async
import asyncio
from deepgram import DeepgramClient, PrerecordedOptions

async def main():
    try:
        # STEP 1 Create a Deepgram client using the DEEPGRAM_API_KEY from environment variables
        deepgram = DeepgramClient()
        # STEP 2 Call the transcribe_url method on the async prerecorded class
        options = PrerecordedOptions(
            model="nova-2",
            smart_format=True,
            summarize="v2",
        )
        url_response = await deepgram.listen.asyncrest.v("1").transcribe_url(
            AUDIO_URL, options
        )
        print(url_response)
    except Exception as e:
        print(f"Exception: {e}")

asyncio.run(main())
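Printing the whole response is rarely the end goal, so here is a sketch of pulling out just the transcript. It assumes the response has already been converted to a plain dictionary (for example by parsing its JSON form) and that it follows the Deepgram API layout of `results`, then `channels`, then `alternatives`, then `transcript`. The helper `first_transcript` and the sample data are hypothetical.

```python
from typing import Any, Dict


def first_transcript(response: Dict[str, Any]) -> str:
    """Return the top alternative's transcript for the first channel, or ''."""
    channels = response.get("results", {}).get("channels", [])
    if not channels:
        return ""
    alternatives = channels[0].get("alternatives", [])
    if not alternatives:
        return ""
    return alternatives[0].get("transcript", "")


# Hypothetical, heavily trimmed response used only for illustration.
sample = {
    "results": {
        "channels": [
            {"alternatives": [{"transcript": "hello world", "confidence": 0.99}]}
        ]
    }
}
print(first_transcript(sample))  # prints: hello world
```

Returning an empty string for a missing channel or alternative keeps callers from having to handle `KeyError` or `IndexError` themselves.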
Increasing the Timeout for Processing Larger Files
You might need to increase the default HTTP timeout for larger files. The example below raises the timeout to 300 seconds (5 minutes) while keeping the connection timeout at 10 seconds.
import httpx

# Threaded: this will increase the timeout to 300 seconds or 5 minutes
response = deepgram.listen.rest.v("1").transcribe_file(
    payload, options, timeout=httpx.Timeout(300.0, connect=10.0)
)

# Async: this will increase the timeout to 300 seconds or 5 minutes (call inside a coroutine)
response = await deepgram.listen.asyncrest.v("1").transcribe_file(
    payload, options, timeout=httpx.Timeout(300.0, connect=10.0)
)
Where To Find Additional Examples
The SDK repository has a good collection of pre-recorded audio transcription examples, and the README contains links to them. Each example below shows a different way of transcribing an audio source.
Some Examples
- Threaded Client using an Audio File - examples/speech-to-text/rest/file
- Threaded Client from a URL - examples/speech-to-text/rest/url
If the Async Client suits your use case better:
- Async Client from a URL - examples/speech-to-text/rest/async_url