Pre-Recorded Audio

Transcribe audio using Deepgram’s speech-to-text REST API

Headers

Authorization (string, required)

Header authentication of the form Token <token>
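
As a minimal sketch of this header in use (Python with the requests library; the endpoint URL, the hosted-audio body, and the DEEPGRAM_API_KEY environment variable are assumptions for illustration, not part of this reference):

    import os
    import requests

    # Assumed endpoint for pre-recorded transcription.
    DEEPGRAM_URL = "https://api.deepgram.com/v1/listen"

    # Header authentication of the form "Token <token>".
    headers = {"Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}"}

    response = requests.post(
        DEEPGRAM_URL,
        headers=headers,
        json={"url": "https://example.com/audio.wav"},  # placeholder hosted audio
    )
    response.raise_for_status()
    print(response.json())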

Query parameters

callback (string, optional)
URL to which we'll make the callback request
callback_method (enum, optional, default: POST)
HTTP method by which the callback request will be made
Allowed values: POST, PUT
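
A sketch of submitting these two parameters together; the callback receiver URL is a placeholder, and the endpoint and auth setup follow the assumptions used above:

    import os
    import requests

    params = {
        "callback": "https://example.com/deepgram-webhook",  # placeholder receiver
        "callback_method": "PUT",  # override the POST default
    }
    requests.post(
        "https://api.deepgram.com/v1/listen",  # assumed endpoint
        headers={"Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}"},
        params=params,
        json={"url": "https://example.com/audio.wav"},  # placeholder hosted audio
    )
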
custom_topic (string, optional)
Custom topics you want the model to detect within your input audio or text, if present. Submit up to 100
custom_topic_mode (enum, optional, default: extended)

Sets how the model will interpret strings submitted to the custom_topic param. When strict, the model will only return topics submitted using the custom_topic param. When extended, the model will return its own detected topics in addition to those submitted using the custom_topic param

Allowed values: extended, strict
custom_intent (string, optional)
Custom intents you want the model to detect within your input audio, if present
custom_intent_mode (enum, optional, default: extended)

Sets how the model will interpret intents submitted to the custom_intent param. When strict, the model will only return intents submitted using the custom_intent param. When extended, the model will return its own detected intents in addition to those submitted using the custom_intent param

Allowed values: extended, strict
detect_entities (boolean, optional, default: false)
Identifies and extracts key entities from content in submitted audio
detect_language (boolean, optional, default: false)
Identifies the dominant language spoken in submitted audio
diarize (boolean, optional, default: false)
Recognize speaker changes. Each word in the transcript will be assigned a speaker number starting at 0
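
A sketch of enabling diarization and reading the per-word speaker numbers; the nested results.channels[...].alternatives[...].words path is an assumption about the response shape, since this page only lists the top-level metadata and results objects:

    import os
    import requests

    resp = requests.post(
        "https://api.deepgram.com/v1/listen",  # assumed endpoint
        headers={"Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}"},
        params={"diarize": "true", "punctuate": "true"},
        json={"url": "https://example.com/two-speaker-call.wav"},  # placeholder
    )
    resp.raise_for_status()

    words = resp.json()["results"]["channels"][0]["alternatives"][0]["words"]
    for w in words:
        # With diarize=true each word carries a speaker number starting at 0.
        print(w.get("speaker"), w["word"])
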
dictation (boolean, optional, default: false)
Converts spoken dictation commands (e.g., "comma", "period") into their corresponding punctuation marks
encoding (enum, optional)
Specify the expected encoding of your submitted audio
extra (string, optional)

Arbitrary key-value pairs that are attached to the API response for usage in downstream processing

filler_words (boolean, optional, default: false)
Filler Words can help transcribe interruptions in your audio, like "uh" and "um"
intents (boolean, optional, default: false)
Recognizes speaker intent throughout a transcript or text
keyterm (string, optional)

Key term prompting can boost or suppress specialized terminology and brands. Only compatible with Nova-3
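
A sketch of boosting a few key terms on a Nova-3 request; it assumes multiple terms are submitted by repeating the keyterm query key (which requests supports via a list of tuples), and the terms themselves are placeholders:

    import os
    import requests

    # Repeating the keyterm key submits several terms in one request.
    params = [
        ("model", "nova-3"),
        ("keyterm", "Deepgram"),
        ("keyterm", "epinephrine"),
    ]
    requests.post(
        "https://api.deepgram.com/v1/listen",  # assumed endpoint
        headers={"Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}"},
        params=params,
        json={"url": "https://example.com/clinical-note.wav"},  # placeholder
    )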

keywords (string, optional)
Keywords can boost or suppress specialized terminology and brands
language (enum, optional, default: en)

The BCP-47 language tag that hints at the primary spoken language. Depending on the model and API endpoint you choose, only certain languages are available

measurements (boolean, optional, default: false)
Spoken measurements will be converted to their corresponding abbreviations
mip_opt_out (boolean, optional, default: false)

Opts requests out of the Deepgram Model Improvement Program. Refer to our Docs for pricing impacts before setting this to true: https://dpgr.am/deepgram-mip

model (string or enum, optional)
AI model used to process submitted audio
multichannel (boolean, optional, default: false)
Transcribe each audio channel independently
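
A sketch of transcribing a stereo file channel by channel; the per-channel path under results is an assumption about the response shape:

    import os
    import requests

    resp = requests.post(
        "https://api.deepgram.com/v1/listen",  # assumed endpoint
        headers={"Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}"},
        params={"multichannel": "true"},
        json={"url": "https://example.com/stereo-call.wav"},  # placeholder
    )
    resp.raise_for_status()

    # Each channel is transcribed independently and returned separately.
    for i, channel in enumerate(resp.json()["results"]["channels"]):
        print(f"channel {i}:", channel["alternatives"][0]["transcript"])
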
numerals (boolean, optional, default: false)
Converts numbers from written format to numerical format
paragraphs (boolean, optional, default: false)
Splits audio into paragraphs to improve transcript readability
profanity_filter (boolean, optional, default: false)

Profanity Filter looks for recognized profanity and converts it to the nearest recognized non-profane word or removes it from the transcript completely

punctuate (boolean, optional, default: false)
Add punctuation and capitalization to the transcript
redact (string, optional)
Redaction removes sensitive information from your transcripts
replace (string, optional)
Search for terms or phrases in submitted audio and replace them
search (string, optional)
Search for terms or phrases in submitted audio
sentiment (boolean, optional, default: false)
Recognizes the sentiment throughout a transcript or text
smart_format (boolean, optional, default: false)
Apply formatting to transcript output. When set to true, additional formatting will be applied to transcripts to improve readability
summarize (enum, optional)
Summarize content. For the Listen API, supports a string version option. For the Read API, accepts a boolean only.
Allowed values:
tag (string, optional)
Label your requests for the purpose of identification during usage reporting
topics (boolean, optional, default: false)
Detect topics throughout a transcript or text
utterances (boolean, optional, default: false)
Segments speech into meaningful semantic units
utt_split (double, optional, default: 0.8)
Seconds to wait before detecting a pause between words in submitted audio
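
A sketch of requesting utterance segmentation with a longer pause threshold; the results.utterances array and its start/end/transcript fields are assumptions about the response shape:

    import os
    import requests

    resp = requests.post(
        "https://api.deepgram.com/v1/listen",  # assumed endpoint
        headers={"Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}"},
        params={"utterances": "true", "utt_split": "1.2"},  # wait 1.2 s instead of the 0.8 s default
        json={"url": "https://example.com/interview.wav"},  # placeholder
    )
    resp.raise_for_status()

    for u in resp.json()["results"]["utterances"]:
        print(f'{u["start"]:.2f}-{u["end"]:.2f}: {u["transcript"]}')
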
version (string or enum, optional)
Version of an AI model to use

Request

This endpoint expects an object or a string.
object (required)
OR
string (required, format: binary)
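
Two minimal sketches of the body forms above: a JSON object pointing at hosted audio, and raw binary audio read from disk. The url field name and the Content-Type values are assumptions consistent with typical usage, not guaranteed by this page:

    import os
    import requests

    url = "https://api.deepgram.com/v1/listen"  # assumed endpoint
    headers = {"Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}"}

    # Object form: JSON body referencing audio hosted at a URL.
    requests.post(url, headers=headers,
                  json={"url": "https://example.com/audio.wav"})  # placeholder

    # Binary string form: raw audio bytes as the request body.
    with open("audio.wav", "rb") as f:
        requests.post(url, headers={**headers, "Content-Type": "audio/wav"}, data=f)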

Response

Successful transcription
metadata (object)
results (object)
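
A sketch of reading the two top-level objects from a successful response; the request_id field and the nested transcript path are assumptions, since this page only names metadata and results:

    import os
    import requests

    resp = requests.post(
        "https://api.deepgram.com/v1/listen",  # assumed endpoint
        headers={"Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}"},
        params={"punctuate": "true"},
        json={"url": "https://example.com/audio.wav"},  # placeholder
    )
    resp.raise_for_status()
    body = resp.json()

    print(body["metadata"].get("request_id"))  # assumed metadata field
    print(body["results"]["channels"][0]["alternatives"][0]["transcript"])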

Errors