Pre-Recorded Audio


Transcribe audio using Deepgram’s speech-to-text API

Query parameters


URL to which we’ll make the callback request

callback_method enum Optional Defaults to POST

HTTP method by which the callback request will be made

Allowed values: POST, PUT
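As a rough illustration of how the callback options might be passed, the sketch below assembles a request URL with the two values as query parameters. The base URL, the parameter name `callback`, and the helper `build_listen_url` are assumptions made for this example; only `callback_method` and its allowed values come from the reference above.

```python
from urllib.parse import urlencode

# Assumed endpoint; this section does not state the request path.
BASE_URL = "https://api.deepgram.com/v1/listen"

# Allowed values for callback_method, as listed above.
ALLOWED_CALLBACK_METHODS = {"POST", "PUT"}


def build_listen_url(callback_url, callback_method="POST"):
    """Return the request URL with callback options in the query string.

    `callback` is an assumed parameter name for the callback URL.
    """
    if callback_method not in ALLOWED_CALLBACK_METHODS:
        raise ValueError(f"callback_method must be one of {sorted(ALLOWED_CALLBACK_METHODS)}")
    query = urlencode({"callback": callback_url, "callback_method": callback_method})
    return f"{BASE_URL}?{query}"


url = build_listen_url("https://example.com/hook", "PUT")
print(url)
```

Validating the method on the client side simply avoids a round trip for a value the reference says is not accepted.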

Custom topics you want the model to detect within your input audio or text, if present. Submit up to 100

custom_topic_mode enum Optional Defaults to extended

Sets how the model will interpret strings submitted to the custom_topic param. When strict, the model will only return topics submitted using the custom_topic param. When extended, the model will return its own detected topics in addition to those submitted using the custom_topic param

Allowed values: extended, strict
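A minimal sketch of how custom topics and the mode could be encoded together. It assumes that multiple topics are sent by repeating the `custom_topic` key once per topic; the helper name `custom_topic_query` is hypothetical. The 100-topic limit and the two mode values are taken from the descriptions above.

```python
from urllib.parse import urlencode

MAX_CUSTOM_TOPICS = 100               # "Submit up to 100"
ALLOWED_MODES = {"extended", "strict"}


def custom_topic_query(topics, mode="extended"):
    """Encode custom topics and the interpretation mode as a query string.

    Assumption: each topic is sent as its own repeated `custom_topic` key.
    """
    if len(topics) > MAX_CUSTOM_TOPICS:
        raise ValueError(f"at most {MAX_CUSTOM_TOPICS} custom topics are allowed")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")
    pairs = [("custom_topic", topic) for topic in topics]
    pairs.append(("custom_topic_mode", mode))
    return urlencode(pairs)


query = custom_topic_query(["billing", "refund policy"], mode="strict")
print(query)
```

With `strict`, only the submitted topics would be returned; with `extended`, the model's own detected topics would be returned as well.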

Custom intents you want the model to detect within your input audio if present

custom_intent_mode enum Optional Defaults to extended

Sets how the model will interpret intents submitted to the custom_intent param. When strict, the model will only return intents submitted using the custom_intent param. When extended, the model will return its own detected intents in addition to those submitted using the custom_intent param

Allowed values: extended, strict

detect_entities boolean Optional Defaults to false

Identifies and extracts key entities from content in submitted audio

detect_language boolean Optional Defaults to false

Identifies the dominant language spoken in submitted audio

diarize boolean Optional Defaults to false

Recognize speaker changes. Each word in the transcript will be assigned a speaker number starting at 0
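To show how per-word speaker numbers might be used downstream, the sketch below collapses consecutive words from the same speaker into turns. The word records are made-up sample data, and the field names `word` and `speaker` are assumptions about the response shape, which this section does not describe.

```python
from itertools import groupby

# Hypothetical diarized words; speaker numbers start at 0.
words = [
    {"word": "hello", "speaker": 0},
    {"word": "there", "speaker": 0},
    {"word": "hi", "speaker": 1},
    {"word": "welcome", "speaker": 0},
]


def speaker_turns(words):
    """Group consecutive words spoken by the same speaker into (speaker, text) turns."""
    turns = []
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        turns.append((speaker, " ".join(w["word"] for w in group)))
    return turns


turns = speaker_turns(words)
for speaker, text in turns:
    print(f"Speaker {speaker}: {text}")
```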

dictation boolean Optional Defaults to false

Converts spoken dictation commands (for example, "comma" or "period") into their corresponding punctuation marks


Specify the expected encoding of your submitted audio


Arbitrary key-value pairs that are attached to the API response for usage in downstream processing

filler_words boolean Optional Defaults to false

Filler Words can help transcribe interruptions in your audio, like “uh” and “um”

intents boolean Optional Defaults to false

Recognizes speaker intent throughout a transcript or text


Key term prompting can boost or suppress specialized terminology and brands. Only compatible with Nova-3


Keywords can boost or suppress specialized terminology and brands

language enum Optional Defaults to en

The BCP-47 language tag that hints at the primary spoken language. Depending on the model and API endpoint you choose, only certain languages are available

measurements boolean Optional Defaults to false

Spoken measurements will be converted to their corresponding abbreviations


AI model used to process submitted audio

multichannel boolean Optional Defaults to false

Transcribe each audio channel independently

numerals boolean Optional Defaults to false

Numerals converts numbers from written format to numerical format

paragraphs boolean Optional Defaults to false

Splits audio into paragraphs to improve transcript readability

profanity_filter boolean Optional Defaults to false

Profanity Filter looks for recognized profanity and converts it to the nearest recognized non-profane word or removes it from the transcript completely

punctuate boolean Optional Defaults to false

Add punctuation and capitalization to the transcript


Redaction removes sensitive information from your transcripts


Searches for terms or phrases in submitted audio and replaces them


Search for terms or phrases in submitted audio

sentiment boolean Optional Defaults to false

Recognizes the sentiment throughout a transcript or text

smart_format boolean Optional Defaults to false

Apply formatting to transcript output. When set to true, additional formatting will be applied to transcripts to improve readability


Summarize content. For Listen API, supports string version option. For Read API, accepts boolean only.

Allowed values: v2, v1

Label your requests for the purpose of identification during usage reporting

topics boolean Optional Defaults to false

Detect topics throughout a transcript or text

utterances boolean Optional Defaults to false

Segments speech into meaningful semantic units

utt_split double Optional Defaults to 0.8

Seconds to wait before detecting a pause between words in submitted audio
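The effect of a pause threshold can be illustrated with a small client-side sketch: start a new utterance whenever the silence between two consecutive words reaches `utt_split` seconds. This is not the service's actual segmentation logic. The word records, the `start` and `end` field names, and the choice of `>=` for the comparison are all assumptions.

```python
def split_utterances(words, utt_split=0.8):
    """Split timed words into utterances at pauses of at least `utt_split` seconds."""
    utterances = []
    current = []
    for word in words:
        # Gap between the end of the previous word and the start of this one.
        if current and word["start"] - current[-1]["end"] >= utt_split:
            utterances.append(current)
            current = []
        current.append(word)
    if current:
        utterances.append(current)
    return utterances


# Hypothetical timed words (times in seconds).
timed_words = [
    {"word": "so", "start": 0.0, "end": 0.4},
    {"word": "anyway", "start": 0.5, "end": 1.0},
    {"word": "next", "start": 2.0, "end": 2.5},
]

result = [[w["word"] for w in utt] for utt in split_utterances(timed_words)]
print(result)
```

Raising the threshold merges more words into each utterance; lowering it splits on shorter pauses.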


Version of an AI model to use

Allowed values: latest


This endpoint expects an object or a string.

Listen Request URL: object
Listen Request File: string
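The two request shapes could be prepared roughly as follows. This sketch only constructs the requests and does not send them. The endpoint URL, the `url` field in the JSON object, the `Token` authorization scheme, and the content types are assumptions not stated in this section; check them against the full API reference before relying on them.

```python
import json
from urllib.request import Request

# Assumed endpoint; this section does not state the request path.
LISTEN_URL = "https://api.deepgram.com/v1/listen"


def request_from_url(audio_url, api_key):
    """Build a request whose body is a JSON object pointing at hosted audio."""
    body = json.dumps({"url": audio_url}).encode("utf-8")  # `url` field is assumed
    headers = {
        "Authorization": f"Token {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    return Request(LISTEN_URL, data=body, headers=headers, method="POST")


def request_from_file(audio_bytes, api_key, content_type="audio/wav"):
    """Build a request whose body is the raw audio itself."""
    headers = {
        "Authorization": f"Token {api_key}",  # assumed auth scheme
        "Content-Type": content_type,
    }
    return Request(LISTEN_URL, data=audio_bytes, headers=headers, method="POST")


url_req = request_from_url("https://example.com/audio.wav", "YOUR_API_KEY")
file_req = request_from_file(b"RIFF....", "YOUR_API_KEY")
print(url_req.get_method(), url_req.full_url)
```

Either request could then be passed to `urllib.request.urlopen` once a real API key and audio source are supplied.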


Successful transcription


