Pre-Recorded Audio

POST

Transcribe audio using Deepgram’s speech-to-text API

Query parameters

callback string Optional

URL to which we’ll make the callback request

callback_method enum Optional Defaults to POST

HTTP method by which the callback request will be made

Allowed values: POST, PUT

custom_topic string Optional

Custom topics you want the model to detect within your input audio or text, if present. Submit up to 100.

custom_topic_mode enum Optional Defaults to extended

Sets how the model will interpret strings submitted to the custom_topic param. When strict, the model will only return topics submitted using the custom_topic param. When extended, the model will return its own detected topics in addition to those submitted using the custom_topic param

Allowed values: extended, strict

custom_intent string Optional

Custom intents you want the model to detect within your input audio if present

custom_intent_mode enum Optional Defaults to extended

Sets how the model will interpret intents submitted to the custom_intent param. When strict, the model will only return intents submitted using the custom_intent param. When extended, the model will return its own detected intents in addition to those submitted using the custom_intent param

Allowed values: extended, strict

detect_entities boolean Optional Defaults to false

Identifies and extracts key entities from content in submitted audio

detect_language boolean Optional Defaults to false

Identifies the dominant language spoken in submitted audio

diarize_version string Optional Defaults to v2

Version of the diarization feature to use. Only used when the diarization feature is enabled (diarize=true is passed to the API)

diarize boolean Optional Defaults to false

Recognize speaker changes. Each word in the transcript will be assigned a speaker number starting at 0
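
As a rough illustration of how per-word speaker numbers might be consumed, the sketch below collapses consecutive words from the same speaker into turns. The word objects and their "word" and "speaker" field names are assumptions made for this example, and speaker_turns is a hypothetical helper, not part of any SDK.

```python
from itertools import groupby

def speaker_turns(words):
    """Collapse consecutive words with the same speaker into (speaker, text) turns."""
    return [
        (speaker, " ".join(w["word"] for w in group))
        for speaker, group in groupby(words, key=lambda w: w["speaker"])
    ]

# Hand-written sample data; field names are assumed, speaker numbers start at 0.
words = [
    {"word": "hi", "speaker": 0},
    {"word": "there", "speaker": 0},
    {"word": "hello", "speaker": 1},
    {"word": "bye", "speaker": 0},
]
turns = speaker_turns(words)
print(turns)  # [(0, 'hi there'), (1, 'hello'), (0, 'bye')]
```

Grouping only consecutive words keeps the turns in conversational order, so a speaker who talks twice appears twice.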

dictation boolean Optional Defaults to false

Converts spoken dictation commands into their corresponding punctuation marks

encoding enum Optional

Specify the expected encoding of your submitted audio

extra string Optional

Arbitrary key-value pairs that are attached to the API response for usage in downstream processing

filler_words boolean Optional Defaults to false

Filler Words can help transcribe interruptions in your audio, like “uh” and “um”

intents boolean Optional Defaults to false

Recognizes speaker intent throughout a transcript or text

keyterm string Optional

Key term prompting can boost or suppress specialized terminology and brands. Only compatible with Nova-3

keywords string Optional

Keywords can boost or suppress specialized terminology and brands

language enum Optional Defaults to en

The BCP-47 language tag that hints at the primary spoken language. Depending on the model and API endpoint you choose, only certain languages are available

measurements boolean Optional Defaults to false

Spoken measurements will be converted to their corresponding abbreviations

model enum Optional

AI model used to process submitted audio

multichannel boolean Optional Defaults to false

Transcribe each audio channel independently

numerals boolean Optional Defaults to false

Numerals converts numbers from written format to numerical format

paragraphs boolean Optional Defaults to false

Splits audio into paragraphs to improve transcript readability

profanity_filter boolean Optional Defaults to false

Profanity Filter looks for recognized profanity and converts it to the nearest recognized non-profane word or removes it from the transcript completely

punctuate boolean Optional Defaults to false

Add punctuation and capitalization to the transcript

redact string Optional

Redaction removes sensitive information from your transcripts

replace string Optional

Searches for terms or phrases in submitted audio and replaces them

search string Optional

Search for terms or phrases in submitted audio

sentiment boolean Optional Defaults to false

Recognizes the sentiment throughout a transcript or text

smart_format boolean Optional Defaults to false

Apply formatting to transcript output. When set to true, additional formatting will be applied to transcripts to improve readability

summarize enum Optional

Summarize content. For Listen API, supports string version option. For Read API, accepts boolean only.

Allowed values: v2, v1

tag string Optional

Label your requests for the purpose of identification during usage reporting

topics boolean Optional Defaults to false

Detect topics throughout a transcript or text

utterances boolean Optional Defaults to false

Segments speech into meaningful semantic units

utt_split double Optional Defaults to 0.8

Seconds to wait before detecting a pause between words in submitted audio
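
To make the pause threshold concrete, here is a minimal sketch of splitting timed words wherever the silence between them reaches utt_split seconds. It only illustrates the idea and is not Deepgram's implementation; split_on_pauses and the (text, start, end) tuple layout are hypothetical.

```python
def split_on_pauses(words, utt_split=0.8):
    """Group (text, start, end) tuples; a gap of at least utt_split seconds starts a new group."""
    utterances = []
    for text, start, end in words:
        if utterances and start - utterances[-1]["end"] < utt_split:
            # Gap is shorter than the threshold: extend the current utterance.
            utterances[-1]["text"] += " " + text
            utterances[-1]["end"] = end
        else:
            # First word, or a long enough pause: open a new utterance.
            utterances.append({"text": text, "start": start, "end": end})
    return utterances

# Hand-written timings in seconds.
words = [("good", 0.0, 0.3), ("morning", 0.4, 0.9), ("everyone", 2.5, 3.0)]
utterances = split_on_pauses(words)
print([u["text"] for u in utterances])  # ['good morning', 'everyone']
```

With a larger threshold, such as utt_split=2.0, the same input stays in a single utterance.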

version enum Optional

Version of an AI model to use

Allowed values: latest
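
The query parameters above can be combined into a single request URL. The sketch below shows one way to encode them with the Python standard library. The base URL, the "nova-3" model value, and the convention of repeating a parameter name to pass several values (as for custom_topic) are assumptions to check against your account's documentation; build_listen_url is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed base URL for the pre-recorded endpoint.
BASE_URL = "https://api.deepgram.com/v1/listen"

def build_listen_url(params, base_url=BASE_URL):
    """Return base_url with params encoded as a query string.

    Booleans become "true"/"false"; list values repeat the parameter name.
    """
    pairs = []
    for name, value in params.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            if isinstance(item, bool):
                item = "true" if item else "false"
            pairs.append((name, str(item)))
    return f"{base_url}?{urlencode(pairs)}" if pairs else base_url

url = build_listen_url({
    "model": "nova-3",               # assumed model identifier
    "smart_format": True,
    "diarize": True,
    "custom_topic": ["billing", "refunds"],
    "custom_topic_mode": "strict",
    "utt_split": 0.8,
})
print(url)
```

Lowercasing booleans explicitly avoids sending Python's "True" and "False", which an API expecting true or false may reject.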

Request

This endpoint expects an object or a string.
Listen Request URL object
OR
Listen Request File string
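
The two body shapes can be sketched as follows: a JSON object that points at hosted audio, or the audio data itself. The example only builds the request and does not send it. The endpoint URL, the "Token" authorization scheme, the "url" key in the JSON body, and the audio/wav content type are assumptions not stated on this page; build_request is a hypothetical helper.

```python
import json
import urllib.request

LISTEN_URL = "https://api.deepgram.com/v1/listen"  # assumed endpoint

def build_request(api_key, source, content_type="audio/wav"):
    """Build (but do not send) a POST request for either body shape.

    str   -> JSON object {"url": ...} pointing at hosted audio
    bytes -> raw audio sent as the request body
    """
    headers = {"Authorization": f"Token {api_key}"}  # assumed auth scheme
    if isinstance(source, str):
        body = json.dumps({"url": source}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    elif isinstance(source, (bytes, bytearray)):
        body = bytes(source)
        headers["Content-Type"] = content_type
    else:
        raise TypeError("source must be a URL string or audio bytes")
    return urllib.request.Request(LISTEN_URL, data=body, headers=headers, method="POST")

url_req = build_request("YOUR_API_KEY", "https://example.com/audio.wav")
file_req = build_request("YOUR_API_KEY", b"RIFF....WAVE")  # placeholder bytes, not real audio
```

Passing either request to urllib.request.urlopen would send it. In practice the file bytes would come from reading an audio file in binary mode.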

Response

Successful transcription

metadata object
results object
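
A minimal sketch of reading a transcript out of the response follows. Only the top-level metadata and results objects are listed above; the nested channels, alternatives, and transcript fields, and the request_id field in the sample, are assumptions about the payload shape. first_transcript is a hypothetical helper.

```python
import json

def first_transcript(response_text):
    """Return the first channel's top transcript, or None if absent."""
    payload = json.loads(response_text)
    results = payload.get("results") or {}
    channels = results.get("channels") or []      # assumed field name
    if not channels:
        return None
    alternatives = channels[0].get("alternatives") or []  # assumed field name
    if not alternatives:
        return None
    return alternatives[0].get("transcript")

# Hand-written sample payload, not real API output.
sample = json.dumps({
    "metadata": {"request_id": "example"},
    "results": {"channels": [{"alternatives": [{"transcript": "hello world"}]}]},
})
print(first_transcript(sample))  # hello world
```

Reading each level defensively means a response with missing fields yields None instead of raising a KeyError.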

Errors
