Pre-Recorded Audio | Deepgram's Docs

Transcribe audio using Deepgram’s speech-to-text REST API

Headers

Authorization (string, Required)

Query parameters

callback (string, Optional)

URL to which we'll make the callback request

callback_method (enum, Optional, Defaults to POST)

HTTP method by which the callback request will be made

Allowed values:

custom_topic (string or list of strings, Optional)

Custom topics you want the model to detect within your input audio or text, if present. Submit up to 100.

custom_topic_mode (enum, Optional, Defaults to extended)

Sets how the model will interpret strings submitted to the custom_topic param. When strict, the model will only return topics submitted using the custom_topic param. When extended, the model will return its own detected topics in addition to those submitted using the custom_topic param

Allowed values: extended, strict
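As a rough illustration of how list-valued parameters such as custom_topic might be sent, the sketch below builds a query string with Python's standard library. It assumes that multiple values are submitted as repeated query keys and that topics=true is set alongside the custom topics; neither detail is stated above. The helper name build_listen_url and the topic strings are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(params):
    """Return the endpoint URL with params encoded as a query string.

    List values are expanded into repeated keys (an assumption about
    how multiple values are submitted, not something the docs confirm).
    """
    return f"{BASE_URL}?{urlencode(params, doseq=True)}"


url = build_listen_url({
    "topics": "true",                              # assumed to be needed for topic detection
    "custom_topic": ["billing", "refund policy"],  # hypothetical topics
    "custom_topic_mode": "strict",                 # only return the submitted topics
})
print(url)
```

The resulting URL can be passed to an HTTP client in place of the bare endpoint URL.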

custom_intent (string or list of strings, Optional)

Custom intents you want the model to detect within your input audio if present

custom_intent_mode (enum, Optional, Defaults to extended)

Sets how the model will interpret intents submitted to the custom_intent param. When strict, the model will only return intents submitted using the custom_intent param. When extended, the model will return its own detected intents in addition to those submitted using the custom_intent param

Allowed values: extended, strict

detect_entities (boolean, Optional, Defaults to false)

Identifies and extracts key entities from content in submitted audio

detect_language (boolean or list of strings, Optional)

Identifies the dominant language spoken in submitted audio

diarize (boolean, Optional, Defaults to false)

Recognize speaker changes. Each word in the transcript will be assigned a speaker number starting at 0
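To make the per-word speaker numbers easier to read, consecutive words from the same speaker can be collapsed into turns. The sketch below assumes each word is a dict with "word" and "speaker" keys, as in the utterance words of the sample response on this page; the function name speaker_turns and the sample words are hypothetical.

```python
from itertools import groupby


def speaker_turns(words):
    """Collapse consecutive words from the same speaker into (speaker, text) turns."""
    return [
        (speaker, " ".join(w["word"] for w in group))
        for speaker, group in groupby(words, key=lambda w: w["speaker"])
    ]


# Hypothetical diarized words; speaker numbers start at 0.
words = [
    {"word": "hello", "speaker": 0},
    {"word": "there", "speaker": 0},
    {"word": "hi", "speaker": 1},
    {"word": "thanks", "speaker": 0},
]
turns = speaker_turns(words)
print(turns)
```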

dictation (boolean, Optional, Defaults to false)

Converts spoken dictation commands into their corresponding punctuation marks

encoding (enum, Optional)

Specify the expected encoding of your submitted audio

extra (string or list of strings, Optional)

Arbitrary key-value pairs that are attached to the API response for usage in downstream processing

filler_words (boolean, Optional, Defaults to false)

Filler Words can help transcribe interruptions in your audio, like "uh" and "um"

intents (boolean, Optional, Defaults to false)

Recognizes speaker intent throughout a transcript or text

keyterm (list of strings, Optional)

Key term prompting can boost or suppress specialized terminology and brands. Only compatible with Nova-3

keywords (string or list of strings, Optional)

Keywords can boost or suppress specialized terminology and brands

language (enum, Optional, Defaults to en)

The BCP-47 language tag that hints at the primary spoken language. Depending on the model and API endpoint you choose, only certain languages are available

measurements (boolean, Optional, Defaults to false)

Spoken measurements will be converted to their corresponding abbreviations

mip_opt_out (boolean, Optional, Defaults to false)

Opts out requests from the Deepgram Model Improvement Program. Refer to our Docs for pricing impacts before setting this to true. https://dpgr.am/deepgram-mip

model (enum or string, Optional)

AI model used to process submitted audio

multichannel (boolean, Optional, Defaults to false)

Transcribe each audio channel independently

numerals (boolean, Optional, Defaults to false)

Numerals converts numbers from written format to numerical format

paragraphs (boolean, Optional, Defaults to false)

Splits audio into paragraphs to improve transcript readability

profanity_filter (boolean, Optional, Defaults to false)

Profanity Filter looks for recognized profanity and converts it to the nearest recognized non-profane word or removes it from the transcript completely

punctuate (boolean, Optional, Defaults to false)

Add punctuation and capitalization to the transcript

redact (string or list of enums, Optional)

Redaction removes sensitive information from your transcripts

replace (string or list of strings, Optional)

Searches for terms or phrases in submitted audio and replaces them

search (string or list of strings, Optional)

Search for terms or phrases in submitted audio

sentiment (boolean, Optional, Defaults to false)

Recognizes the sentiment throughout a transcript or text

smart_format (boolean, Optional, Defaults to false)

Apply formatting to transcript output. When set to true, additional formatting will be applied to transcripts to improve readability

summarize (enum or boolean, Optional)

Summarize content. For Listen API, supports string version option. For Read API, accepts boolean only.

tag (string or list of strings, Optional)

Label your requests for the purpose of identification during usage reporting

topics (boolean, Optional, Defaults to false)

Detect topics throughout a transcript or text

utterances (boolean, Optional, Defaults to false)

Segments speech into meaningful semantic units

utt_split (double, Optional, Defaults to 0.8)

Seconds to wait before detecting a pause between words in submitted audio
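The effect of a pause threshold like utt_split can be pictured with a small local sketch. This is only an illustration of the idea, not Deepgram's actual segmentation logic: it assumes words carry "start" and "end" times in seconds (as in the sample response on this page) and that a gap equal to or longer than the threshold starts a new utterance. The function name split_on_pauses and the sample timings are hypothetical.

```python
def split_on_pauses(words, utt_split=0.8):
    """Group words into utterances wherever the silence between two
    consecutive words is at least utt_split seconds (illustrative only)."""
    utterances = []
    current = []
    for w in words:
        # Start a new utterance when the gap since the previous word is long enough.
        if current and w["start"] - current[-1]["end"] >= utt_split:
            utterances.append(current)
            current = []
        current.append(w)
    if current:
        utterances.append(current)
    return [" ".join(w["word"] for w in u) for u in utterances]


# Hypothetical word timings with a 1.1 second pause after "morning".
words = [
    {"word": "good", "start": 0.0, "end": 0.4},
    {"word": "morning", "start": 0.5, "end": 1.0},
    {"word": "how", "start": 2.1, "end": 2.3},
    {"word": "are", "start": 2.4, "end": 2.6},
    {"word": "you", "start": 2.7, "end": 3.0},
]
segments = split_on_pauses(words)
print(segments)
```

Raising the threshold merges more words into a single utterance; lowering it splits on shorter pauses.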

version (enum or string, Optional)

Version of an AI model to use

Request

Transcribe an audio file

object (Required)

string (Required, format: "binary")

Response

Successful transcription

metadata (object)

results (object)

```python
import requests

# Define the URL for the Deepgram API endpoint
url = "https://api.deepgram.com/v1/listen"

# Define the headers for the HTTP request
headers = {
    "Authorization": "Token DEEPGRAM_API_KEY",
    "Content-Type": "audio/*",
}

# Get the audio file
with open("/path/to/youraudio.wav", "rb") as audio_file:
    # Make the HTTP request
    response = requests.post(url, headers=headers, data=audio_file)

print(response.json())
```

```json
{
  "metadata": {
    "request_id": "a847f427-4ad5-4d67-9b95-db801e58251c",
    "sha256": "154e291ecfa8be6ab8343560bcc109008fa7853eb5372533e8efdefc9b504c33",
    "created": "2024-05-12T18:57:13.426Z",
    "duration": 25.933313,
    "channels": 1,
    "models": [
      "30089e05-99d1-4376-b32e-c263170674af"
    ],
    "model_info": {},
    "summary_info": {
      "model_uuid": "67875a7f-c9c4-48a0-aa55-5bdb8a91c34a",
      "input_tokens": 95,
      "output_tokens": 63
    },
    "sentiment_info": {
      "model_uuid": "80ab3179-d113-4254-bd6b-4a2f96498695",
      "input_tokens": 105,
      "output_tokens": 105
    },
    "topics_info": {
      "model_uuid": "80ab3179-d113-4254-bd6b-4a2f96498695",
      "input_tokens": 105,
      "output_tokens": 7
    },
    "intents_info": {
      "model_uuid": "80ab3179-d113-4254-bd6b-4a2f96498695",
      "input_tokens": 105,
      "output_tokens": 4
    },
    "tags": [
      "test"
    ],
    "transaction_key": "deprecated"
  },
  "results": {
    "channels": [
      {
        "search": [
          {
            "query": "string",
            "hits": [
              {
                "confidence": 1.1,
                "start": 1.1,
                "end": 1.1,
                "snippet": "string"
              }
            ]
          }
        ],
        "alternatives": [
          {
            "transcript": "string",
            "confidence": 1.1,
            "words": [
              {
                "word": "string",
                "start": 1.1,
                "end": 1.1,
                "confidence": 1.1
              }
            ],
            "paragraphs": {
              "transcript": "string",
              "paragraphs": [
                {
                  "sentences": [
                    {
                      "text": "string",
                      "start": 1.1,
                      "end": 1.1
                    }
                  ],
                  "speaker": 1,
                  "num_words": 1,
                  "start": 1.1,
                  "end": 1.1
                }
              ]
            },
            "summaries": [
              {
                "summary": "string",
                "start_word": 1,
                "end_word": 1
              }
            ],
            "topics": [
              {
                "text": "string",
                "start_word": 1,
                "end_word": 1,
                "topics": [
                  "string"
                ]
              }
            ]
          }
        ],
        "detected_language": "string"
      }
    ],
    "utterances": [
      {
        "start": 1.1,
        "end": 1.1,
        "confidence": 1.1,
        "channel": 1,
        "transcript": "string",
        "words": [
          {
            "word": "string",
            "start": 1.1,
            "end": 1.1,
            "confidence": 1.1,
            "speaker": 1,
            "speaker_confidence": 1,
            "punctuated_word": "string"
          }
        ],
        "speaker": 1,
        "id": "string"
      }
    ],
    "summary": {
      "result": "success",
      "short": "Speaker 0 discusses the significance of the first all-female spacewalk with an all-female team, stating that it is a tribute to the skilled and qualified women who were denied opportunities in the past."
    },
    "topics": {
      "results": {
        "topics": {
          "segments": [
            {
              "text": "And, um, I think if it signifies anything, it is, uh, to honor the the women who came before us who, um, were skilled and qualified, um, and didn't get the the same opportunities that we have today.",
              "start_word": 32,
              "end_word": 69,
              "topics": [
                {
                  "topic": "Spacewalk",
                  "confidence_score": 0.91581345
                }
              ]
            }
          ]
        }
      }
    },
    "intents": {
      "results": {
        "intents": {
          "segments": [
            {
              "text": "If you found this valuable, you can subscribe to the show on spotify or your favorite podcast app.",
              "start_word": 354,
              "end_word": 414,
              "intents": [
                {
                  "intent": "Encourage podcasting",
                  "confidence_score": 0.0038975573
                }
              ]
            }
          ]
        }
      }
    },
    "sentiments": {
      "segments": [
        {
          "text": "Yeah. As as much as, um, it's worth celebrating, uh, the first, uh, spacewalk, um, with an all-female team, I think many of us are looking forward to it just being normal. And, um, I think if it signifies anything, it is, uh, to honor the the women who came before us who, um, were skilled and qualified, um, and didn't get the the same opportunities that we have today.",
          "start_word": 0,
          "end_word": 69,
          "sentiment": "positive",
          "sentiment_score": 0.5810546875
        }
      ],
      "average": {
        "sentiment": "positive",
        "sentiment_score": 0.5810185185185185
      }
    }
  }
}
```
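Once the JSON is parsed, the fields of interest sit a few levels deep. The sketch below pulls out the first transcript, the short summary, the average sentiment and the detected topic names, assuming the response has the shape of the sample above. Optional sections such as summary, sentiments and topics are read defensively on the assumption that they appear only when the corresponding feature is requested. The function name extract_highlights and the trimmed sample are hypothetical.

```python
def extract_highlights(response):
    """Pull a few commonly used fields out of a parsed transcription response."""
    results = response["results"]
    first_alt = results["channels"][0]["alternatives"][0]
    topic_segments = (
        results.get("topics", {}).get("results", {}).get("topics", {}).get("segments", [])
    )
    return {
        "transcript": first_alt["transcript"],
        "summary": results.get("summary", {}).get("short"),
        "sentiment": results.get("sentiments", {}).get("average", {}).get("sentiment"),
        "topics": [t["topic"] for seg in topic_segments for t in seg["topics"]],
    }


# Trimmed, hypothetical response following the structure of the sample above.
sample = {
    "results": {
        "channels": [{"alternatives": [{"transcript": "hello world", "confidence": 0.99}]}],
        "summary": {"result": "success", "short": "A greeting."},
        "topics": {
            "results": {
                "topics": {
                    "segments": [
                        {"topics": [{"topic": "Spacewalk", "confidence_score": 0.91581345}]}
                    ]
                }
            }
        },
        "sentiments": {"average": {"sentiment": "positive", "sentiment_score": 0.58}},
    }
}
highlights = extract_highlights(sample)
print(highlights)
```

With the request example above, the same function could be applied to response.json().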
