Transcription

High-speed transcription of either pre-recorded or streaming audio. This feature is fast, understands nearly every audio format available, and is customizable: you can tailor your transcript using various query parameters and apply general-purpose or custom-trained AI models.

Deepgram supports over 100 different audio formats and encodings. For example, some of the most common audio formats and encodings we support include MP3, MP4, MP2, AAC, WAV, FLAC, PCM, M4A, Ogg, Opus, and WebM. However, because audio format is largely unconstrained, we recommend testing a small set of audio when you first work with a new audio source to ensure compatibility.

Transcribe Pre-recorded Audio

Transcribes the specified audio file.

Deepgram does not store transcriptions. Make sure to save output or return transcriptions to a callback URL for custom processing.
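
Before the full parameter reference, here is a minimal sketch of a pre-recorded request in Python. It assumes the pre-recorded endpoint is the HTTPS counterpart of the streaming URL given later (https://api.deepgram.com/v1/listen), that API keys are passed in a Token authorization header, and that the requests library is available; the key and audio URL are placeholders.

    import requests

    DEEPGRAM_API_KEY = "YOUR_API_KEY"  # placeholder; use your own key

    # Submit a hosted audio file with a couple of common query parameters.
    response = requests.post(
        "https://api.deepgram.com/v1/listen",
        params={"punctuate": "true", "smart_format": "true"},
        headers={
            "Authorization": f"Token {DEEPGRAM_API_KEY}",
            "Content-Type": "application/json",
        },
        json={"url": "https://example.com/audio.wav"},  # placeholder URL
    )
    response.raise_for_status()

    # The transcript lives in the first alternative of the first channel
    # (see the Response Schema below).
    data = response.json()
    print(data["results"]["channels"][0]["alternatives"][0]["transcript"])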

Query Parameters

tier: string

Level of model you would like to use in your request. Options include:

  • enhanced:

    Applies our newest, most powerful ASR models; they generally have higher accuracy and better word recognition than our Base models, and they handle uncommon words significantly better.

  • base: (Default)

    Applies our Base models, which are built on our signature end-to-end deep learning speech model architecture and offer a solid combination of accuracy and cost effectiveness.

To learn more, see Features: Tier.

model: string

AI model used to process submitted audio. Options include:

  • general: (Default)

    Optimized for everyday audio processing.

    TIERS: enhanced, base

  • meeting:

    Optimized for conference room settings, which include multiple speakers with a single microphone.

    TIERS: enhanced beta, base

  • phonecall:

    Optimized for low-bandwidth audio phone calls.

    TIERS: enhanced beta, base

  • voicemail:

    Optimized for low-bandwidth audio clips with a single speaker. Derived from the phonecall model.

    TIERS: base

  • finance:

    Optimized for multiple speakers with varying audio quality, such as might be found on a typical earnings call. Vocabulary is heavily finance oriented.

    TIERS: enhanced beta, base

  • conversationalai:

    Optimized to allow artificial intelligence technologies, such as chatbots, to interact with people in a human-like way.

    TIERS: base

  • video:

    Optimized for audio sourced from videos.

    TIERS: base

  • <custom_id>:

    To use a custom model associated with your account, include its custom_id.

    TIERS: enhanced, base (depending on which tier the custom model was trained on)

To learn more, see Features: Model.

version: string

Version of the model to use.

Default: latest

Possible values:

latest OR <version_id>

To learn more, see Features: Version.

language: string

BCP-47 language tag that hints at the primary spoken language. Language support is optimized for the following language/model combinations:

Chinese

  • zh-CN: China (Simplified Mandarin) beta

    MODELS: general

  • zh-TW: Taiwan (Traditional Mandarin) beta

    MODELS: general

Danish

  • da: beta

    MODELS: general (enhanced, base)

Dutch

  • nl: beta

    MODELS: general (enhanced, base)

English

  • en: English (Default)

    MODELS: general (enhanced, base), meeting (enhanced beta, base), phonecall (enhanced beta, base), voicemail, finance (enhanced beta, base), conversationalai, video

  • en-AU: Australia

    MODELS: general

  • en-IN: India

    MODELS: general

  • en-NZ: New Zealand

    MODELS: general

  • en-GB: United Kingdom

    MODELS: general

  • en-US: United States

    MODELS: general (enhanced, base), meeting (enhanced beta, base), phonecall (enhanced beta, base), voicemail, finance, conversationalai, video

Flemish

  • nl: beta

    MODELS: general (enhanced, base)

French

  • fr:

    MODELS: general

  • fr-CA: Canada

    MODELS: general

German

  • de:

    MODELS: general

Hindi

  • hi:

    MODELS: general

  • hi-Latn: Roman Script beta

    MODELS: general

Indonesian

  • id: beta

    MODELS: general

Italian

  • it: beta

    MODELS: general

Japanese

  • ja: beta

    MODELS: general

Korean

  • ko: beta

    MODELS: general

Polish

  • pl: beta

    MODELS: general

Portuguese

  • pt:

    MODELS: general

  • pt-BR: Brazil

    MODELS: general

  • pt-PT: Portugal

    MODELS: general

Russian

  • ru:

    MODELS: general

Spanish

  • es:

    MODELS: general (enhanced beta, base)

  • es-419: Latin America

    MODELS: general

Swedish

  • sv: beta

    MODELS: general

Turkish

  • tr:

    MODELS: general

Ukrainian

  • uk: beta

    MODELS: general

To learn more, see Features: Language.

detect_language: boolean

Indicates whether to detect the language of the provided audio. To learn more, see Features: Language Detection.

punctuate: boolean

Indicates whether to add punctuation and capitalization to the transcript. To learn more, see Features: Punctuation.

profanity_filter: boolean

Indicates whether to remove profanity from the transcript. To learn more, see Features: Profanity Filter.

redact: any

Indicates whether to redact sensitive information, replacing redacted content with asterisks (*). Options include:

  • pci:

    Redacts sensitive credit card information, including credit card number, expiration date, and CVV.

  • numbers: (or true) Aggressively redacts strings of numerals.

  • ssn: beta

    Redacts social security numbers.

Can send multiple instances in query string (for example, redact=pci&redact=numbers). When sending multiple values, redaction occurs in the order you specify. For instance, in this example, sensitive credit card information would be redacted first, then strings of numbers.

To learn more, see Features: Redaction.

diarize: boolean

Indicates whether to recognize speaker changes. When set to true, each word in the transcript will be assigned a speaker number starting at 0.

To use the legacy diarization feature, add a diarize_version parameter set to 2021-07-14.0. For example, diarize_version=2021-07-14.0.

To learn more, see Features: Diarization.

diarize_version: string

Indicates the version of the diarization feature to use. To use the legacy diarization feature, set the parameter value to 2021-07-14.0.

Only used when the diarization feature is enabled (diarize=true is passed to the API).

To learn more, see Features: Diarization.

smart_format: boolean

Indicates whether to apply formatting to transcript output. When set to true, additional formatting will be applied to transcripts to improve readability.

Default: false

To learn more, see Features: Smart Format.

multichannel: boolean

Indicates whether to transcribe each audio channel independently. When set to true, you will receive one transcript for each channel, which means you can apply a different model to each channel using the model parameter (e.g., set model to general:phonecall, which applies the general model to channel 0 and the phonecall model to channel 1).

To learn more, see Features: Multichannel.
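
As a brief sketch of the per-channel model syntax described above (same hypothetical endpoint and key as the earlier example; the stereo recording URL is a placeholder):

    import requests

    # Hypothetical two-channel call: channel 0 transcribed with the general
    # model, channel 1 with the phonecall model.
    response = requests.post(
        "https://api.deepgram.com/v1/listen",
        params={"multichannel": "true", "model": "general:phonecall"},
        headers={"Authorization": "Token YOUR_API_KEY",
                 "Content-Type": "application/json"},
        json={"url": "https://example.com/stereo-call.wav"},
    )

    # With multichannel enabled, results.channels holds one entry per channel.
    for i, channel in enumerate(response.json()["results"]["channels"]):
        print(i, channel["alternatives"][0]["transcript"])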

alternatives: integer

Maximum number of transcript alternatives to return. Just like a human listener, Deepgram can provide multiple possible interpretations of what it hears.

Default: 1

numerals: boolean

Indicates whether to convert numbers from written format (e.g., one) to numerical format (e.g., 1).

Deepgram can format numbers up to 999,999.

Converted numbers do not include punctuation. For example, 999,999 would be transcribed as 999999.

To learn more, see Features: Numerals.

search: any

Terms or phrases to search for in the submitted audio. Deepgram searches for acoustic patterns in audio rather than text patterns in transcripts because we have found acoustic pattern matching to perform better.

  • Can include up to 25 search terms per request.
  • Can send multiple instances in query string (for example, search=speech&search=Friday).

To learn more, see Features: Search.

replace: string

Terms or phrases to search for in the submitted audio and replace.

  • URL-encode any terms or phrases that include spaces, punctuation, or other special characters.
  • Can send multiple instances in query string (for example, replace=this:that&replace=thisalso:thatalso).

  • Replacing a term or phrase with nothing (replace=this) will remove the term or phrase from the audio transcript.

To learn more, see Features: Replace.

callback: string

Callback URL to provide if you would like your submitted audio to be processed asynchronously. When passed, Deepgram will immediately respond with a request_id. When it has finished analyzing the audio, it will send a POST request to the provided URL with an appropriate HTTP status code.

Notes:

  • You may embed basic authentication credentials in the callback URL.
  • Only ports 80, 443, 8080, and 8443 can be used for callbacks.

To learn more, see Features: Callback.
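
A minimal sketch of an asynchronous request (the callback URL, API key, and audio URL are placeholders):

    import requests

    # Deepgram answers immediately with a request_id and later POSTs the full
    # transcription result to the callback URL you control.
    response = requests.post(
        "https://api.deepgram.com/v1/listen",
        params={"callback": "https://example.com/deepgram-webhook"},  # placeholder
        headers={"Authorization": "Token YOUR_API_KEY",
                 "Content-Type": "application/json"},
        json={"url": "https://example.com/audio.wav"},
    )
    print(response.json())  # contains the request_id for this submission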

keywords: any

Keywords to which the model should pay particular attention, boosting or suppressing them to help it understand context. Just like a human listener, Deepgram can better understand mumbled, distorted, or otherwise hard-to-decipher speech when it knows the context of the conversation.

Notes:

  • Can include up to 200 keywords per request.
  • Can send multiple instances in query string (for example, keywords=medicine&keywords=prescription).

  • Can request multi-word keywords in a percent-encoded query string (for example, keywords=miracle%20medicine). When Deepgram listens for your supplied keywords, it separates them into individual words, then boosts or suppresses them individually.

  • Can append a positive or negative intensifier to either boost or suppress the recognition of particular words. Positive and negative values can be decimals.
  • Follow best practices for keyword boosting.

  • Support for out-of-vocabulary (OOV) keyword boosting when processing streaming audio is currently in beta; to fall back to previous keyword behavior, append the query parameter keyword_boost=legacy to your API request.

To learn more, see Features: Keywords.
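
For illustration, a sketch that assumes intensifiers are appended to each keyword after a colon (see Features: Keywords for the exact syntax); the terms and values below are arbitrary:

    import requests

    # Boost "Deepgram" strongly and slightly suppress "diagram". A list of
    # tuples lets requests repeat the parameter: keywords=...&keywords=...
    params = [("keywords", "Deepgram:2.0"), ("keywords", "diagram:-0.5")]
    response = requests.post(
        "https://api.deepgram.com/v1/listen",
        params=params,
        headers={"Authorization": "Token YOUR_API_KEY",
                 "Content-Type": "application/json"},
        json={"url": "https://example.com/audio.wav"},
    )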

paragraphs: boolean

Indicates whether Deepgram will split audio into paragraphs to improve transcript readability. When paragraphs is set to true, you must also set either punctuate, diarize, or multichannel to true.

To learn more, see Features: Paragraphs.

summarize: boolean

Indicates whether Deepgram will provide summaries for sections of content. When summarize is set to true, punctuate will be set to true by default.

To learn more, see Features: Summarize.

detect_topics: boolean

Indicates whether Deepgram will identify and extract key topics for sections of content. When detect_topics is set to true, punctuate will be set to true by default.

To learn more, see Features: Topic Detection.

utterances: boolean

Indicates whether Deepgram will segment speech into meaningful semantic units, which allows the model to interact more naturally and effectively with speakers’ spontaneous speech patterns. For example, when humans speak to each other conversationally, they often pause mid-sentence to reformulate their thoughts, or stop and restart a badly-worded sentence. When utterances is set to true, these utterances are identified and returned in the transcript results.

By default, when utterances is enabled, it starts a new utterance after 0.8 s of silence. You can customize the length of time used to determine where to split utterances by submitting the utt_split parameter.

To learn more, see Features: Utterances.

utt_split: number

Length of time in seconds of silence between words that Deepgram will use when determining where to split utterances. Used when utterances is enabled.

Default: 0.8

To learn more, see Features: Utterance Split.

tag: string

Tag to associate with the request. Your request will automatically be associated with any tags you add to the API Key used to run the request. Tags associated with requests appear in usage reports.

To learn more, see Features: Tag.

Request Body Schema

Request body when submitting pre-recorded audio. Accepts either:

  • raw binary audio data. In this case, include a Content-Type header set to the audio MIME type.

  • JSON object with a single field from which the audio can be retrieved. In this case, include a Content-Type header set to application/json.

url: string

URL of audio file to transcribe.
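
Two hedged sketches of the request body, one per accepted form (the file path, key, and URL are placeholders):

    import requests

    HEADERS = {"Authorization": "Token YOUR_API_KEY"}  # placeholder key

    # Form 1: raw binary audio, with Content-Type set to the audio MIME type.
    with open("example.wav", "rb") as audio:
        requests.post(
            "https://api.deepgram.com/v1/listen",
            headers={**HEADERS, "Content-Type": "audio/wav"},
            data=audio,
        )

    # Form 2: JSON object pointing at hosted audio.
    requests.post(
        "https://api.deepgram.com/v1/listen",
        headers={**HEADERS, "Content-Type": "application/json"},
        json={"url": "https://example.com/audio.wav"},
    )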

Responses

  • 200 Success: Audio submitted for transcription.

Response Schema

  • metadata: object

    JSON-formatted ListenMetadata object.

    • request_id: uuid

      Unique identifier of the submitted audio and derived data returned.

    • transaction_key: string

      Blob of text that helps Deepgram engineers debug any problems you encounter. If you need help getting an API call to work correctly, send this key to us so that we can use it as a starting point when investigating any issues.

    • sha256: string

      SHA-256 hash of the submitted audio data.

    • created: string

      ISO-8601 timestamp that indicates when the audio was submitted.

    • duration: number

      Duration in seconds of the submitted audio.

    • channels: integer

      Number of channels detected in the submitted audio.

  • results: object

    JSON-formatted ListenResults object.

    • channels: array

      Array of JSON-formatted ChannelResult objects.

        • search: array

          Array of JSON-formatted SearchResults.

          • query: string

            Term for which Deepgram is searching.

          • hits: array

            Array of JSON-formatted Hit objects.

            • confidence: number

              Value between 0 and 1 that indicates the model’s relative confidence in this hit.

            • start: number

              Offset in seconds from the start of the audio to where the hit starts.

            • end: number

              Offset in seconds from the start of the audio to where the hit ends.

            • snippet: string

              Transcript that corresponds to the time between start and end.

        • alternatives: array

          Array of JSON-formatted ResultAlternative objects. This array will have length n, where n matches the value of the alternatives parameter passed in the request.

          • transcript: string

            Single-string transcript containing what the model hears in this channel of audio.

          • confidence: number

            Value between 0 and 1 indicating the model’s relative confidence in this transcript.

          • words: array

            Array of JSON-formatted Word objects.

            • word: string

              Distinct word heard by the model.

            • start: number

              Offset in seconds from the start of the audio to where the spoken word starts.

            • end: number

              Offset in seconds from the start of the audio to where the spoken word ends.

            • confidence: number

              Value between 0 and 1 indicating the model’s relative confidence in this word.
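
A short sketch of walking this structure in Python; data is assumed to be the parsed JSON body of a successful response (for example, response.json() from the request sketch near the top of this page):

    def summarize(data):
        channel = data["results"]["channels"][0]
        best = channel["alternatives"][0]

        # Word-level timing from the best alternative.
        for word in best["words"]:
            print(f'{word["word"]} {word["start"]:.2f}s to {word["end"]:.2f}s '
                  f'(confidence {word["confidence"]:.2f})')

        # Search hits appear only when the request included search terms.
        for search in channel.get("search", []):
            for hit in search["hits"]:
                print(search["query"], hit["start"], hit["snippet"])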


Transcribe Live Streaming Audio

Deepgram provides its customers with real-time, streaming transcription via its streaming endpoints. These endpoints are high-performance, full-duplex services running over the tried-and-true WebSocket protocol, which makes integration with customer pipelines simple due to the wide array of client libraries available.

To use this endpoint, connect to wss://api.deepgram.com/v1/listen. TLS encryption will protect your connection and data. We support a minimum of TLS 1.2.

All audio data is sent to the streaming endpoint as binary-type WebSocket messages containing payloads that are the raw audio data. Because the protocol is full-duplex, you can stream in real-time and still receive transcription responses while uploading data.

When you are finished, send a JSON message to the server: { "type": "CloseStream" }. The server will interpret it as a shutdown command, which means it will finish processing whatever data it still has cached, send the response to the client, send a summary metadata object, and then terminate the WebSocket connection.

To learn more about working with real-time streaming data and results, see Get Started with Streaming Audio.
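
A minimal end-to-end sketch using Python's asyncio and the third-party websockets package (the package choice, API key, file path, chunk size, and pacing are illustrative assumptions, not requirements):

    import asyncio
    import json

    import websockets

    DEEPGRAM_API_KEY = "YOUR_API_KEY"  # placeholder

    async def transcribe_stream(path):
        # Query parameters are appended to the WebSocket URL just as they are
        # for pre-recorded requests.
        url = "wss://api.deepgram.com/v1/listen?punctuate=true"
        headers = {"Authorization": f"Token {DEEPGRAM_API_KEY}"}

        async with websockets.connect(url, extra_headers=headers) as ws:

            async def send_audio():
                # Stream the file in small binary chunks, then request shutdown.
                with open(path, "rb") as audio:
                    while chunk := audio.read(8192):
                        await ws.send(chunk)
                        await asyncio.sleep(0.05)  # rough pacing for the demo
                await ws.send(json.dumps({"type": "CloseStream"}))

            async def receive_results():
                # Transcripts arrive as JSON text frames while audio is still
                # being uploaded; the final metadata summary has no "channel".
                async for message in ws:
                    result = json.loads(message)
                    alternatives = result.get("channel", {}).get("alternatives", [])
                    if alternatives:
                        print(alternatives[0]["transcript"])

            await asyncio.gather(send_audio(), receive_results())

    asyncio.run(transcribe_stream("example.wav"))  # placeholder file path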

Deepgram does not store transcriptions. Make sure to save output or return transcriptions to a callback URL for custom processing.

Query Parameters

tier: string

Level of model you would like to use in your request. Options include:

  • enhanced:

    Applies our newest, most powerful ASR models; they generally have higher accuracy and better word recognition than our Base models, and they handle uncommon words significantly better.

  • base: (Default)

    Applies our Base models, which are built on our signature end-to-end deep learning speech model architecture and offer a solid combination of accuracy and cost effectiveness.

To learn more, see Features: Tier.

model: string

AI model used to process submitted audio. Options include:

  • general: (Default)

    Optimized for everyday audio processing.

    TIERS: enhanced, base

  • meeting:

    Optimized for conference room settings, which include multiple speakers with a single microphone.

    TIERS: enhanced beta, base

  • phonecall:

    Optimized for low-bandwidth audio phone calls.

    TIERS: enhanced beta, base

  • voicemail:

    Optimized for low-bandwidth audio clips with a single speaker. Derived from the phonecall model.

    TIERS: base

  • finance:

    Optimized for multiple speakers with varying audio quality, such as might be found on a typical earnings call. Vocabulary is heavily finance oriented.

    TIERS: enhanced beta, base

  • conversationalai:

    Optimized to allow artificial intelligence technologies, such as chatbots, to interact with people in a human-like way.

    TIERS: base

  • video:

    Optimized for audio sourced from videos.

    TIERS: base

  • <custom_id>:

    To use a custom model associated with your account, include its custom_id.

    TIERS: enhanced, base (depending on which tier the custom model was trained on)

To learn more, see Features: Model.

version: string

Version of the model to use.

Default: latest

Possible values:

latest OR <version_id>

To learn more, see Features: Version.

language: string

BCP-47 language tag that hints at the primary spoken language. Language support is optimized for the following language/model combinations:

Chinese

  • zh-CN: China (Simplified Mandarin) beta

    MODELS: general

  • zh-TW: Taiwan (Traditional Mandarin) beta

    MODELS: general

Danish

  • da: beta

    MODELS: general (enhanced, base)

Dutch

  • nl: beta

    MODELS: general (enhanced, base)

English

  • en: English (Default)

    MODELS: general (enhanced, base), meeting (enhanced beta, base), phonecall (enhanced beta, base), voicemail, finance (enhanced beta, base), conversationalai, video

  • en-AU: Australia

    MODELS: general

  • en-IN: India

    MODELS: general

  • en-NZ: New Zealand

    MODELS: general

  • en-GB: United Kingdom

    MODELS: general

  • en-US: United States

    MODELS: general (enhanced, base), meeting (enhanced beta, base), phonecall (enhanced beta, base), voicemail, finance, conversationalai, video

Flemish

  • nl: beta

    MODELS: general (enhanced, base)

French

  • fr:

    MODELS: general

  • fr-CA: Canada

    MODELS: general

German

  • de:

    MODELS: general

Hindi

  • hi:

    MODELS: general

  • hi-Latn: Roman Script beta

    MODELS: general

Indonesian

  • id: beta

    MODELS: general

Italian

  • it: beta

    MODELS: general

Japanese

  • ja: beta

    MODELS: general

Korean

  • ko: beta

    MODELS: general

Norwegian

  • no: beta

    MODELS: general (enhanced, base)

Polish

  • pl: beta

    MODELS: general

Portuguese

  • pt:

    MODELS: general

  • pt-BR: Brazil

    MODELS: general

  • pt-PT: Portugal

    MODELS: general

Russian

  • ru:

    MODELS: general

Spanish

  • es:

    MODELS: general (enhanced beta, base)

  • es-419: Latin America

    MODELS: general

Swedish

  • sv: beta

    MODELS: general

Tamil

  • ta: beta

    MODELS: general (enhanced)

Turkish

  • tr:

    MODELS: general

Ukrainian

  • uk: beta

    MODELS: general

To learn more, see Features: Language.

punctuate: boolean

Indicates whether to add punctuation and capitalization to the transcript. To learn more, see Features: Punctuation.

profanity_filter: boolean

Indicates whether to remove profanity from the transcript. To learn more, see Features: Profanity Filter.

redact: any

Indicates whether to redact sensitive information, replacing redacted content with asterisks (*). Options include:

  • pci:

    Redacts sensitive credit card information, including credit card number, expiration date, and CVV.

  • numbers: (or true) Aggressively redacts strings of numerals.

  • ssn: beta

    Redacts social security numbers.

Can send multiple instances in query string (for example, redact=pci&redact=numbers). When sending multiple values, redaction occurs in the order you specify. For instance, in this example, sensitive credit card information would be redacted first, then strings of numbers.

To learn more, see Features: Redaction.

diarize: boolean

Indicates whether to recognize speaker changes. When set to true, each word in the transcript will be assigned a speaker number starting at 0.

To use the legacy diarization feature, add a diarize_version parameter set to 2021-07-14.0. For example, diarize_version=2021-07-14.0.

To learn more, see Features: Diarization.

diarize_version: string

Indicates the version of the diarization feature to use. To use the legacy diarization feature, set the parameter value to 2021-07-14.0.

Only used when the diarization feature is enabled (diarize=true is passed to the API).

To learn more, see Features: Diarization.

smart_format: boolean

Indicates whether to apply formatting to transcript output. When set to true, additional formatting will be applied to transcripts to improve readability.

Default: false

To learn more, see Features: Smart Format.

multichannel: boolean

Indicates whether to transcribe each audio channel independently. When set to true, you will receive one transcript for each channel, which means you can apply a different model to each channel using the model parameter (e.g., set model to general:phonecall, which applies the general model to channel 0 and the phonecall model to channel 1).

To learn more, see Features: Multichannel.

alternatives: integer

Maximum number of transcript alternatives to return. Just like a human listener, Deepgram can provide multiple possible interpretations of what it hears.

Default: 1

numerals: boolean

Indicates whether to convert numbers from written format (e.g., one) to numerical format (e.g., 1).

Deepgram can format numbers up to 999,999.

Converted numbers do not include punctuation. For example, 999,999 would be transcribed as 999999.

To learn more, see Features: Numerals.

search: any

Terms or phrases to search for in the submitted audio. Deepgram searches for acoustic patterns in audio rather than text patterns in transcripts because we have found acoustic pattern matching to perform better.

  • Can include up to 25 search terms per request.
  • Can send multiple instances in query string (for example, search=speech&search=Friday).

To learn more, see Features: Search.

replace: string

Terms or phrases to search for in the submitted audio and replace.

  • URL-encode any terms or phrases that include spaces, punctuation, or other special characters.
  • Can send multiple instances in query string (for example, replace=this:that&replace=thisalso:thatalso).

  • Replacing a term or phrase with nothing (replace=this) will remove the term or phrase from the audio transcript.

To learn more, see Features: Replace.

callback: string

Callback URL to provide if you would like your submitted audio to be processed asynchronously. When passed, Deepgram will immediately respond with a request_id. When it has finished analyzing the audio, it will send a POST request to the provided URL with an appropriate HTTP status code.

Notes:

  • You may embed basic authentication credentials in the callback URL.
  • Only ports 80, 443, 8080, and 8443 can be used for callbacks.

To learn more, see Features: Callback.

keywords: any

Keywords to which the model should pay particular attention, boosting or suppressing them to help it understand context. Just like a human listener, Deepgram can better understand mumbled, distorted, or otherwise hard-to-decipher speech when it knows the context of the conversation.

Notes:

  • Can include up to 200 keywords per request.
  • Can send multiple instances in query string (for example, keywords=medicine&keywords=prescription).

  • Can request multi-word keywords in a percent-encoded query string (for example, keywords=miracle%20medicine). When Deepgram listens for your supplied keywords, it separates them into individual words, then boosts or suppresses them individually.

  • Can append a positive or negative intensifier to either boost or suppress the recognition of particular words. Positive and negative values can be decimals.
  • Follow best practices for keyword boosting.

  • Support for out-of-vocabulary (OOV) keyword boosting when processing streaming audio is currently in beta; to fall back to previous keyword behavior, append the query parameter keyword_boost=legacy to your API request.

To learn more, see Features: Keywords.

interim_results: boolean

Indicates whether the streaming endpoint should send you updates to its transcription as more audio becomes available. When set to true, the streaming endpoint returns regular updates, which means transcription results will likely change for a period of time. By default, this flag is set to false.

  • When the flag is set to false, latency increases (usually by several seconds) because the server needs to stabilize the transcription before returning the final results for each piece of incoming audio. If you want the lowest-latency streaming available, then set interim_results to true and handle the corrected transcripts as they are returned.

To learn more, see Features: Interim Results.
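
One way to consume interim results, sketched as a plain message handler (the printing strategy is illustrative; plug it into whatever receive loop you use):

    import json

    def handle_message(message: str) -> None:
        # With interim_results=true, keep replacing the working transcript until
        # a message arrives with is_final set to true, then commit that text.
        result = json.loads(message)
        alternatives = result.get("channel", {}).get("alternatives", [])
        if not alternatives:
            return  # e.g. the summary metadata object has no channel field
        transcript = alternatives[0]["transcript"]
        if result.get("is_final"):
            print("final:  ", transcript)
        else:
            print("interim:", transcript)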

endpointing: boolean

Indicates whether Deepgram will detect whether a speaker has finished speaking (or paused for a significant period of time, indicating the completion of an idea). When Deepgram detects an endpoint, it assumes that no additional data will improve its prediction, so it immediately finalizes the result for the processed time range and returns the transcript with a speech_final parameter set to true.

For example, if you are working with a 15-second audio clip, but someone is speaking for only the first 3 seconds, endpointing allows you to get a finalized result after the first 3 seconds.

By default, endpointing is enabled and finalizes a transcript after a short period of silence.

Default: true

To learn more, see Features: Endpointing.

encoding: string

Expected encoding of the submitted streaming audio. If this parameter is set, sample_rate must also be specified.

  • Required when raw, headerless audio packets are sent to the streaming service. For containerized audio, pre-recorded audio, or audio submitted to the standard /listen endpoint, Deepgram will automatically detect the audio encoding and this parameter should not be used.

Options include:

  • linear16: 16-bit, little-endian, signed PCM WAV data

  • flac: FLAC-encoded data

  • mulaw: mu-law encoded WAV data

  • amr-nb: adaptive multi-rate narrowband codec

  • amr-wb: adaptive multi-rate wideband codec

  • opus: Ogg Opus

  • speex: Ogg Speex

To learn more, see Features: Encoding.

channels: integer

Number of independent audio channels contained in submitted streaming audio. Only read when a value is provided for encoding.

Default: 1

To learn more, see Features: Channels.

sample_rate: integer

Sample rate of submitted streaming audio. Required (and only read) when a value is provided for encoding.

To learn more, see Features: Sample Rate.
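
Because encoding, sample_rate, and channels travel together for raw audio, here is a quick sketch of building the streaming URL (the 16 kHz, mono, linear16 values are placeholders for whatever your capture pipeline produces):

    from urllib.parse import urlencode

    # Raw, headerless 16-bit PCM: the encoding and sample rate must be stated
    # explicitly because there is no container header to detect them from.
    params = urlencode({
        "encoding": "linear16",
        "sample_rate": 16000,
        "channels": 1,
    })
    url = f"wss://api.deepgram.com/v1/listen?{params}"
    print(url)
    # wss://api.deepgram.com/v1/listen?encoding=linear16&sample_rate=16000&channels=1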

tag: string

Tag to associate with the request. Your request will automatically be associated with any tags you add to the API Key used to run the request. Tags associated with requests appear in usage reports.

To learn more, see Features: Tag.

Responses

  • 200 Success: Audio submitted for transcription.

Response Schema

  • channel_index: array

    Information about the active channel in the form [channel_index, total_number_of_channels].

  • duration: number

    Duration in seconds.

  • start: number

    Offset in seconds.

  • is_final: boolean

    Indicates that Deepgram has identified a point at which its transcript has reached maximum accuracy and is sending a definitive transcript of all audio up to that point. To learn more, see Features: Interim Results.

  • speech_final: boolean

    Indicates that Deepgram has detected an endpoint and immediately finalized its results for the processed time range. To learn more, see Features: Endpointing.

  • channel: object

    • alternatives: array

      Array of JSON-formatted ResultAlternative objects. This array will have length n, where n matches the value of the alternatives parameter passed in the request.

    • transcript: string

      Single-string transcript containing what the model hears in this channel of audio.

    • confidence: number

      Value between 0 and 1 indicating the model’s relative confidence in this transcript.

    • words: array

      Array of JSON-formatted Word objects.

      • word: string

        Distinct word heard by the model.

      • start: number

        Offset in seconds from the start of the audio to where the spoken word starts.

      • end: number

        Offset in seconds from the start of the audio to where the spoken word ends.

      • confidence: number

        Value between 0 and 1 indicating the model’s relative confidence in this word.

  • metadata: object

    • request_id: uuid

      Unique identifier of the submitted audio and derived data returned.

Close Stream

To gracefully close a streaming connection, send the following JSON string:

{ "type": "CloseStream" }

This tells Deepgram that no more audio will be sent. Deepgram will close the connection once all audio has finished processing.

Error Handling

If Deepgram encounters an error during real-time streaming, we will return a WebSocket Close frame (WebSocket Protocol specification, section 5.5.1).

The body of the Close frame will indicate the reason for closing using one of the specification’s pre-defined status codes followed by a UTF-8-encoded payload that represents the reason for the error. Current codes and payloads in use include:

  • 1008 DATA-0000: The payload cannot be decoded as audio. Either the encoding is incorrectly specified, the payload is not audio data, or the audio is in a format unsupported by Deepgram.

  • 1011 NET-0000: The service has not transmitted a Text frame to the client within the timeout window. This may indicate an issue internally in Deepgram’s systems or could be due to Deepgram not receiving enough audio data to transcribe a frame.

  • 1011 NET-0001: The service has not received a Binary frame from the client within the timeout window. This may indicate an internal issue in Deepgram’s systems, the client’s systems, or the network connecting them.

To learn about debugging WebSocket errors, see Troubleshooting WebSocket DATA and NET Errors When Live Streaming Audio.

After sending a Close message, the endpoint considers the WebSocket connection closed and will close the underlying TCP connection.
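
As a sketch of client-side handling with the same third-party websockets package used in the streaming example above (the rcvd attribute assumes a reasonably recent version of that package; older releases expose the code and reason directly on the exception):

    import websockets

    async def receive_loop(ws):
        try:
            async for message in ws:
                print(message)
        except websockets.exceptions.ConnectionClosed as err:
            close = err.rcvd  # Close frame sent by Deepgram, if one was received
            if close is not None:
                # e.g. code 1008 with reason "DATA-0000" means the payload
                # could not be decoded as audio.
                print(f"closed by server: code={close.code} reason={close.reason}")
            else:
                print("connection dropped without a close frame")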

