API Reference
Deepgram's API gives you streamlined access to Deepgram products, including automatic transcription from Deepgram's off-the-shelf and trained speech recognition models, and to all of your Deepgram resources, such as your account's projects, API keys, billing settings and balances, and usage statistics.
Our SDKs make integrating and interacting with our API quick and easy.
Endpoint
https://api.deepgram.com/v1
Authentication
Send requests to the API with an Authorization header that references your project's API Key:
Authorization: Token <YOUR_DEEPGRAM_API_KEY>
You can create a Deepgram API Key in the Deepgram Console. You must create your first API Key using the Console.
All API requests must be made over HTTPS. Calls made over plain HTTP will fail. API requests made without authentication will also fail.
Security Scheme Type | API Key |
---|---|
Header parameter name | Authorization |
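For example, a minimal authenticated request using Python's requests library might look like the sketch below. The API key value is a placeholder, and the /v1/projects route is an assumption inferred from the Projects resources documented later in this reference.

```python
import requests

DEEPGRAM_API_KEY = "YOUR_DEEPGRAM_API_KEY"  # created in the Deepgram Console

# Every request carries the same Authorization header.
headers = {"Authorization": f"Token {DEEPGRAM_API_KEY}"}

# List the projects visible to this API Key (assumed route; see Projects below).
response = requests.get("https://api.deepgram.com/v1/projects", headers=headers)
response.raise_for_status()
print(response.json())
```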
Transcription
High-speed transcription of either pre-recorded or streaming audio. This feature is very fast, can understand nearly every audio format available, and is customizable. You can customize your transcript using various query parameters and apply general purpose and custom-trained AI models.
Deepgram supports over 100 different audio formats and encodings. For example, some of the most common audio formats and encodings we support include MP3, MP4, MP2, AAC, WAV, FLAC, PCM, M4A, Ogg, Opus, and WebM. However, because audio format is largely unconstrained, we recommend ensuring compatibility by testing small sets of audio when first working with new audio sources.
Transcribe Pre-recorded Audio
Transcribes the specified audio file.
Query Parameters
tier: string
Level of model you would like to use in your request. Options include:
- enhanced: Applies our newest, most powerful ASR models; they generally have higher accuracy and better word recognition than our Base models, and they handle uncommon words significantly better.
- base: (Default) Applies our Base models, which are built on our signature end-to-end deep learning speech model architecture and offer a solid combination of accuracy and cost effectiveness.
To learn more, see Features: Tier.
model: string
AI model used to process submitted audio. Options include:
- general: (Default) Optimized for everyday audio processing. TIERS: enhanced, base
- meeting: Optimized for conference room settings, which include multiple speakers with a single microphone. TIERS: enhanced beta, base
- phonecall: Optimized for low-bandwidth audio phone calls. TIERS: enhanced beta, base
- voicemail: Optimized for low-bandwidth audio clips with a single speaker. Derived from the phonecall model. TIERS: base
- finance: Optimized for multiple speakers with varying audio quality, such as might be found on a typical earnings call. Vocabulary is heavily finance oriented. TIERS: base
- conversationalai: Optimized to allow artificial intelligence technologies, such as chatbots, to interact with people in a human-like way. TIERS: base
- video: Optimized for audio sourced from videos. TIERS: base
- <custom_id>: To use a custom model associated with your account, include its custom_id. TIERS: enhanced, base (depending on which tier the custom model was trained on)
To learn more, see Features: Model.
version: string
Version of the model to use.
Default: latest
Possible values: latest OR <version_id>
To learn more, see Features: Version.
language: string
BCP-47 language tag that hints at the primary spoken language. Language support is optimized for the following language/model combinations:
Chinese
- zh-CN: China (Simplified Mandarin) beta. MODELS: general
- zh-TW: Taiwan (Traditional Mandarin) beta. MODELS: general
Dutch
- nl: beta. MODELS: general
English
- en: English (Default). MODELS: general (enhanced, base), meeting (enhanced beta, base), phonecall (enhanced beta, base), voicemail, finance, conversationalai, video
- en-AU: Australia. MODELS: general
- en-IN: India. MODELS: general
- en-NZ: New Zealand. MODELS: general
- en-GB: United Kingdom. MODELS: general
- en-US: United States. MODELS: general (enhanced, base), meeting (enhanced beta, base), phonecall (enhanced beta, base), voicemail, finance, conversationalai, video
Flemish
- nl: beta. MODELS: general
French
- fr: MODELS: general
- fr-CA: Canada. MODELS: general
German
- de: MODELS: general
Hindi
- hi: MODELS: general
- hi-Latn: Roman Script beta. MODELS: general
Indonesian
- id: beta. MODELS: general
Italian
- it: beta. MODELS: general
Japanese
- ja: beta. MODELS: general
Korean
- ko: beta. MODELS: general
Portuguese
- pt: MODELS: general
- pt-BR: Brazil. MODELS: general
- pt-PT: Portugal. MODELS: general
Russian
- ru: MODELS: general
Spanish
- es: MODELS: general (enhanced beta, base)
- es-419: Latin America. MODELS: general
Swedish
- sv: beta. MODELS: general
Turkish
- tr: MODELS: general
Ukrainian
- uk: beta. MODELS: general
To learn more, see Features: Language.
punctuate: boolean
Indicates whether to add punctuation and capitalization to the transcript. To learn more, see Features: Punctuation.
profanity_filter: boolean
Indicates whether to remove profanity from the transcript. To learn more, see Features: Profanity Filter.
redact: string
Indicates whether to redact sensitive information, replacing redacted content with asterisks (*). Options include:
- pci: Redacts sensitive credit card information, including credit card number, expiration date, and CVV.
- numbers (or true): Aggressively redacts strings of numerals.
- ssn: beta. Redacts social security numbers.
Can send multiple instances in query string (for example, redact=pci&redact=numbers). When sending multiple values, redaction occurs in the order you specify. In this example, sensitive credit card information would be redacted first, then strings of numbers.
To learn more, see Features: Redaction.
diarize: boolean
Indicates whether to recognize speaker changes. When set to true, each word in the transcript will be assigned a speaker number starting at 0. To learn more, see Features: Diarization.
ner: boolean
Indicates whether to recognize alphanumeric strings. When set to true, whitespace will be removed between characters identified as part of an alphanumeric string. To learn more, see Features: Named-Entity Recognition (NER).
multichannel: boolean
Indicates whether to transcribe each audio channel independently. When set to true, you will receive one transcript for each channel, which means you can apply a different model to each channel using the model parameter (e.g., set model to general:phonecall, which applies the general model to channel 0 and the phonecall model to channel 1), as shown in the sketch below.
To learn more, see Features: Multichannel.
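For instance, a minimal multichannel request might look like the following sketch; the stereo file URL is a placeholder.

```python
import requests

# Transcribe a two-channel call, applying the general model to channel 0 and
# the phonecall model to channel 1, as described above.
resp = requests.post(
    "https://api.deepgram.com/v1/listen",
    params={"multichannel": "true", "model": "general:phonecall"},
    headers={
        "Authorization": "Token YOUR_DEEPGRAM_API_KEY",
        "Content-Type": "application/json",
    },
    json={"url": "https://example.com/stereo-call.wav"},  # placeholder URL
)
# One transcript is returned per channel in results.channels.
for channel in resp.json()["results"]["channels"]:
    print(channel["alternatives"][0]["transcript"])
```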
alternatives: integer
Maximum number of transcript alternatives to return. Just like a human listener, Deepgram can provide multiple possible interpretations of what it hears.
Default: 1
numerals: boolean
Indicates whether to convert numbers from written format (e.g., one) to numerical format (e.g., 1). Deepgram can format numbers up to 999,999.
Converted numbers do not include punctuation. For example, 999,999 would be transcribed as 999999.
To learn more, see Features: Numerals.
search: string
Terms or phrases to search for in the submitted audio. Deepgram searches for acoustic patterns in audio rather than text patterns in transcripts because we have found acoustic pattern matching to be more performant.
- Can include up to 25 search terms per request.
- Can send multiple instances in query string (for example, search=speech&search=Friday).
To learn more, see Features: Search.
replace: string
Terms or phrases to search for in the submitted audio and replace.
- URL-encode any terms or phrases that include spaces, punctuation, or other special characters.
- Can send multiple instances in query string (for example, replace=this:that&replace=thisalso:thatalso).
- Replacing a term or phrase with nothing (replace=this) will remove the term or phrase from the audio transcript.
To learn more, see Features: Replace.
callback: string
Callback URL to provide if you would like your submitted audio to be processed asynchronously. When passed, Deepgram will immediately respond with a request_id. When it has finished analyzing the audio, it will send a POST request to the provided URL with an appropriate HTTP status code.
Notes:
- You may embed basic authentication credentials in the callback URL.
- Only ports 80, 443, 8080, and 8443 can be used for callbacks.
To learn more, see Features: Callback.
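A minimal sketch of a callback receiver follows, listening on port 8080 (one of the allowed callback ports). It assumes the POSTed payload mirrors the synchronous response schema documented below.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import json

class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        # Assumed payload shape: the same metadata/results objects as the
        # synchronous response schema documented below.
        print("received:", body.get("metadata", {}).get("request_id"))
        self.send_response(200)
        self.end_headers()

HTTPServer(("0.0.0.0", 8080), CallbackHandler).serve_forever()
```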
keywords: string
Keywords to which the model should pay particular attention, boosting or suppressing them to help it understand context. Just like a human listener, Deepgram can better understand mumbled, distorted, or otherwise hard-to-decipher speech when it knows the context of the conversation.
Notes:
- Can include up to 200 keywords per request.
- Can send multiple instances in query string (for example, keywords=medicine&keywords=prescription).
- Can request multi-word keywords in a percent-encoded query string (for example, keywords=miracle%20medicine). When Deepgram listens for your supplied keywords, it separates them into individual words, then boosts or suppresses them individually.
- Can append a positive or negative intensifier to boost or suppress the recognition of particular words. Positive and negative values can be decimals.
- Follow best practices for keyword boosting.
- Support for out-of-vocabulary (OOV) keyword boosting when processing streaming audio is currently in beta; to fall back to previous keyword behavior, append the query parameter keyword_boost=legacy to your API request.
To learn more, see Features: Keywords.
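As a sketch, repeated query parameters are the natural fit for multiple keywords. The colon-delimited intensifier form shown here (keyword:intensifier) is an assumption; consult Features: Keywords for the canonical syntax.

```python
import requests

# A list of tuples produces repeated query parameters:
# keywords=medicine%3A2.0&keywords=prescription
params = [
    ("keywords", "medicine:2.0"),   # assumed intensifier syntax: boost
    ("keywords", "prescription"),   # no intensifier
]
resp = requests.post(
    "https://api.deepgram.com/v1/listen",
    params=params,
    headers={
        "Authorization": "Token YOUR_DEEPGRAM_API_KEY",
        "Content-Type": "application/json",
    },
    json={"url": "https://example.com/clinic-visit.mp3"},  # placeholder URL
)
```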
utterances: boolean
Indicates whether Deepgram will segment speech into meaningful semantic units, which allows the model to interact more naturally and effectively with speakers' spontaneous speech patterns. For example, when humans speak to each other conversationally, they often pause mid-sentence to reformulate their thoughts, or stop and restart a badly-worded sentence. When utterances is set to true, these utterances are identified and returned in the transcript results.
By default, when utterances is enabled, a new utterance starts after 0.8 seconds of silence. You can customize the length of time used to determine where to split utterances by submitting the utt_split parameter.
To learn more, see Features: Utterances.
utt_split: number
Length of time in seconds of silence between words that Deepgram will use when determining where to split utterances. Used when utterances is enabled.
Default: 0.8
To learn more, see Features: Utterance Split.
tag: string
Tag to associate with the request. Your request will automatically be associated with any tags you add to the API Key used to run the request. Tags associated with requests appear in usage reports.
To learn more, see Features: Tag.
Request Body Schema
Request body when submitting pre-recorded audio. Accepts either:
- Raw binary audio data. In this case, include a Content-Type header set to the audio MIME type.
- A JSON object with a single field from which the audio can be retrieved. In this case, include a Content-Type header set to application/json.
url: string
URL of the audio file to transcribe.
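Both request body forms are sketched below with placeholder file names and URLs; query parameters ride alongside either form.

```python
import requests

API_URL = "https://api.deepgram.com/v1/listen"
AUTH = {"Authorization": "Token YOUR_DEEPGRAM_API_KEY"}

# Option 1: JSON body pointing at a hosted file.
resp = requests.post(
    API_URL,
    params={"punctuate": "true", "utterances": "true"},
    headers={**AUTH, "Content-Type": "application/json"},
    json={"url": "https://example.com/interview.mp3"},  # placeholder URL
)

# Option 2: raw binary audio with the audio MIME type.
with open("interview.mp3", "rb") as audio:
    resp = requests.post(
        API_URL,
        params={"punctuate": "true"},
        headers={**AUTH, "Content-Type": "audio/mpeg"},
        data=audio,
    )

print(resp.json()["results"]["channels"][0]["alternatives"][0]["transcript"])
```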
Responses
Status | Description |
---|---|
200 Success | Audio submitted for transcription. |
Response Schema
metadata: object
JSON-formatted ListenMetadata object.
request_id: uuid
Unique identifier of the submitted audio and derived data returned.
transaction_key: string
Blob of text that helps Deepgram engineers debug any problems you encounter. If you need help getting an API call to work correctly, send this key to us so that we can use it as a starting point when investigating any issues.
sha256: string
SHA-256 hash of the submitted audio data.
created: string
ISO-8601 timestamp that indicates when the audio was submitted.
duration: number
Duration in seconds of the submitted audio.
channels: integer
Number of channels detected in the submitted audio.
results: object
JSON-formatted ListenResults object.
channels: array
Array of JSON-formatted ChannelResult objects.
search: array
Array of JSON-formatted SearchResults objects.
query: string
Term for which Deepgram is searching.
hits: array
Array of JSON-formatted Hit objects.
confidence: number
Value between 0 and 1 that indicates the model's relative confidence in this hit.
start: number
Offset in seconds from the start of the audio to where the hit occurs.
end: number
Offset in seconds from the start of the audio to where the hit ends.
snippet: string
Transcript that corresponds to the time between start and end.
alternatives: array
Array of JSON-formatted ResultAlternative objects. This array will have length n, where n matches the value of the alternatives parameter passed in the request.
transcript: string
Single-string transcript containing what the model hears in this channel of audio.
confidence: number
Value between 0 and 1 indicating the model's relative confidence in this transcript.
words: array
Array of JSON-formatted Word objects.
word: string
Distinct word heard by the model.
start: number
Offset in seconds from the start of the audio to where the spoken word starts.
end: number
Offset in seconds from the start of the audio to where the spoken word ends.
confidence: number
Value between 0 and 1 indicating the model's relative confidence in this word.
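As a sketch of walking this schema, the helper below prints the best alternative per channel and any search hits; the field names come straight from the schema above.

```python
def print_transcripts(listen_response: dict) -> None:
    meta = listen_response["metadata"]
    print(f"request {meta['request_id']}: {meta['duration']:.1f}s, "
          f"{meta['channels']} channel(s)")
    for index, channel in enumerate(listen_response["results"]["channels"]):
        best = channel["alternatives"][0]  # highest-confidence alternative
        print(f"channel {index} ({best['confidence']:.2f}): {best['transcript']}")
        # "search" is present only when search terms were requested.
        for result in channel.get("search", []):
            for hit in result["hits"]:
                print(f"  '{result['query']}' at {hit['start']:.2f}s: {hit['snippet']}")
```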
Transcribe Streaming Audio
Deepgram provides its customers with real-time, streaming transcription via its streaming endpoints. These endpoints are high-performance, full-duplex services running over the tried-and-true WebSocket protocol, which makes integration with customer pipelines simple due to the wide array of client libraries available.
To use this endpoint, connect to wss://api.deepgram.com/v1/listen. TLS encryption will protect your connection and data. We support a minimum of TLS 1.2.
All data is sent to the streaming endpoint as binary-type WebSocket messages containing payloads that are the raw audio data. Because the protocol is full-duplex, you can stream in real-time and still receive transcription responses while uploading data.
When you are finished, send an empty (length zero) binary message to the server. The server will interpret it as a shutdown command, which means it will finish processing whatever data it still has cached, send the response to the client, send a summary metadata object, and then terminate the WebSocket connection.
To learn more about working with real-time streaming data and results, see Get Started with Streaming Audio.
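A minimal streaming sketch using Python's websockets library follows. It assumes raw, headerless 16 kHz linear16 PCM, so the encoding and sample_rate parameters described below are set explicitly; the extra_headers argument name matches the classic websockets client API.

```python
import asyncio
import json
import websockets

async def transcribe(pcm_chunks):
    url = ("wss://api.deepgram.com/v1/listen"
           "?encoding=linear16&sample_rate=16000")
    headers = {"Authorization": "Token YOUR_DEEPGRAM_API_KEY"}
    async with websockets.connect(url, extra_headers=headers) as ws:

        async def sender():
            for chunk in pcm_chunks:   # binary-type messages of raw audio
                await ws.send(chunk)
            await ws.send(b"")         # empty binary message = shutdown command

        async def receiver():
            async for message in ws:   # JSON transcription responses
                print(json.loads(message))

        await asyncio.gather(sender(), receiver())
```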
Query Parameters
tier: string
Level of model you would like to use in your request. Options include:
- enhanced: Applies our newest, most powerful ASR models; they generally have higher accuracy and better word recognition than our Base models, and they handle uncommon words significantly better.
- base: (Default) Applies our Base models, which are built on our signature end-to-end deep learning speech model architecture and offer a solid combination of accuracy and cost effectiveness.
To learn more, see Features: Tier.
model: string
AI model used to process submitted audio. Options include:
- general: (Default) Optimized for everyday audio processing. TIERS: enhanced, base
- meeting: Optimized for conference room settings, which include multiple speakers with a single microphone. TIERS: enhanced beta, base
- phonecall: Optimized for low-bandwidth audio phone calls. TIERS: enhanced beta, base
- voicemail: Optimized for low-bandwidth audio clips with a single speaker. Derived from the phonecall model. TIERS: base
- finance: Optimized for multiple speakers with varying audio quality, such as might be found on a typical earnings call. Vocabulary is heavily finance oriented. TIERS: base
- conversationalai: Optimized to allow artificial intelligence technologies, such as chatbots, to interact with people in a human-like way. TIERS: base
- video: Optimized for audio sourced from videos. TIERS: base
- <custom_id>: To use a custom model associated with your account, include its custom_id. TIERS: enhanced, base (depending on which tier the custom model was trained on)
To learn more, see Features: Model.
version: string
Version of the model to use.
Default: latest
Possible values: latest OR <version_id>
To learn more, see Features: Version.
language: string
BCP-47 language tag that hints at the primary spoken language. Language support is optimized for the following language/model combinations:
Chinese
- zh-CN: China (Simplified Mandarin) beta. MODELS: general
- zh-TW: Taiwan (Traditional Mandarin) beta. MODELS: general
Dutch
- nl: beta. MODELS: general
English
- en: English (Default). MODELS: general (enhanced, base), meeting (enhanced beta, base), phonecall (enhanced beta, base), voicemail, finance, conversationalai, video
- en-AU: Australia. MODELS: general
- en-IN: India. MODELS: general
- en-NZ: New Zealand. MODELS: general
- en-GB: United Kingdom. MODELS: general
- en-US: United States. MODELS: general (enhanced, base), meeting (enhanced beta, base), phonecall (enhanced beta, base), voicemail, finance, conversationalai, video
Flemish
- nl: beta. MODELS: general
French
- fr: MODELS: general
- fr-CA: Canada. MODELS: general
German
- de: MODELS: general
Hindi
- hi: MODELS: general
- hi-Latn: Roman Script beta. MODELS: general
Indonesian
- id: beta. MODELS: general
Italian
- it: beta. MODELS: general
Japanese
- ja: beta. MODELS: general
Korean
- ko: beta. MODELS: general
Portuguese
- pt: MODELS: general
- pt-BR: Brazil. MODELS: general
- pt-PT: Portugal. MODELS: general
Russian
- ru: MODELS: general
Spanish
- es: MODELS: general (enhanced beta, base)
- es-419: Latin America. MODELS: general
Swedish
- sv: beta. MODELS: general
Turkish
- tr: MODELS: general
Ukrainian
- uk: beta. MODELS: general
To learn more, see Features: Language.
punctuate: boolean
Indicates whether to add punctuation and capitalization to the transcript. To learn more, see Features: Punctuation.
profanity_filter: boolean
Indicates whether to remove profanity from the transcript. To learn more, see Features: Profanity Filter.
redact: string
Indicates whether to redact sensitive information, replacing redacted content with asterisks (*). Options include:
- pci: Redacts sensitive credit card information, including credit card number, expiration date, and CVV.
- numbers (or true): Aggressively redacts strings of numerals.
- ssn: beta. Redacts social security numbers.
Can send multiple instances in query string (for example, redact=pci&redact=numbers). When sending multiple values, redaction occurs in the order you specify. In this example, sensitive credit card information would be redacted first, then strings of numbers.
To learn more, see Features: Redaction.
diarize: boolean
Indicates whether to recognize speaker changes. When set to true, each word in the transcript will be assigned a speaker number starting at 0. To learn more, see Features: Diarization.
ner: boolean
Indicates whether to recognize alphanumeric strings. When set to true, whitespace will be removed between characters identified as part of an alphanumeric string. To learn more, see Features: Named-Entity Recognition (NER).
multichannel: boolean
Indicates whether to transcribe each audio channel independently. When set to true, you will receive one transcript for each channel, which means you can apply a different model to each channel using the model parameter (e.g., set model to general:phonecall, which applies the general model to channel 0 and the phonecall model to channel 1).
To learn more, see Features: Multichannel.
alternatives: integer
Maximum number of transcript alternatives to return. Just like a human listener, Deepgram can provide multiple possible interpretations of what it hears.
Default: 1
numerals: boolean
Indicates whether to convert numbers from written format (e.g., one) to numerical format (e.g., 1). Deepgram can format numbers up to 999,999.
Converted numbers do not include punctuation. For example, 999,999 would be transcribed as 999999.
To learn more, see Features: Numerals.
search: string
Terms or phrases to search for in the submitted audio. Deepgram searches for acoustic patterns in audio rather than text patterns in transcripts because we have found acoustic pattern matching to be more performant.
- Can include up to 25 search terms per request.
- Can send multiple instances in query string (for example, search=speech&search=Friday).
To learn more, see Features: Search.
replace: string
Terms or phrases to search for in the submitted audio and replace.
- URL-encode any terms or phrases that include spaces, punctuation, or other special characters.
- Can send multiple instances in query string (for example, replace=this:that&replace=thisalso:thatalso).
- Replacing a term or phrase with nothing (replace=this) will remove the term or phrase from the audio transcript.
To learn more, see Features: Replace.
callback: string
Callback URL to provide if you would like your submitted audio to be processed asynchronously. When passed, Deepgram will immediately respond with a request_id. When it has finished analyzing the audio, it will send a POST request to the provided URL with an appropriate HTTP status code.
Notes:
- You may embed basic authentication credentials in the callback URL.
- Only ports 80, 443, 8080, and 8443 can be used for callbacks.
For streaming audio, callback can be used to redirect streaming responses to a different server:
- If the callback URL begins with http:// or https://, then POST requests are sent to the callback server for each streaming response.
- If the callback URL begins with ws:// or wss://, then a WebSocket connection is established with the callback server and WebSocket text messages are sent containing the streaming responses.
- If a WebSocket callback connection is disconnected at any point, the entire real-time transcription stream is killed; this maintains the strong guarantee of a one-to-one relationship between incoming real-time connections and outgoing WebSocket callback connections.
To learn more, see Features: Callback.
keywords: string
Keywords to which the model should pay particular attention, boosting or suppressing them to help it understand context. Just like a human listener, Deepgram can better understand mumbled, distorted, or otherwise hard-to-decipher speech when it knows the context of the conversation.
Notes:
- Can include up to 200 keywords per request.
- Can send multiple instances in query string (for example, keywords=medicine&keywords=prescription).
- Can request multi-word keywords in a percent-encoded query string (for example, keywords=miracle%20medicine). When Deepgram listens for your supplied keywords, it separates them into individual words, then boosts or suppresses them individually.
- Can append a positive or negative intensifier to boost or suppress the recognition of particular words. Positive and negative values can be decimals.
- Follow best practices for keyword boosting.
- Support for out-of-vocabulary (OOV) keyword boosting when processing streaming audio is currently in beta; to fall back to previous keyword behavior, append the query parameter keyword_boost=legacy to your API request.
To learn more, see Features: Keywords.
interim_results: boolean
Indicates whether the streaming endpoint should send you updates to its transcription as more audio becomes available. When set to true, the streaming endpoint returns regular updates, which means transcription results will likely change for a period of time. By default, this flag is set to false.
When the flag is set to false, latency increases (usually by several seconds) because the server needs to stabilize the transcription before returning the final results for each piece of incoming audio. If you want the lowest-latency streaming available, then set interim_results to true and handle the corrected transcripts as they are returned.
To learn more, see Features: Interim Results.
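A sketch of consuming interim results: act on partial transcripts immediately, but only commit text once is_final is true (see the response schema below).

```python
import json

committed = []  # definitive transcript segments

def handle_message(raw: str) -> None:
    msg = json.loads(raw)
    alt = msg["channel"]["alternatives"][0]
    if msg["is_final"]:
        committed.append(alt["transcript"])    # text for this range is final
    elif alt["transcript"]:
        print("interim:", alt["transcript"])   # may still be corrected
```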
endpointing: boolean
Indicates whether Deepgram will detect whether a speaker has finished speaking (or paused for a significant period of time, indicating the completion of an idea). When Deepgram detects an endpoint, it assumes that no additional data will improve its prediction, so it immediately finalizes the result for the processed time range and returns the transcript with a speech_final parameter set to true.
For example, if you are working with a 15-second audio clip, but someone is speaking for only the first 3 seconds, endpointing allows you to get a finalized result after the first 3 seconds.
By default, endpointing is enabled and finalizes a transcript after 10 ms of silence. You can customize the length of time used to detect whether a speaker has finished speaking by submitting the vad_turnoff parameter.
Default: true
To learn more, see Features: Endpointing.
vad_turnoff: integer
Length of time in milliseconds of silence that voice activity detection (VAD) will use to detect that a speaker has finished speaking. Used when endpointing is enabled. Deepgram customers may configure a value between 10 ms and 5000 ms; on-premises customers may remove this restriction.
Default: 10
To learn more, see Features: Voice Activity Detection (VAD).
encoding: string
Expected encoding of the submitted streaming audio. Options include:
- linear16: 16-bit, little-endian, signed PCM WAV data
- flac: FLAC-encoded data
- mulaw: mu-law encoded WAV data
- amr-nb: adaptive multi-rate narrowband codec (sample rate must be 8000)
- amr-wb: adaptive multi-rate wideband codec (sample rate must be 16000)
- opus: Ogg Opus
- speex: Ogg Speex
Only required when raw, headerless audio packets are sent to the streaming service. For pre-recorded audio or audio submitted to the standard /listen endpoint, we support over 40 popular codecs and do not require this parameter.
To learn more, see Features: Encoding.
channels: integer
Number of independent audio channels contained in submitted streaming audio. Only read when a value is provided for encoding.
Default: 1
To learn more, see Features: Channels.
sample_rate: integer
Sample rate of submitted streaming audio. Required (and only read) when a value is provided for encoding.
To learn more, see Features: Sample Rate.
tag: string
Tag to associate with the request. Your request will automatically be associated with any tags you add to the API Key used to run the request. Tags associated with requests appear in usage reports.
To learn more, see Features: Tag.
Responses
Status | Description |
---|---|
201 Success | Audio submitted for transcription. |
Response Schema
channel_index: array
Information about the active channel in the form [channel_index, total_number_of_channels].
duration: number
Duration in seconds.
start: number
Offset in seconds.
is_final: boolean
Indicates that Deepgram has identified a point at which its transcript has reached maximum accuracy and is sending a definitive transcript of all audio up to that point. To learn more, see Features: Interim Results.
speech_final: boolean
Indicates that Deepgram has detected an endpoint and immediately finalized its results for the processed time range. To learn more, see Features: Endpointing.
channel: object
alternatives: array
Array of JSON-formatted ResultAlternative objects. This array will have length n, where n matches the value of the alternatives parameter passed in the request.
transcript: string
Single-string transcript containing what the model hears in this channel of audio.
confidence: number
Value between 0 and 1 indicating the model's relative confidence in this transcript.
words: array
Array of JSON-formatted Word objects.
word: string
Distinct word heard by the model.
start: number
Offset in seconds from the start of the audio to where the spoken word starts.
end: number
Offset in seconds from the start of the audio to where the spoken word ends.
confidence: number
Value between 0 and 1 indicating the model's relative confidence in this word.
metadata: object
request_id: uuid
Unique identifier of the submitted audio and derived data returned.
Error Handling
If Deepgram encounters an error during real-time streaming, we will return a WebSocket Close frame (WebSocket Protocol specification, section 5.5.1).
The body of the Close frame will indicate the reason for closing using one of the specification’s pre-defined status codes followed by a UTF-8-encoded payload that represents the reason for the error. Current codes and payloads in use include:
Code | Payload | Description |
1002 | DATA-0000 | The payload cannot be decoded as audio. It is either not audio data or is a codec unsupported by Deepgram. |
1011 | NET-0000 | The service has not transmitted a Text frame to the client within the timeout window. This may indicate an issue internally in Deepgram's systems or could be due to Deepgram not receiving enough audio data to transcribe a frame. |
1011 | NET-0001 | The service has not received a Binary frame from the client within the timeout window. This may indicate an internal issue in Deepgram's systems, the client's systems, or the network connecting them. |
After sending a Close message, the endpoint considers the WebSocket connection closed and will close the underlying TCP connection.
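With the websockets library used in the earlier sketch, the Close frame's status code and payload surface on the ConnectionClosedError exception, as in this sketch:

```python
import websockets

async def drain(ws) -> None:
    try:
        async for message in ws:
            pass  # handle transcription responses here
    except websockets.exceptions.ConnectionClosedError as err:
        # err.code is the status code (e.g., 1002 or 1011); err.reason carries
        # the payload (e.g., "DATA-0000" or "NET-0001").
        print(f"stream closed: code={err.code} reason={err.reason}")
```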
Projects
A Project organizes all of your Deepgram resources, including your datasets and models, into logical groups. Projects also give you access to API keys and billing settings for Deepgram services.
Get Projects
Retrieves basic information about all projects to which the authenticated account has access.
Required account/project scope(s): project:read.
Responses
Status | Description |
---|---|
200 Success | Projects found. |
Response Schema
projects: array
Array of project objects.
project_id: uuid
Unique identifier of the project.
name: string
Name of the project.
company: string
Name of the company associated with the project. Optional.
Get Project
Retrieves basic information about the specified project.
Path Parameters
Unique identifier of the project for which you want to retrieve information.
Responses
Status | Description |
---|---|
200 Success | Project found. |
404 Not Found | A project with the specified ID was not found. |
Response Schema
project_id: uuid
Unique identifier of the project.
name: string
Name of the project.
company: string
Name of the company associated with the project. Optional.
Update Project
Updates the specified project.
Required account scope(s): project:write. Required project scope(s): project:write:settings.
Path Parameters
Unique identifier of the project that you want to update.
Request Body Schema
Name of the project. Optional.
Name of the company associated with the project. Optional.
Responses
Status | Description |
---|---|
200 Success | Project updated. |
Response Schema
message: string
Success message.
Delete Project
Deletes the specified project.
Required account scope(s): project:write. Required project scope(s): project:write:destroy.
Path Parameters
Unique identifier of the project that you want to delete.
Responses
Status | Description |
---|---|
200 Success | Project deleted. |
Keys
Keys are associated with Deepgram Projects. They enable you to use the Deepgram API, identify the Project calling the API, and associate usage information with Projects. Keys are assigned Scopes, which determine which actions they can be used to perform in the associated Project. For each Project, you can create multiple Keys and assign different Scopes for each Key.
Get Keys
Retrieves keys for the specified project. If the authenticated account has access to the members:read, admins:read, and owners:read project scopes, it will list all keys for the project. Otherwise, it will only list keys that belong to the authenticated account.
Required account/project scope(s): keys:read. Optional project scope(s): members:read, admins:read, owners:read.
Path Parameters
Unique identifier of the project for which you want to get keys.
Responses
Status | Description |
---|---|
200 Success | Keys found. |
Response Schema
api_keys: array
Array of associated Member and API Key objects.
member: object
Member object.
member_id: uuid
Unique identifier of member.
email: string
Email address of member.
first_name: string
First name of member. If no first name exists, this item will not be returned.
last_name: string
Last name of member. If no last name exists, this item will not be returned.
api_key: object
API Key object.
api_key_id: uuid
Unique identifier of API Key.
comment: string
Comments associated with API Key.
scopes: array
Array of scopes associated with API Key.
tags: array
Array of tags associated with API Key. If no tags exist, this item will not be returned.
created: string
Date and time when API Key was created.
expiration_date: string
Date and time when API Key will expire. If no expiration date exists, this item will not be returned.
Get Key
Retrieves basic information about the specified key. If the authenticated account has access to the members:read, admins:read, and owners:read project scopes, it will retrieve the information for any key in the project. Otherwise, it will retrieve the information only for keys that belong to the authenticated account.
Required account/project scope(s): keys:read. Optional project scope(s): members:read, admins:read, owners:read.
Path Parameters
Unique identifier of the project for which you want to retrieve information.
Unique identifier of the key that you want to retrieve.
Responses
Status | Description |
---|---|
200 Success | Key found. |
404 Not Found | A key with the specified key ID in the specified project was not found. |
Response Schema
member: object
Member object.
member_id: uuid
Unique identifier of member.
email: string
Email address of member.
first_name: string
First name of member. If no first name exists, this item will not be returned.
last_name: string
Last name of member. If no last name exists, this item will not be returned.
api_key: object
API Key object.
api_key_id: uuid
Unique identifier of API Key.
comment: string
Comments associated with API Key.
scopes: array
Array of scopes associated with API Key.
tags: array
Array of tags associated with API Key. If no tags exist, this item will not be returned.
created: string
Date and time when API Key was created.
expiration_date: string
Date and time when API Key will expire. If no expiration date exists, this item will not be returned.
Create Key
Creates a new key in the specified project. You must create your first API Key using the Deepgram Console.
Required account/project scope(s): keys:write.
Path Parameters
Unique identifier of the project for which you want to create a key. Required.
Request Body Schema
comment: string
Comments associated with the key you would like to create. Must be between 1 and 128 characters long, not including whitespace. Required.
tags: array
Tags associated with the key you would like to create. Optional.
scopes: array
Scopes for the key you would like to create. Required. Scopes cannot be empty, and the requested account and project scopes for the new key cannot exceed the scopes of the authenticated account making the request.
expiration_date: string
Date on which the key you would like to create should expire. Optional. If no time zone is specified, defaults to Coordinated Universal Time (UTC). For each key, you may specify either an expiration_date or a time_to_live_in_seconds, but not both.
time_to_live_in_seconds: integer
Length of time (in seconds) during which the key you would like to create will remain valid. Optional. For each key, you may specify either an expiration_date or a time_to_live_in_seconds, but not both.
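A sketch of a create-key request follows. The route /v1/projects/{project_id}/keys is an assumption inferred from the resource layout (this reference documents the parameters, not exact paths), and the scope value is illustrative.

```python
import requests

project_id = "YOUR_PROJECT_ID"  # placeholder
body = {
    "comment": "CI transcription key",
    "tags": ["ci"],                    # optional
    "scopes": ["usage:write"],         # illustrative scope
    "time_to_live_in_seconds": 86400,  # mutually exclusive with expiration_date
}
resp = requests.post(
    f"https://api.deepgram.com/v1/projects/{project_id}/keys",  # assumed route
    headers={"Authorization": "Token YOUR_DEEPGRAM_API_KEY"},
    json=body,
)
# The key value is returned only once; store it immediately.
print(resp.json()["key"])
```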
Responses
Status | Description |
---|---|
201 Success | Key created. |
Response Schema
api_key_id: uuid
Unique identifier of the API Key.
key: string
Value of the API Key. This is the only chance to read the Key value; it cannot be recovered later.
comment: string
Comments associated with the API Key.
scopes: array
Project scopes associated with the API Key.
tags: array
Tags associated with the API Key. If no tags exist, this item will not be returned.
created: string
Date and time when the API Key was created.
expiration_date: string
Date and time when the API Key expires. If no expiration date exists, this item will not be returned.
Delete Key
Deletes the specified key in the specified project. If the authenticated account has access to the members:write, admins:write, and owners:write project scopes, it can delete any key in the project. Otherwise, it can delete only keys that belong to the authenticated account.
Required account/project scope(s): keys:write. Optional project scope(s): members:write, admins:write, owners:write.
Path Parameters
Unique identifier of the project that contains the key that you want to delete.
Unique identifier of the key that you want to delete.
Responses
Status | Description |
---|---|
200 Success | Key deleted. |
Response Schema
message: string
Message returned. Should read "Successfully deleted the API key!"
Members
Members are users who have been given access to a specified Deepgram Project. Members are assigned Scopes, which determine what they can do in their assigned Project. Members can be assigned to multiple Projects and have different Scopes for each Project.
Get Members
Retrieves account objects for all of the accounts in the specified project.
Required account scope(s): project:read. Required project scope(s): project:read, members:read, admins:read, owners:read.
Path Parameters
Unique identifier of the project for which you want to get members.
Responses
Status | Description |
---|---|
200 Success | Members found. |
Response Schema
members: array
Array of Members.
member_id: uuid
Unique identifier of member.
first_name: string
First name of member. Optional.
last_name: string
Last name of member. Optional.
scopes: array
Project scopes associated with member.
email: string
Email address of member.
Remove Member
Removes the specified account from the specified project. API keys created by this member for the specified project will also be deleted.
If the account being removed has scope member, then the requesting account must have scope members:write:kick. If the account being removed has scope admin, then the requesting account must have scope admins:write:kick. If the account being removed has scope owner, then the requesting account must have scope owners:write:kick. The account being removed must not be the sole account with the scope owner.
Required account scope(s): project:write. Required project scope(s): members:write:kick, admins:write:kick, or owners:write:kick.
Path Parameters
Unique identifier of the project that contains the account you want to remove.
Unique identifier of the account that you want to remove.
Responses
Status | Description |
---|---|
200 Success | Account removed. |
Scopes
Scopes are permissions required to perform actions in a specified Deepgram Project. Scopes can be associated with Keys or Users. When associated with a Key, Scopes determine the actions that the Key can be used to perform in the Key's associated Project. When associated with a Member, Scopes determine what the user can do in their assigned Project. For more information, see Working with Roles.
Get Member Scopes
Lists the specified project scopes assigned to the specified member. If the authenticated account has access to the members:read:scopes, admins:read:scopes, and owners:read:scopes project scopes, it will retrieve project scopes for any member of the specified project. Otherwise, it will retrieve project scopes for only the authenticated account.
Required account scope(s): account:read, project:read. Required project scope(s): project:read. Optional project scope(s): members:read:scopes, admins:read:scopes, owners:read:scopes.
Path Parameters
Unique identifier of the project that contains the member for whom you want to get scopes.
Unique identifier of the member for whom you want to get scopes.
Responses
Status | Description |
---|---|
200 Success | Scopes found. |
Response Schema
scopes: array
Array of scopes associated with the member.
Update Scope
Updates the specified project scopes assigned to the specified member.
If the specified member has the scope member or the scope being added is member, the requesting account must have the scope members:write:scopes. If the specified member has the scope admin or the scope being added is admin, the requesting account must have the scope admins:write:scopes. If the specified member has the scope owner or the scope being added is owner, the requesting account must have the scope owners:write:scopes.
If the scope being added is member, admin, or owner, it will replace the existing member, admin, or owner scope of the specified member, unless the specified member is the only member with the owner scope. In this case, the request will fail.
If the scope being added is not member, admin, or owner, then the requesting account must also have the scope that it is trying to add to the specified member. For example, if the requesting account tries to add the project:write:settings project scope to a specified member, but the requesting account itself does not have the scope project:write:settings, then the request will fail.
Required account scope(s): project:write. Optional project scope(s): See the description.
Path Parameters
Unique identifier of the project that contains the specified member and scope that you want to update.
Unique identifier of the member for whom you want to update the scope.
Request Body Schema
scope: string
Scope for the specified member and project.
Responses
Status | Description |
---|---|
200 Success | Scope updated. |
Response Schema
message: string
Success message.
Invitations
To add a Member to a Deepgram Project, you can use the Deepgram Console or the Deepgram API to send an Invitation. When sending an Invitation, you provide an email address for the Member, and Deepgram generates an invitation link and emails it to them.
Leave Project
Removes the authenticated account from the specified project. Will also delete API Keys created by the account for the specified project. If the authenticated account is the only account on the project with the scope owner, the call will fail.
Required account scope(s): project:write.
Path Parameters
Unique identifier of the project from which you want to remove the authenticated account.
Responses
Status | Description |
---|---|
200 Success | Account removed from project. |
Response Schema
message: string
Success message.
Usage
Deepgram tracks requests made to Deepgram services and associates usage information with Projects.
Get All Requests
Generates a list of requests sent to the Deepgram API for the specified project over a given time range. Uses pagination to limit the number of results returned.
Path Parameters
Unique identifier of the project for which you want to retrieve requests.
Query Parameters
start: string
Start date of the requested date range. Format is YYYY-MM-DD. Defaults to the time of your first request. If no time zone is specified, defaults to Coordinated Universal Time (UTC).
end: string
End date of the requested date range. Format is YYYY-MM-DD. Defaults to the current time. If no time zone is specified, defaults to Coordinated Universal Time (UTC).
limit: integer
Number of results to return per page.
Default: 10
status: string
Status of requests to return. Enables you to filter requests depending on whether they have succeeded or failed. If not specified, returns requests with all statuses.
Default: null
Possible values: null, succeeded OR failed
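A sketch of filtering and paging through requests follows; the route /v1/projects/{project_id}/requests is an assumption inferred from the resource layout.

```python
import requests

project_id = "YOUR_PROJECT_ID"  # placeholder
params = {
    "start": "2023-01-01",
    "end": "2023-01-31",
    "limit": 50,
    "status": "failed",
}
resp = requests.get(
    f"https://api.deepgram.com/v1/projects/{project_id}/requests",  # assumed route
    headers={"Authorization": "Token YOUR_DEEPGRAM_API_KEY"},
    params=params,
)
for req in resp.json()["requests"]:
    print(req["request_id"], req["created"], req["path"])
```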
Responses
Status | Description |
---|---|
200 Success | Requests found. Requests will be returned in descending order based on creation time (newest first). |
Response Schema
page: integer
Page number that should be returned. Used for pagination.
limit: integer
Number of results to return per page. Used for pagination.
requests: array
request_id: uuid
Unique identifier of request.
created: string
Date/time when request was created.
path: string
Path of endpoint to which request was submitted.
api_key_id: uuid
Unique identifier of the API Key with which the request was submitted.
response: object
Response generated by the request. If a response has not yet been generated, this object will be empty.
details: object
If the request failed, this object will be replaced by an optional message field that may contain an error message.
usd: number
Cost of the request in USD, if project is non-contract and the requesting account has appropriate permissions.
duration: number
Length of time (in hours) of audio processed in the request.
total_audio: number
Number of audio files processed in the request.
channels: integer
Number of channels in the audio associated with the request.
streams: integer
Number of audio streams associated with the request.
models: array
Array of models applied when running the request.
method: string
Processing method used when running the request.
tags: array
Array of tags applied when running the request.
features: array
Array of features used when running the request.
config: object
Configuration used when running the request.
alternatives: integer
Requested maximum number of transcript alternatives to return. If no alternative value was submitted, this item will not be returned.
diarize: boolean
Indicates whether diarization was requested. If not requested, this item will not be returned.
keywords: array
Array of keywords associated with the request. If no keywords were submitted, this item will not be returned.
language: string
Language associated with the request. If no languages were submitted, this item will not be returned.
model: string
Model associated with the request. If no models were submitted, this item will not be returned.
multichannel: boolean
Indicates whether multichannel processing was requested. If not requested, this item will not be returned.
ner: boolean
Indicates whether named-entity recognition (NER) was requested. If not requested, this item will not be returned.
numerals: boolean
Indicates whether numeral conversion was requested. If not requested, this item will not be returned.
profanity_filter: boolean
Indicates whether filtering profanity was requested. If not requested, this item will not be returned.
punctuate: boolean
Indicates whether punctuation was requested. If not requested, this item will not be returned.
redact: array
Indicates whether redaction was requested. If not requested, this item will not be returned.
search: array
Array of search terms associated with the request. If no search terms were submitted, this item will not be returned.
utterances: boolean
Indicates whether utterance segmentation was requested. If not requested, this item will not be returned.
code: integer
completed: string
callback: object
Only exists if a callback was included in the request. If a callback was included in the request, but the request has not completed yet, then this object will exist, but it will be empty. The attempts item will exist only while the request has not completed. The completed item will exist only after the request completes.
attempts: integer
code: integer
completed: string
Get Request
Retrieves the details of the specified request sent to the Deepgram API for the specified project.
Path Parameters
Unique identifier of the project for which you want to retrieve the specified request.
Unique identifier of the request that you want to retrieve.
Responses
Status | Description |
---|---|
200 Success | Request found. |
404 Not Found | A request with the specified request ID in the specified project was not found. |
Response Schema
request_id: uuid
Unique identifier of request.
created: string
Date and time when request was created.
path: string
Path of endpoint to which request was submitted.
api_key_id: uuid
Unique identifier of the API Key with which the request was submitted.
response: object
Response generated by the request. If a response has not yet been generated, this object will be empty.
details: object
If the request failed, this object will be replaced by an optional message field that may contain an error message.
usd: number
Cost of the request in USD, if project is non-contract and the requesting account has appropriate permissions.
duration: number
Length of time (in hours) of audio processed in the request.
total_audio: number
Number of audio files processed in the request.
channels: integer
Number of channels in the audio associated with the request.
streams: integer
Number of audio streams associated with the request.
models: array
Array of models applied when running the request.
method: string
Processing method used when running the request.
tags: array
Array of tags applied when running the request.
features: array
Array of features used when running the request.
config: object
Configuration used when running the request.
alternatives: integer
Requested maximum number of transcript alternatives to return. If no alternative value was submitted, this item will not be returned.
diarize: boolean
Indicates whether diarization was requested. If not requested, this item will not be returned.
keywords: array
Array of keywords associated with the request. If no keywords were submitted, this item will not be returned.
language: string
Language associated with the request. If no languages were submitted, this item will not be returned.
model: string
Model associated with the request. If no models were submitted, this item will not be returned.
multichannel: boolean
Indicates whether multichannel processing was requested. If not requested, this item will not be returned.
ner: boolean
Indicates whether named-entity recognition (NER) was requested. If not requested, this item will not be returned.
numerals: boolean
Indicates whether numeral conversion was requested. If not requested, this item will not be returned.
profanity_filter: boolean
Indicates whether filtering profanity was requested. If not requested, this item will not be returned.
punctuate: boolean
Indicates whether punctuation was requested. If not requested, this item will not be returned.
redact: array
Indicates whether redaction was requested. If not requested, this item will not be returned.
search: array
Array of search terms associated with the request. If no search terms were submitted, this item will not be returned.
utterances: boolean
Indicates whether utterance segmentation was requested. If not requested, this item will not be returned.
code: integer
completed: string
callback: object
Only exists if a callback was included in the request. If a callback was included in the request, but the request has not been attempted yet, then this object will exist, but it will be empty. The attempts item will exist only while the request has not completed. The completed item will exist only after the request completes.
attempts: integer
code: integer
completed: string
Summarize Usage
Retrieves a summary of usage statistics. You can specify a date range.
Path Parameters
Unique identifier of the project for which you want to summarize usage.
Query Parameters
start: string
Start date of the requested date range. Format is YYYY-MM-DD. If a full timestamp is given, it will be truncated to a day. Dates are UTC. Defaults to the date of your first request.
end: string
End date of the requested date range. Format is YYYY-MM-DD. If a full timestamp is given, it will be truncated to a day. Dates are UTC. Defaults to the current date.
accessor: uuid
Limits results to requests made using the API key corresponding to the given accessor. If not specified, returns requests made using all API keys. To include multiple API Keys, send multiple instances in query string (e.g., accessor=2ed1307e-14f2-48a2-b996-97ea009cfa4e&accessor=1dc0296d-03e1-37z1-1885-86dz998bez3d).
tag: string
Limits results to requests associated with the specified tag. If not specified, returns requests with all tags. To include multiple tags, send multiple instances in query string (e.g., tag=dev&tag=production).
method: string
Limits results to requests processed using the specified method. If not specified, returns requests with all processing methods. To include multiple methods, send multiple instances in query string (e.g., method=sync&method=streaming).
Possible values: sync, async OR streaming
model: uuid
Limits results to requests run with the specified model UUID applied. If not specified, returns requests with all models. To include multiple models, send multiple instances in query string (e.g., model=4899aa60-f723-4517-9815-2042acc12a82&model=125125fb-e391-458e-a227-a60d6426f5d6).
Limits results to requests that include the multichannel feature.
Limits results to requests that include the interim_results feature.
Limits results to requests that include the punctuate feature.
Limits results to requests that include the ner feature.
Limits results to requests that include the utterances feature.
Limits results to requests that include the replace feature.
Limits results to requests that include the profanity_filter feature.
Limits results to requests that include the keywords feature.
Limits results to requests that include the diarize feature.
Limits results to requests that include the search feature.
Limits results to requests that include the redact feature.
Limits results to requests that include the alternatives feature.
Limits results to requests that include the numerals feature.
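A usage-summary sketch follows; the route /v1/projects/{project_id}/usage is an assumption inferred from the resource layout, and the repeated tuples produce the multiple-instance query strings described above.

```python
import requests

project_id = "YOUR_PROJECT_ID"  # placeholder
params = [
    ("start", "2023-01-01"),
    ("end", "2023-02-01"),
    ("tag", "production"),
    ("method", "streaming"),
]
resp = requests.get(
    f"https://api.deepgram.com/v1/projects/{project_id}/usage",  # assumed route
    headers={"Authorization": "Token YOUR_DEEPGRAM_API_KEY"},
    params=params,
)
for day in resp.json()["results"]:
    print(day["start"], f"{day['total_hours']:.2f} h", day["requests"])
```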
Responses
Status | Description |
---|---|
200 Success | Usage statistics found. |
404 Not Found | Usage statistics for the requested date range in the specified project were not found. |
Response Schema
start: string
Start date submitted for included requests.
end: string
End date submitted for included requests.
resolution: object
units: string
Units of resolution amount. For example, days.
amount: integer
Number of resolution units.
results: array
Array of Result objects, one for each day within the date range that has associated requests.
start: string
Start date of included requests.
end: string
End date of included requests.
hours: number
Length of time (in hours) of audio submitted in included requests.
total_hours: number
Length of time (in hours) of audio processed in included requests. For example, if submitted audio is multichannel, total_hours processed will be greater than hours submitted because multiple channels are processed separately.
requests: integer
Number of included requests.
Get Fields
Lists the features, models, tags, languages, and processing methods used for requests in the specified project. You can specify a time range.
Path Parameters
Unique identifier of the project for which you want to retrieve fields.
Query Parameters
start: string
Start date of the requested date range. Format is YYYY-MM-DD. If a full timestamp is given, it will be truncated to a day. Dates are UTC. Defaults to the date of your first request.
end: string
End date of the requested date range. Format is YYYY-MM-DD. If a full timestamp is given, it will be truncated to a day. Dates are UTC. Defaults to the current date.
Responses
Status | Description |
---|---|
200 Success | Request fields found. |
Response Schema
tags: array
Array of included tags.
models: array
Array of included models.
processing_methods: array
Array of processing methods.
Possible values: sync, async OR streaming
languages: array
Array of included languages.
features: array
Array of included features.
Possible values: multichannel, interim_results, punctuate, ner, utterances, replace, profanity_filter, keywords, diarize, search, redact, alternatives OR numerals
Billing
Deepgram tracks individual transactions and transaction summaries for Deepgram services.
Get All Balances
Generates a list of outstanding balances for the specified project. To see balances, the authenticated account must be a project owner or administrator.
Path Parameters
Unique identifier of the project for which you want to retrieve outstanding balances.
Responses
Status | Description |
---|---|
200 Success | Balances found. |
Response Schema
balances: array
Array of balance objects.
balance_id: uuid
Unique identifier of the balance.
amount: number
Amount of the balance.
units: string
Units of the balance. May use usd or hour, depending on the project billing settings.
purchase: string
Unique identifier of the purchase order associated with the balance.
Get Balance
Retrieves details about the specified balance. To see balances, the authenticated account must be a project owner or administrator.
Path Parameters
Unique identifier of the project for which you want to retrieve the specified balance.
Unique identifier of the balance that you want to retrieve.
Responses
Status | Description |
---|---|
200 Success | Balance found. |
404 Not Found | A balance with the specified balance ID in the specified project was not found. |
Response Schema
balance_id: uuid
Unique identifier of the balance.
amount: number
Amount of the balance.
units: string
Units of the balance. May use usd or hour, depending on the project billing settings.
purchase: string
Unique identifier of the purchase order associated with the balance.