Bit Rate | Deepgram's Docs

bit_rate string

Text to Speech Request, Text to Speech Stream, English Only

The Bit Rate feature gives users the ability to specify the desired bitrate of the resulting text-to-speech audio output.

When generating audio from text, the bitrate directly affects the quality and file size of the output: a higher bitrate generally preserves more detail but produces larger files. Bitrate is the amount of data processed per unit of time and is typically measured in bits per second (bps).
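As a rough illustration of the file-size side of this trade-off, the payload size of a constant-bitrate stream can be estimated as bitrate × duration ÷ 8. The sketch below assumes constant-bitrate encoding and ignores container overhead, so real files will differ somewhat; the function name is illustrative and not part of any Deepgram SDK.

```python
def estimated_size_bytes(bit_rate_bps: int, duration_seconds: float) -> float:
    """Approximate payload size of constant-bitrate audio, ignoring container overhead."""
    # bits per second * seconds = bits; divide by 8 to convert bits to bytes
    return bit_rate_bps * duration_seconds / 8


# Ten seconds of audio at the two bitrates mentioned on this page
for rate in (32000, 48000):
    print(rate, estimated_size_bytes(rate, 10))
```

Under these assumptions, ten seconds at 32000 bps comes to about 40 KB, and the same duration at 48000 bps to about 60 KB.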

The bitrate value must be one of the values listed in the Audio Format Combinations table. Choose a value based on the encoding type and your use case. When applicable, 48000 is used as the default.

Enable Feature

To enable the Bit Rate feature, include the bit_rate parameter in the query string with the desired bitrate value.

Example:

https://api.deepgram.com/v1/speak?encoding=mp3&bit_rate=32000
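If you assemble the URL in code rather than by hand, a query-string encoder avoids escaping mistakes. The following is a minimal sketch using Python's standard library; `build_speak_url` is a hypothetical helper name, not a Deepgram SDK function.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.deepgram.com/v1/speak"  # endpoint from the example above


def build_speak_url(encoding: str, bit_rate: int) -> str:
    """Return the speak URL with encoding and bit_rate as query parameters."""
    query = urlencode({"encoding": encoding, "bit_rate": bit_rate})
    return f"{BASE_URL}?{query}"


print(build_speak_url("mp3", 32000))
```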

cURL Example

You can use the following cURL command in a terminal or your favorite API client to synthesize text into speech with a specific bitrate.

MP3 encoding with a bitrate of 32 kbit/s:

cURL

curl --request POST \
     --url "https://api.deepgram.com/v1/speak?model=aura-2-thalia-en&encoding=mp3&bit_rate=32000" \
     --header "Authorization: Token YOUR_DEEPGRAM_API_KEY" \
     --header 'Content-Type: application/json' \
     --data '{"text": "Hello, how can I help you today?"}' \
     --output outputfile_mp3_32000.mp3 \
     --write-out "Time-to-First-Byte: %{time_starttransfer}s Time-to-Last-Byte: %{time_total}s\n" \
     --fail-with-body \
     --silent || echo "Request failed"

Replace YOUR_DEEPGRAM_API_KEY with your Deepgram API Key.
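The same request can be made from a script. The sketch below mirrors the cURL command above using only Python's standard library; the function names and the output path argument are illustrative assumptions, not part of a Deepgram SDK. Building the request is kept separate from sending it so the URL, headers, and body can be inspected first.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_speak_request(api_key: str, text: str, bit_rate: int = 32000) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the cURL example."""
    query = urlencode({"model": "aura-2-thalia-en", "encoding": "mp3", "bit_rate": bit_rate})
    return urllib.request.Request(
        url=f"https://api.deepgram.com/v1/speak?{query}",
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(api_key: str, text: str, path: str) -> None:
    """Send the request and write the returned audio bytes to path."""
    request = build_speak_request(api_key, text)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

To run it, call `synthesize_to_file` with your API key, the text to synthesize, and an output path such as `outputfile_mp3_32000.mp3`.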

Query Parameters

Parameter	Value	Type	Description
`bit_rate`	See list of supported audio format combinations in the Audio Format Combinations table.	`string`	The desired bitrate for the output audio.

Analyze Response

Upon successful processing of the request, you will receive an audio file containing the synthesized text-to-speech output, along with response headers providing additional information.

The audio file is streamed back to you, so you may begin playback as soon as the first byte arrives. Read the guide Streaming Audio Outputs to learn how to begin playing the stream immediately versus waiting for the entire file to arrive.
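To make the idea of consuming the stream incrementally concrete, here is a minimal sketch that copies a byte stream chunk by chunk instead of waiting for the whole body. It works on any file-like object with a `read(n)` method, which includes the response object returned by Python's `urllib.request.urlopen`; the function name and chunk size are illustrative assumptions, and the in-memory stream stands in for a real response.

```python
import io
from typing import BinaryIO


def save_stream(source: BinaryIO, destination: BinaryIO, chunk_size: int = 4096) -> int:
    """Copy a byte stream to destination chunk by chunk; return total bytes written."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read signals the end of the stream
            break
        destination.write(chunk)  # a player could consume the chunk here instead
        total += len(chunk)
    return total


# Stand-in for a streamed HTTP response body (placeholder bytes, not real audio)
fake_response = io.BytesIO(b"\x00" * 10000)
output = io.BytesIO()
print(save_stream(fake_response, output))
```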

Response Headers Example

http

HTTP/1.1 200 OK
content-type: audio/mpeg
dg-model-name: aura-2-thalia-en
dg-model-uuid: e4979ab0-8475-4901-9d66-0a562a4949bb
dg-char-count: 32
dg-request-id: bf6fc5c7-8f84-479f-b70a-602cf5bf18f3
transfer-encoding: chunked
date: Thu, 29 Feb 2024 19:20:48 GMT

To see these response headers when making a cURL request, add -v or --verbose to your request.

The response headers include:

content-type: Specifies the media type of the resource, in this case, audio/mpeg, indicating the format of the audio file returned.
dg-request-id: A unique identifier for the request, useful for debugging and tracking purposes.
dg-model-uuid: The unique identifier of the model that processed the request.
dg-char-count: Indicates the number of characters that were in the input text for the text-to-speech process.
dg-model-name: The name of the model used to process the request.
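transfer-encoding: Specifies the form of encoding used to transfer the payload to the recipient. A value of chunked means the body is sent as a series of chunks, which is what allows the audio to be streamed.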
date: The date and time the response was sent.
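If you want to log or check these headers in code, a small helper can pull out the fields listed above. This is a sketch that assumes the headers are available as a plain name-to-value mapping; `summarize_headers` and the keys of its result are hypothetical names chosen for illustration.

```python
from typing import Mapping


def summarize_headers(headers: Mapping[str, str]) -> dict:
    """Pick out the headers described above, matching names case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "content_type": lowered.get("content-type"),
        "model_name": lowered.get("dg-model-name"),
        "model_uuid": lowered.get("dg-model-uuid"),
        "request_id": lowered.get("dg-request-id"),
        # dg-char-count arrives as text; convert it when present
        "char_count": int(lowered["dg-char-count"]) if "dg-char-count" in lowered else None,
    }


example = {
    "content-type": "audio/mpeg",
    "dg-model-name": "aura-2-thalia-en",
    "dg-char-count": "32",
}
summary = summarize_headers(example)
# The text sent in the cURL example has the same number of characters
print(summary["char_count"] == len("Hello, how can I help you today?"))
```

Keeping the dg-request-id value alongside your own logs makes it easier to reference a specific request when debugging.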