Sample Rate

Sample Rate specifies the sample rate for the output audio.

sample_rate string English

The Sample Rate feature lets you specify the sample rate of the text-to-speech audio output. Sample rate is the number of audio samples carried per second, measured in hertz (Hz).

Choosing an appropriate sample rate matters because it directly affects both the audio quality and the size of the output. Higher sample rates generally preserve more audio detail but produce larger files, while lower sample rates produce smaller files but can reduce fidelity. For uncompressed encodings such as linear16, output size grows in direct proportion to the sample rate.
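To make the size trade-off concrete, here is a rough sketch of the arithmetic for uncompressed 16-bit PCM (linear16). The function name `linear16_size_bytes`, the mono default, and the sample rates in the loop are illustrative assumptions; container headers are ignored, and the rates shown are not a statement of what the API supports (check the Audio Format Combinations table for that).

```python
def linear16_size_bytes(sample_rate_hz: int, duration_s: float, channels: int = 1) -> int:
    """Approximate raw payload size for 16-bit PCM audio (hypothetical helper)."""
    bytes_per_sample = 2  # linear16 stores each sample as a 16-bit integer
    return int(sample_rate_hz * bytes_per_sample * channels * duration_s)


# Illustrative rates only; mono audio is an assumption.
for rate in (8000, 16000, 24000, 48000):
    size_kb = linear16_size_bytes(rate, duration_s=10) / 1000
    print(f"{rate} Hz -> {size_kb:.0f} kB for 10 s of mono audio")
```

Under these assumptions, doubling the sample rate doubles the payload: 10 seconds at 24000 Hz is 480 kB, and the same clip at 48000 Hz is 960 kB.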

🚧

The sample_rate value must adhere to the Audio Format Combinations table. Choose a value based on the encoding type and your use case.

Enable Feature

To enable the Sample Rate feature, include the sample_rate parameter in the query string with the desired sample rate value.

Example:

https://api.deepgram.com/v1/speak?encoding=linear16&sample_rate=24000

📘

If you do not specify a model, the default model aura-asteria-en will be used.
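If you build the request URL in code, a small helper can keep the query string well-formed. The sketch below uses only the Python standard library; `build_speak_url` and `SPEAK_ENDPOINT` are hypothetical names chosen for illustration and are not part of any Deepgram SDK.

```python
from typing import Optional
from urllib.parse import urlencode

SPEAK_ENDPOINT = "https://api.deepgram.com/v1/speak"


def build_speak_url(encoding: str, sample_rate: int, model: Optional[str] = None) -> str:
    """Assemble the speak URL with encoding and sample_rate query parameters."""
    params = {}
    if model is not None:
        # Omitting the model leaves the choice to the API's default.
        params["model"] = model
    params["encoding"] = encoding
    params["sample_rate"] = str(sample_rate)
    return f"{SPEAK_ENDPOINT}?{urlencode(params)}"


print(build_speak_url("linear16", 24000))
# -> https://api.deepgram.com/v1/speak?encoding=linear16&sample_rate=24000
```

The helper does not validate the encoding and sample rate pair; that combination still has to match the Audio Format Combinations table.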

cURL Example

You can use the following cURL command in a terminal or your favorite API client to synthesize text into speech with a specific sample rate.

Sample rate of 24 kHz:

# Replace DEEPGRAM_API_KEY with your API key.
# On an HTTP error, the JSON error body is printed with jq and the output file is removed.
curl --request POST \
     --url "https://api.deepgram.com/v1/speak?model=aura-asteria-en&encoding=linear16&sample_rate=24000" \
     --header "Authorization: Token DEEPGRAM_API_KEY" \
     --header "Content-Type: application/json" \
     --output sample_rate_24000.wav \
     --data '{"text": "Hello, how can I help you today?"}' \
     --fail-with-body \
     --silent \
     || (jq . sample_rate_24000.wav && rm sample_rate_24000.wav)
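For readers who prefer a script to a terminal command, the same request could be sketched in Python with the standard library alone. The function names `build_speak_request` and `synthesize_to_file` are hypothetical; this is not the Deepgram SDK, and the error handling is deliberately minimal.

```python
import json
import urllib.request


def build_speak_request(api_key: str, text: str, sample_rate: int = 24000) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the cURL example."""
    url = (
        "https://api.deepgram.com/v1/speak"
        f"?model=aura-asteria-en&encoding=linear16&sample_rate={sample_rate}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(api_key: str, text: str, path: str, sample_rate: int = 24000) -> None:
    """Send the request and write the returned audio bytes to `path`."""
    request = build_speak_request(api_key, text, sample_rate)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses,
    # so no file is written when the API rejects the request.
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(path, "wb") as out:
        out.write(audio)
```

A call such as `synthesize_to_file(api_key, "Hello, how can I help you today?", "sample_rate_24000.wav")` would then play the role of the cURL command, with `api_key` holding your own key.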

Query Parameters

| Parameter | Value | Type | Description |
| --- | --- | --- | --- |
| sample_rate | See the list of supported audio format combinations in the Audio Format Combinations table. | string | The desired sample rate for the output audio. |

Analyze Response

Upon successful processing of the request, you will receive an audio file containing the synthesized text-to-speech output, along with response headers providing additional information.

📘

The audio file is streamed back to you, so you may begin playback as soon as the first byte arrives. Read the guide Streaming Audio Outputs to learn how to begin playing the stream immediately versus waiting for the entire file to arrive.
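As a rough illustration of consuming a streamed body incrementally, the sketch below reads a binary stream in fixed-size chunks and hands each chunk on as soon as it is read. `iter_audio_chunks` is a hypothetical helper, and an in-memory `io.BytesIO` stands in for a live HTTP response; a real player would be fed where the byte counter is updated.

```python
import io
from typing import BinaryIO, Iterator


def iter_audio_chunks(stream: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield chunks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in for a streamed response body (10,000 bytes of placeholder data).
fake_response = io.BytesIO(b"\x00\x01" * 5000)

received = 0
for chunk in iter_audio_chunks(fake_response):
    received += len(chunk)  # a real application would pass the chunk to an audio player here

print(received)  # -> 10000
```

The chunk size of 4096 bytes is an arbitrary choice for the sketch; the Streaming Audio Outputs guide remains the reference for actual playback.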

Response Headers Example

< HTTP/1.1 200 OK
< content-type: audio/mpeg
< dg-model-name: aura-asteria-en
< dg-model-uuid: e4979ab0-8475-4901-9d66-0a562a4949bb
< dg-char-count: 32
< dg-request-id: bf6fc5c7-8f84-479f-b70a-602cf5bf18f3
< transfer-encoding: chunked
< date: Thu, 29 Feb 2024 19:20:48 GMT

📘

To see these response headers when making a cURL request, add -v or --verbose to your request.

The response headers include:

  • content-type: Specifies the media type of the resource, in this case, audio/mpeg, indicating the format of the audio file returned.
  • dg-request-id: A unique identifier for the request, useful for debugging and tracking purposes.
  • dg-model-uuid: The unique identifier of the model that processed the request.
  • dg-char-count: Indicates the number of characters that were in the input text for the text-to-speech process.
  • dg-model-name: The name of the model used to process the request.
  • transfer-encoding: Specifies the form of encoding used to safely transfer the payload to the recipient.
  • date: The date and time the response was sent.
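If you want to log or inspect these headers in code, one possible approach is to collect the fields of interest into a small dictionary. `summarize_speak_headers` is a hypothetical helper, and the sample values are copied from the response headers example on this page.

```python
from typing import Mapping, Optional


def summarize_speak_headers(headers: Mapping[str, str]) -> dict:
    """Pick out the headers described above; header names are matched case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    char_count: Optional[int] = None
    if "dg-char-count" in lowered:
        char_count = int(lowered["dg-char-count"])
    return {
        "content_type": lowered.get("content-type"),
        "request_id": lowered.get("dg-request-id"),
        "model_name": lowered.get("dg-model-name"),
        "model_uuid": lowered.get("dg-model-uuid"),
        "char_count": char_count,
    }


example_headers = {
    "content-type": "audio/mpeg",
    "dg-model-name": "aura-asteria-en",
    "dg-model-uuid": "e4979ab0-8475-4901-9d66-0a562a4949bb",
    "dg-char-count": "32",
    "dg-request-id": "bf6fc5c7-8f84-479f-b70a-602cf5bf18f3",
}

summary = summarize_speak_headers(example_headers)
print(summary["char_count"])  # -> 32
```

The value 32 matches the length of the example input text, "Hello, how can I help you today?", which is 32 characters long. Keeping the dg-request-id alongside your own logs makes it easier to reference a specific request when debugging.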

API Error Responses

📘

For information on Deepgram's error messages and error codes, read the API Reference Errors page.

