Continuous Text Stream | Deepgram's Docs

Convert text into natural-sounding speech using Deepgram’s TTS WebSocket

Handshake

WSS

/v1/speak

Headers

Authorization (string, required)

Authenticate with your API key. Alternatively, generate a temporary token and pass it via the token query parameter.

Example: token %DEEPGRAM_API_KEY% or bearer %DEEPGRAM_TOKEN%
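
As a minimal sketch of the two header forms in the example above, the helper below builds the Authorization header for the handshake. The function name `build_auth_header` and its `scheme` argument are hypothetical, introduced only for illustration. It assumes the scheme keyword and the credential are separated by a single space, as the example suggests.

```python
def build_auth_header(credential: str, scheme: str = "token") -> dict:
    """Return an Authorization header for the WebSocket handshake.

    scheme is "token" for an API key or "bearer" for a temporary token,
    mirroring the two example forms in the reference text.
    (Hypothetical helper, not part of any Deepgram SDK.)
    """
    if scheme not in ("token", "bearer"):
        raise ValueError(f"unsupported scheme: {scheme!r}")
    if not credential:
        raise ValueError("credential must not be empty")
    return {"Authorization": f"{scheme} {credential}"}


# API key form
api_key_header = build_auth_header("YOUR_API_KEY")
# Temporary token form
token_header = build_auth_header("YOUR_TEMP_TOKEN", scheme="bearer")
```

Rejecting unknown schemes early gives a clearer error than a failed handshake would.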

Query parameters

encoding (enum, optional, defaults to linear16)

Specifies the encoding of the audio output for streaming TTS. Only streaming-compatible encodings are supported.

Allowed values:

mip_opt_out (any, optional)

Opts requests out of the Deepgram Model Improvement Program. Refer to the docs for pricing impacts before setting this to true: https://dpgr.am/deepgram-mip

model (enum, optional, defaults to aura-asteria-en)

AI model used to process submitted text

sample_rate (enum, optional, defaults to 24000)

Specifies the sample rate of the output audio. Depending on the encoding, the default is 8000 or 24000. For some encodings, the sample rate is not configurable.

Allowed values:
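
To illustrate how the query parameters above combine into a connection URL, here is a small sketch using only the Python standard library. The helper name `build_speak_url` is hypothetical. The defaults mirror the ones listed above. Serializing `mip_opt_out` as the lowercase string "true" or "false" is an assumption, since the reference only says the value may be set to true.

```python
from urllib.parse import urlencode

BASE_URL = "wss://api.deepgram.com/v1/speak"


def build_speak_url(model="aura-asteria-en", encoding="linear16",
                    sample_rate=24000, mip_opt_out=None):
    """Build the WebSocket URL with the documented query parameters.

    Hypothetical helper; defaults are taken from the parameter list.
    """
    params = {
        "model": model,
        "encoding": encoding,
        "sample_rate": sample_rate,
    }
    # Only send mip_opt_out when the caller sets it explicitly.
    # Assumption: the flag is serialized as "true" / "false".
    if mip_opt_out is not None:
        params["mip_opt_out"] = "true" if mip_opt_out else "false"
    return f"{BASE_URL}?{urlencode(params)}"


url = build_speak_url()
```

Because the reference notes that the sample rate is not configurable for some encodings, a caller may prefer to drop `sample_rate` from `params` in those cases rather than send the default.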

Send

SpeakV1Text (object, required)

Text to convert to audio

SpeakV1Flush (object, required)

Flush the buffer and receive the final audio for text sent so far

SpeakV1Clear (object, required)

Clear the buffer and start a new audio generation. This is a potentially destructive operation for any text still in the buffer.

SpeakV1Close (object, required)

Flush the buffer and close the connection gracefully after all audio is generated
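
The four client messages above could be serialized as shown in the sketch below. The reference lists the message objects but not their fields, so the payload shape is an assumption: each message is taken to be a JSON text frame with a `type` field of "Speak", "Flush", "Clear", or "Close", and the text message is taken to carry a `text` field. The helper names are hypothetical.

```python
import json

# Assumption: control messages carry only a "type" field.
CONTROL_TYPES = ("Flush", "Clear", "Close")


def speak_message(text: str) -> str:
    """Serialize text to be converted to audio (assumed payload shape)."""
    return json.dumps({"type": "Speak", "text": text})


def control_message(kind: str) -> str:
    """Serialize a Flush, Clear, or Close message (assumed payload shape)."""
    if kind not in CONTROL_TYPES:
        raise ValueError(f"unknown control message: {kind!r}")
    return json.dumps({"type": kind})


# A typical sequence: send text, flush to get its audio, then close.
outgoing = [
    speak_message("Hello, world."),
    control_message("Flush"),
    control_message("Close"),
]
```

Each string in `outgoing` would be sent as one text frame over the open WebSocket connection, in order.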

Receive

SpeakV1Audio (string, required, format: "binary")

Receive audio chunks as they are generated

SpeakV1Metadata (object, required)

Receive metadata about the audio generation

SpeakV1Flushed (object, required)

Receive confirmation that the buffer was flushed

SpeakV1Cleared (object, required)

Receive confirmation that the buffer was cleared

SpeakV1Warning (object, required)

Receive a warning about the audio generation
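
Since the server sends both binary audio and structured messages, a client needs to tell the two apart. The sketch below shows one way to do that. It assumes that audio arrives as binary frames and that every other message arrives as a JSON text frame whose `type` field is "Metadata", "Flushed", "Cleared", or "Warning". Those type strings are inferred from the message names above and are not confirmed by this reference. `classify_frame` is a hypothetical helper.

```python
import json

# Assumption: "type" values match the message names listed above.
KNOWN_EVENTS = {"Metadata", "Flushed", "Cleared", "Warning"}


def classify_frame(frame):
    """Return (kind, payload) for one received WebSocket frame.

    Binary frames are treated as audio chunks; text frames are parsed
    as JSON and labeled by their "type" field.
    """
    if isinstance(frame, (bytes, bytearray)):
        return ("Audio", bytes(frame))
    payload = json.loads(frame)
    kind = payload.get("type")
    if kind not in KNOWN_EVENTS:
        # Surface unexpected messages instead of silently dropping them.
        return ("Unknown", payload)
    return (kind, payload)


audio = bytearray()
for frame in (b"\x00\x01", '{"type": "Flushed"}'):
    kind, payload = classify_frame(frame)
    if kind == "Audio":
        audio.extend(payload)  # accumulate chunks as they arrive
```

Appending chunks as they arrive lets playback or file writing start before generation finishes. A "Flushed" message would then mark the end of audio for the text sent so far.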

URL	wss://api.deepgram.com/v1/speak
Method	GET
Status	101 Switching Protocols