Live Audio
Transcribe audio using Deepgram's STT WebSocket
Handshake
Headers
API key for authentication. Format should be either "token <DEEPGRAM_API_KEY>" or "Bearer <JWT_TOKEN>"
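As a minimal sketch of the two accepted formats, the helper below builds the header value for either credential type. The function name `build_auth_header` is hypothetical, and the header name `Authorization` is an assumption, since this page does not spell it out.

```python
def build_auth_header(credential: str, is_jwt: bool = False) -> dict[str, str]:
    """Return a header dict in one of the two documented formats.

    Assumption: the header is named "Authorization" (not stated on this page).
    """
    if not credential:
        raise ValueError("credential must be a non-empty string")
    # API keys use the "token" scheme; JWTs use the "Bearer" scheme.
    scheme = "Bearer" if is_jwt else "token"
    return {"Authorization": f"{scheme} {credential}"}


api_key_header = build_auth_header("MY_API_KEY")
jwt_header = build_auth_header("MY_JWT", is_jwt=True)
```

The resulting dict can be passed to whichever WebSocket client is used for the handshake.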
Query parameters
Recognize speaker changes. Each word in the transcript will be assigned a speaker number starting at 0. Defaults to false
Indicates how long Deepgram will wait to detect whether a speaker has finished speaking or pauses for a significant period of time. When set to a value, the streaming endpoint immediately finalizes the transcription for the processed time range and returns the transcript with a speech_final parameter set to true. Can also be set to false to disable endpointing
Arbitrary key-value pairs that are attached to the API response for usage in downstream processing
Key term prompting can boost specialized terminology and brands. Only compatible with Nova-3
Profanity Filter looks for recognized profanity and converts it to the nearest recognized non-profane word or removes it from the transcript completely
Sample rate of submitted audio. Required (and only read) when a value is provided for encoding
Indicates how long Deepgram will wait to send an UtteranceEnd message after a word has been transcribed. Use with interim_results
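The query parameters above can be assembled into a connection URL roughly as follows. This is a sketch under stated assumptions: the endpoint `wss://api.deepgram.com/v1/listen` and the parameter names `diarize`, `endpointing`, and `keyterm` do not appear on this page and are assumed here, while `encoding`, `sample_rate`, `interim_results`, and `utterance_end_ms` are taken from the descriptions. The helper `build_listen_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint; not stated on this page.
BASE_URL = "wss://api.deepgram.com/v1/listen"


def build_listen_url(params: dict) -> str:
    """Encode query parameters for the streaming handshake URL."""
    # Per the description above, sample_rate is required when encoding is set.
    if "encoding" in params and "sample_rate" not in params:
        raise ValueError("sample_rate is required when encoding is provided")

    pairs = []
    for key, value in params.items():
        # Repeatable parameters (for example several key terms) are passed as lists.
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            if isinstance(item, bool):
                item = "true" if item else "false"  # lowercase booleans
            pairs.append((key, str(item)))
    return f"{BASE_URL}?{urlencode(pairs)}"


url = build_listen_url({
    "encoding": "linear16",        # example value, an assumption
    "sample_rate": 16000,
    "diarize": True,               # assumed name for speaker-change detection
    "interim_results": True,
    "utterance_end_ms": 1000,
    "endpointing": 300,            # assumed name; pass False to disable
    "keyterm": ["Deepgram", "Nova-3"],  # assumed name for key term prompting
})
```

Passing a list repeats the parameter in the query string, which is a common way to send several key terms or key-value pairs in one request.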
Send
Receive
The server will process all remaining audio data and return the final results. You may receive a response with the from_finalize attribute set to true, indicating that the finalization process is complete. This response typically occurs when there is a noticeable amount of audio buffered in the server.
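A small sketch of the finalize exchange described above: one helper builds the control message a client would send, and another checks whether a received response carries `from_finalize` set to true. The message shape `{"type": "Finalize"}` is an assumption not confirmed by this page, and both function names are hypothetical.

```python
import json


def finalize_message() -> str:
    """Build the control message asking the server to flush buffered audio.

    Assumption: the message is the JSON object {"type": "Finalize"}.
    """
    return json.dumps({"type": "Finalize"})


def is_finalize_response(raw: str) -> bool:
    """Return True if a received JSON message has from_finalize set to true."""
    message = json.loads(raw)
    return isinstance(message, dict) and message.get("from_finalize") is True


outgoing = finalize_message()
```

Because the page notes that a `from_finalize` response is not guaranteed, a client should not block indefinitely waiting for one.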
Provides real-time metadata during audio streaming, including audio characteristics and processing details. This response is sent periodically during streaming to provide updates about the audio being processed.