Live Audio
Transcribe audio using Deepgram’s STT WebSocket
Handshake
Headers
API key for authentication. The format should be either ‘token <DEEPGRAM_API_KEY>’ or ‘Bearer <JWT_TOKEN>’
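The two credential formats above can be sketched as a small helper. This is a minimal illustration, not Deepgram's own client code: the header name `Authorization` and the helper name `build_auth_header` are assumptions, since the source gives only the value format.

```python
def build_auth_header(credential: str, is_jwt: bool = False) -> dict[str, str]:
    """Build the handshake header for an API key or a JWT.

    Assumption: the header is named "Authorization"; the source only
    documents the value formats 'token <key>' and 'Bearer <jwt>'.
    """
    if not credential or credential != credential.strip():
        raise ValueError("credential must be non-empty with no surrounding whitespace")
    # API keys use the 'token' scheme, JWTs use the 'Bearer' scheme.
    scheme = "Bearer" if is_jwt else "token"
    return {"Authorization": f"{scheme} {credential}"}


# Hypothetical placeholder values, for illustration only.
api_key_header = build_auth_header("DEEPGRAM_API_KEY")
jwt_header = build_auth_header("JWT_TOKEN", is_jwt=True)
```

The resulting dictionary would typically be passed as extra headers to whichever WebSocket client opens the connection.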
Query parameters
Recognize speaker changes. Each word in the transcript will be assigned a speaker number starting at 0. Defaults to false
Indicates how long Deepgram will wait to detect whether a speaker has finished speaking or has paused for a significant period of time. When set to a value, the streaming endpoint immediately finalizes the transcription for the processed time range and returns the transcript with the speech_final parameter set to true. Can also be set to false to disable endpointing
Arbitrary key-value pairs that are attached to the API response for use in downstream processing
Key term prompting can boost or suppress specialized terminology and brands. Only compatible with Nova-3
The BCP-47 language tag that hints at the primary spoken language. Only certain languages are available, depending on the model you choose
Opts requests out of the Deepgram Model Improvement Program. Refer to our Docs for pricing impacts before setting this to true: https://dpgr.am/deepgram-mip
Profanity Filter looks for recognized profanity and converts it to the nearest recognized non-profane word or removes it from the transcript completely
Sample rate of submitted audio. Required (and only read) when a value is provided for encoding
Indicates how long Deepgram will wait to send an UtteranceEnd message after a word has been transcribed. Use with interim_results
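The query parameters above are appended to the WebSocket URL before the handshake. The sketch below shows one way to assemble that URL. Several things in it are assumptions rather than facts from this page: the endpoint `wss://api.deepgram.com/v1/listen`, the parameter names `sample_rate`, `utterance_end_ms`, and `endpointing` (the page lists descriptions only), and the example value `linear16`. The names `encoding` and `interim_results` do appear in the descriptions above.

```python
from urllib.parse import urlencode

# Assumed endpoint; this page does not state the WebSocket URL.
BASE_URL = "wss://api.deepgram.com/v1/listen"


def _encode(value) -> str:
    """Render booleans as lowercase 'true'/'false'; everything else via str()."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def build_listen_url(base_url: str = BASE_URL, **params) -> str:
    """Build a streaming URL from keyword arguments, skipping None values."""
    # Per the description above, the sample rate is required whenever
    # an encoding is supplied.
    if params.get("encoding") is not None and params.get("sample_rate") is None:
        raise ValueError("sample_rate is required when encoding is provided")
    pairs = []
    for name, value in params.items():
        if value is None:
            continue
        # A list or tuple becomes a repeated key (name=a&name=b).
        values = value if isinstance(value, (list, tuple)) else [value]
        pairs.extend((name, _encode(v)) for v in values)
    return f"{base_url}?{urlencode(pairs)}" if pairs else base_url


# Illustrative values only.
url = build_listen_url(
    encoding="linear16",
    sample_rate=16000,
    interim_results=True,
    utterance_end_ms=1000,
)
print(url)
```

Keeping the encoding and sample-rate check in the URL builder surfaces a misconfiguration before any audio is sent, rather than after the connection is refused.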
Send
Receive
The server will process all remaining audio data and return the final results. You may receive a response with the from_finalize attribute set to true, indicating that the finalization process is complete. This response typically occurs when there is a noticeable amount of audio buffered in the server.
Provides real-time metadata during audio streaming, including audio characteristics and processing details. This response is sent periodically during streaming to provide updates about the audio being processed.
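The finalization and metadata responses described above can be told apart on the client by inspecting each JSON message. The following is a rough sketch under stated assumptions: the `type` field and its values `"Finalize"` and `"Metadata"` are not given on this page and are assumed here, and both function names are hypothetical. Only `from_finalize` and `speech_final` come from the descriptions above.

```python
import json


def finalize_message() -> str:
    """Serialize a request asking the server to flush buffered audio.

    Assumption: the control message is a JSON object with type "Finalize".
    """
    return json.dumps({"type": "Finalize"})


def classify_response(raw: str) -> str:
    """Label one server message based on the fields it carries."""
    msg = json.loads(raw)
    if msg.get("from_finalize") is True:
        # Finalization has completed for the buffered audio.
        return "finalized"
    if msg.get("speech_final") is True:
        # Endpointing detected that the speaker stopped or paused.
        return "speech_final"
    if msg.get("type") == "Metadata":  # assumed type value
        return "metadata"
    return "other"


# Hypothetical message bodies, for illustration only.
print(classify_response('{"type": "Results", "from_finalize": true}'))
print(classify_response('{"type": "Metadata"}'))
```

Because the page notes that a `from_finalize` response typically appears only when a noticeable amount of audio is buffered, a client should probably not block indefinitely waiting for one.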