Deepgram’s Utterance Split feature monitors incoming audio and detects when a sufficiently long pause occurs between words. By default, Deepgram uses a pause length of 0.8 seconds for Utterance Split, but you can configure this value using the utt_split parameter.
Utterance Split is used when the Utterances feature is enabled for pre-recorded audio.
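To make the idea of pause-based splitting concrete, here is a minimal sketch in Python. It is not Deepgram's implementation. It assumes each word is a dictionary with hypothetical "word", "start", and "end" keys (times in seconds), and it starts a new utterance whenever the silence between two consecutive words exceeds the threshold. The function name split_utterances is made up for illustration.

```python
from typing import Dict, List


def split_utterances(words: List[Dict], utt_split: float = 0.8) -> List[List[Dict]]:
    """Group time-stamped words into utterances based on pause length.

    Illustrative only: a new utterance begins when the gap between the end
    of one word and the start of the next is longer than `utt_split` seconds.
    The "start"/"end" keys are assumptions made for this sketch.
    """
    utterances: List[List[Dict]] = []
    current: List[Dict] = []
    for word in words:
        # Compare this word's start time with the previous word's end time.
        if current and word["start"] - current[-1]["end"] > utt_split:
            utterances.append(current)
            current = []
        current.append(word)
    if current:
        utterances.append(current)
    return utterances


sample_words = [
    {"word": "hello", "start": 0.0, "end": 0.4},
    {"word": "there", "start": 0.5, "end": 0.9},
    {"word": "next", "start": 2.0, "end": 2.3},  # about 1.1 s of silence before this word
]

for utterance in split_utterances(sample_words):
    print(" ".join(w["word"] for w in utterance))
```

With the default threshold of 0.8 seconds, the sample splits into two utterances. Raising the threshold (for example to 1.5) keeps all three words together, which is the kind of adjustment that suits speakers who pause longer than average.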
Some examples of use cases for Utterance Split include:
- Audio with elderly speakers who pause longer while speaking than the average speaker.
- Audio with differently-abled speakers who pause longer while speaking than the average speaker.
- Audio with speakers who speak with shorter pauses than the average speaker.
To enable Utterance Split, add a utt_split parameter to the query string when you call Deepgram’s API, and set it to the length of silence between words (in seconds) after which Deepgram should begin a new utterance. The default value is 0.8 seconds.
To transcribe audio from a file on your computer, run the following cURL command in a terminal or your favorite API client:
curl \
  --request POST \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'Content-Type: audio/wav' \
  --data-binary @youraudio.wav \
  --url 'https://api.deepgram.com/v1/listen?utt_split=20&utterances=true'
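If you would rather send the request from a script, the following is a rough Python sketch that uses only the standard library. It mirrors the endpoint, headers, and query parameters of the cURL command above. The helper names (build_listen_url, build_request, transcribe_file) are hypothetical and are not part of any Deepgram SDK, and the utt_split value of 1.2 in the usage note is just an example.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the cURL command above.
API_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(utt_split: float, base_url: str = API_URL) -> str:
    """Return the request URL with Utterances enabled and utt_split set."""
    query = urllib.parse.urlencode({"utterances": "true", "utt_split": utt_split})
    return f"{base_url}?{query}"


def build_request(audio: bytes, api_key: str, utt_split: float = 0.8) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying WAV audio bytes."""
    return urllib.request.Request(
        build_listen_url(utt_split),
        data=audio,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "audio/wav",
        },
        method="POST",
    )


def transcribe_file(path: str, api_key: str, utt_split: float = 0.8) -> dict:
    """Read a local WAV file, send it, and return the parsed JSON response."""
    with open(path, "rb") as audio_file:
        request = build_request(audio_file.read(), api_key, utt_split)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as transcribe_file("youraudio.wav", "YOUR_DEEPGRAM_API_KEY", utt_split=1.2) would then send the file with a 1.2-second pause threshold. Separating URL and request construction from the network call makes it easy to inspect what will be sent before sending it.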
To learn about the results, see Utterances.