Endpointing



Deepgram’s Endpointing feature monitors incoming streaming audio and detects when a user has finished speaking or paused for a significant amount of time, indicating the completion of an idea. When Deepgram detects an endpoint, it assumes that no additional data will improve its prediction, so it immediately finalizes its results for the processed time range and returns the transcript with a speech_final parameter set to true.

Endpointing relies on the Voice Activity Detection (VAD) feature, which monitors the incoming audio and triggers when it detects a sufficiently long pause. To change how long a pause must last before Deepgram treats the speaker as finished, submit the vad_turnoff parameter when Endpointing is enabled. By default, Deepgram uses 10 milliseconds.

Endpointing can be used with Deepgram's Interim Results feature. To compare and contrast these features, and to explore best practices for using them together, see Using Endpointing and Interim Results with Live Streaming Audio.

Use Cases

Some examples of use cases for Endpointing include:

  • Determining whether a speaker has finished speaking.

Enable Feature

To enable Endpointing, add endpointing=true to the query string when you call Deepgram’s API.
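As a minimal sketch of building such a query string, the helper below appends the endpointing parameter and, optionally, vad_turnoff to a streaming URL. The base URL and the helper name build_streaming_url are assumptions made for illustration, as is the 500 millisecond value; check the Deepgram API Reference for the exact endpoint and accepted values.

```python
from urllib.parse import urlencode

# Assumed base URL for the live-streaming endpoint; confirm it in the API Reference.
BASE_URL = "wss://api.deepgram.com/v1/listen"


def build_streaming_url(endpointing=True, vad_turnoff_ms=None):
    """Return a streaming URL with Endpointing parameters in the query string.

    build_streaming_url is a hypothetical helper, not part of any Deepgram SDK.
    """
    params = {"endpointing": "true" if endpointing else "false"}
    if vad_turnoff_ms is not None:
        # vad_turnoff sets the pause length (in milliseconds) used to detect an endpoint.
        params["vad_turnoff"] = str(vad_turnoff_ms)
    return f"{BASE_URL}?{urlencode(params)}"


print(build_streaming_url())
print(build_streaming_url(vad_turnoff_ms=500))
```

The first call relies on the default pause length; the second asks for a longer pause, which may suit speakers who hesitate mid-sentence.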


For an example of audio streaming, see Getting Started with Streaming Audio.


When Endpointing is enabled, each streaming response that carries a transcript also includes a key called speech_final.

        "transcript":"another big",

When speech_final is set to true, Deepgram has detected an endpoint and immediately finalized its results for the processed time range.
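One way to act on speech_final is to buffer transcripts until an endpoint arrives and then emit the buffered text as one utterance. The sketch below assumes that each response is a JSON message with the transcript at channel.alternatives[0].transcript and speech_final at the top level, and that Interim Results is off so no time range is transcribed twice. The function name collect_utterances and the sample messages are made up for illustration.

```python
import json


def collect_utterances(raw_messages):
    """Group transcripts into utterances, closing one whenever speech_final is true.

    Assumes the response layout described above; adjust the lookups if your
    responses are shaped differently.
    """
    utterances = []
    pending = []
    for raw in raw_messages:
        message = json.loads(raw)
        alternatives = message.get("channel", {}).get("alternatives", [])
        transcript = alternatives[0].get("transcript", "") if alternatives else ""
        if transcript:
            pending.append(transcript)
        # An endpoint was detected: everything buffered so far forms one utterance.
        if message.get("speech_final") and pending:
            utterances.append(" ".join(pending))
            pending = []
    return utterances


# Hypothetical sample messages, not real API output.
sample = [
    '{"channel": {"alternatives": [{"transcript": "another big"}]}, "speech_final": false}',
    '{"channel": {"alternatives": [{"transcript": "problem"}]}, "speech_final": true}',
]
print(collect_utterances(sample))
```

Text that arrives after the last endpoint stays in the buffer and is not returned, so a real client would likely flush it when the stream closes.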

By default, Deepgram applies its general AI model, a general-purpose model suited to everyday situations. To learn more about the customization possible with Deepgram's API, check out the Deepgram API Reference.