Endpointing
Deepgram’s Endpointing feature monitors incoming streaming audio and detects when a user has finished speaking or paused for a significant amount of time, indicating the completion of an idea. When Deepgram detects an endpoint, it assumes that no additional data will improve its prediction, so it immediately finalizes its results for the processed time range and returns the transcript with a speech_final parameter set to true.
Endpointing relies on the Voice Activity Detection (VAD) feature, which monitors the incoming audio and triggers when a sufficiently long pause is detected. You can customize the length of time used to detect whether a speaker has finished speaking by submitting the vad_turnoff parameter when Endpointing is enabled. By default, Deepgram uses 10 milliseconds.
Endpointing can be used with Deepgram's Interim Results feature. To compare and contrast these features, and to explore best practices for using them together, see Using Endpointing and Interim Results with Live Streaming Audio.
Use Cases
Some examples of use cases for Endpointing include:
- Determining whether a speaker has finished speaking.
Enable Feature
To enable endpointing, when you call Deepgram’s API, add an endpointing parameter set to true in the query string:
endpointing=true
For an example of audio streaming, see Getting Started with Streaming Audio.
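As a rough illustration of the query string described above, the following Python sketch builds a streaming URL with endpointing enabled and, optionally, a custom vad_turnoff value. The base URL and the helper name build_streaming_url are assumptions made for this example, as is the choice to reject vad_turnoff when endpointing is off; confirm the endpoint against the Deepgram API Reference before relying on it.

```python
from urllib.parse import urlencode

# Assumed base URL for the live streaming endpoint; verify it in the API reference.
BASE_URL = "wss://api.deepgram.com/v1/listen"


def build_streaming_url(endpointing=True, vad_turnoff_ms=None):
    """Return a streaming URL with endpointing-related query parameters.

    build_streaming_url is a hypothetical helper, not part of any Deepgram SDK.
    """
    if vad_turnoff_ms is not None and not endpointing:
        # The docs describe vad_turnoff as applying when Endpointing is enabled.
        raise ValueError("vad_turnoff requires endpointing to be enabled")

    params = {"endpointing": "true" if endpointing else "false"}
    if vad_turnoff_ms is not None:
        # Pause length, in milliseconds, used to decide the speaker has finished.
        params["vad_turnoff"] = str(vad_turnoff_ms)
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_streaming_url(vad_turnoff_ms=500))
```

Omitting vad_turnoff_ms leaves the pause length at Deepgram's default.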
Results
When enabled, the transcript for each received streaming response shows a key called speech_final.
{
  "channel_index": [
    0,
    1
  ],
  "duration": 1.039875,
  "start": 0.0,
  "is_final": false,
  "speech_final": false,
  "channel": {
    "alternatives": [
      {
        "transcript": "another big",
        "confidence": 0.9600255,
        "words": [
          {
            "word": "another",
            "start": 0.2971154,
            "end": 0.7971154,
            "confidence": 0.9588303
          },
          {
            "word": "big",
            "start": 0.85173076,
            "end": 1.039875,
            "confidence": 0.9600255
          }
        ]
      }
    ]
  }
}
...
When speech_final is set to true, Deepgram has detected an endpoint and immediately finalized its results for the processed time range.
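One way a client might act on speech_final is sketched below in Python: it reads a sequence of JSON responses shaped like the sample above and groups finalized transcripts into utterances, closing an utterance whenever speech_final is true. The function name collect_utterances is hypothetical, and the handling of is_final (skipping non-final responses on the assumption that a later finalized response supersedes them) reflects one possible way of combining Endpointing with Interim Results rather than documented behavior.

```python
import json


def collect_utterances(messages):
    """Group streaming responses into utterances that end at an endpoint.

    messages is an iterable of JSON strings, one per streaming response.
    collect_utterances is a hypothetical helper written for illustration.
    """
    utterances = []
    finalized = []  # finalized transcript pieces of the utterance in progress

    for raw in messages:
        response = json.loads(raw)
        alternatives = response.get("channel", {}).get("alternatives", [])
        transcript = alternatives[0].get("transcript", "") if alternatives else ""
        is_final = response.get("is_final", False)
        speech_final = response.get("speech_final", False)

        # Assumption: non-final responses are superseded later, so skip them.
        if (is_final or speech_final) and transcript:
            finalized.append(transcript)

        # An endpoint was detected: close the current utterance.
        if speech_final:
            if finalized:
                utterances.append(" ".join(finalized))
            finalized = []

    return utterances
```

Text that has been finalized but not yet followed by an endpoint is not returned by this sketch; a real client would likely flush it when the stream closes.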
By default, Deepgram applies its general AI model, a good general-purpose model for everyday situations. To learn more about the customization possible with Deepgram's API, check out the Deepgram API Reference.