Endpointing
Deepgram’s Endpointing feature monitors incoming streaming audio and detects sufficiently long pauses that are likely to represent an endpoint in speech. When Deepgram detects an endpoint, it assumes that no additional data will improve its prediction, so it immediately finalizes its results for the processed time range and returns the transcript with a `speech_final` parameter set to `true`.
Endpointing can be used with Deepgram’s Interim Results feature. To compare and contrast these features, and to explore best practices for using them together, see Using Endpointing and Interim Results with Live Streaming Audio.
Use Cases
Some examples of use cases for Endpointing include:
- Returning finalized transcripts as soon as possible when a break in speech is detected.
Enable Feature
To enable endpointing, when you call Deepgram’s API, add an `endpointing` parameter set to `true` in the query string:

```
endpointing=true
```
For an example of audio streaming, see Getting Started with Streaming Audio.
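As a rough sketch of what this looks like in practice, the snippet below opens a streaming connection with `endpointing=true` appended to the query string. It assumes the third-party `websockets` Python package and a placeholder API key; the `wss://api.deepgram.com/v1/listen` URL and `Token` authorization scheme follow Deepgram’s streaming API, but check the current API reference for exact details.

```python
# A minimal sketch, not an official client. Assumes `pip install websockets`
# and a real Deepgram API key in place of the placeholder below.
import asyncio
import json

import websockets

API_KEY = "YOUR_DEEPGRAM_API_KEY"  # placeholder
URL = "wss://api.deepgram.com/v1/listen?endpointing=true"

async def transcribe(audio_chunks):
    # `extra_headers` is the keyword in websockets < 14;
    # newer releases renamed it to `additional_headers`.
    async with websockets.connect(
        URL, extra_headers={"Authorization": f"Token {API_KEY}"}
    ) as ws:

        async def sender():
            for chunk in audio_chunks:  # raw audio bytes from a mic or file
                await ws.send(chunk)
            # Tell Deepgram that no more audio is coming.
            await ws.send(json.dumps({"type": "CloseStream"}))

        async def receiver():
            async for message in ws:
                print(json.loads(message))

        await asyncio.gather(sender(), receiver())
```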
Results
When enabled, the transcript for each received streaming response shows a key called `speech_final`.
```json
{
  "channel_index": [
    0,
    1
  ],
  "duration": 1.039875,
  "start": 0.0,
  "is_final": false,
  "speech_final": false,
  "channel": {
    "alternatives": [
      {
        "transcript": "another big",
        "confidence": 0.9600255,
        "words": [
          {
            "word": "another",
            "start": 0.2971154,
            "end": 0.7971154,
            "confidence": 0.9588303
          },
          {
            "word": "big",
            "start": 0.85173076,
            "end": 1.039875,
            "confidence": 0.9600255
          }
        ]
      }
    ]
  }
}
```
...
When `speech_final` is set to `true`, Deepgram has detected an endpoint and immediately finalized its results for the processed time range.
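In application code, the `speech_final` flag is what signals that a transcript segment is safe to commit. The sketch below shows one way to branch on it, assuming `message` is the JSON text of a single streaming response shaped like the example above.

```python
import json

def handle_message(message: str) -> None:
    response = json.loads(message)
    alternatives = response.get("channel", {}).get("alternatives", [])
    transcript = alternatives[0]["transcript"] if alternatives else ""

    if response.get("speech_final"):
        # Endpoint detected: results for this time range are finalized,
        # so the transcript can be committed downstream.
        print(f"[final]   {transcript}")
    else:
        # No endpoint yet; later responses may refine this segment.
        print(f"[interim] {transcript}")
```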
By default, Deepgram applies its general AI model, a good general-purpose choice for everyday situations. To learn more about the customization possible with Deepgram’s API, check out the Deepgram API Reference.