Speech Started sends a message when the start of speech is detected in live streaming audio.

Deepgram's Speech Started feature detects the start of speech while live streaming audio is being transcribed.

SpeechStarted uses the Voice Activity Detector (VAD) to detect the start of speech after a period of silence. By gauging tonal nuances in human speech, the VAD differentiates between silent and non-silent audio segments and provides immediate notification when speech is detected.

When this feature is enabled, Deepgram will send a message when the onset of speech is detected.

Enable Feature

To enable the SpeechStarted event, include the parameter vad_events=true in your request.

You will then receive a message whenever the start of speech is detected.
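
As a minimal sketch of enabling the feature, the snippet below appends vad_events=true to the query string of a streaming URL. The endpoint wss://api.deepgram.com/v1/listen and the helper name build_listen_url are assumptions made for illustration, and authentication and the WebSocket connection itself are left out.

```python
from urllib.parse import urlencode

# Assumed live-streaming endpoint; check it against your own Deepgram setup.
BASE_URL = "wss://api.deepgram.com/v1/listen"


def build_listen_url(base_url: str = BASE_URL, **params: str) -> str:
    """Return a streaming URL with vad_events=true plus any extra query parameters.

    build_listen_url is a hypothetical helper, not part of any SDK.
    """
    query = {"vad_events": "true", **params}
    return f"{base_url}?{urlencode(query)}"


if __name__ == "__main__":
    print(build_listen_url())
```

Any other query parameters your request already uses can be passed as keyword arguments and are appended after vad_events.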


The JSON message sent when the start of speech is detected looks similar to this:

  {
    "type": "SpeechStarted",
    "channel": [0, 1],
    "timestamp": 9.54
  }

  • The type field is always SpeechStarted for this event.
  • The channel field is interpreted as [A,B], where A is the channel index, and B is the total number of channels. The above example is channel 0 of single-channel audio.
  • The timestamp field is the time at which speech was first detected.
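
The field descriptions above can be turned into a small parser. This is a sketch only: the SpeechStarted dataclass and the parse_speech_started function are hypothetical names, and the code assumes each incoming message is a JSON object delivered as text.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpeechStarted:
    channel_index: int   # A in [A, B]
    channel_count: int   # B in [A, B]
    timestamp: float     # time at which speech was first detected


def parse_speech_started(raw: str) -> Optional[SpeechStarted]:
    """Return a SpeechStarted event, or None if the message is of another type."""
    message = json.loads(raw)
    if message.get("type") != "SpeechStarted":
        return None
    # The channel field is a two-element list: [channel index, total channels].
    channel_index, channel_count = message["channel"]
    return SpeechStarted(channel_index, channel_count, float(message["timestamp"]))


sample = '{"type": "SpeechStarted", "channel": [0, 1], "timestamp": 9.54}'
event = parse_speech_started(sample)
```

Returning None for other message types lets the same receive loop pass transcript messages on to a separate handler.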


The timestamp is not intended to match precisely with the first word's timestamp in the subsequent transcript, as the ASR and word-timing models are separate from the VAD speech detection.