Diarization

Diarization recognizes speaker changes and assigns a speaker to each word in the transcript.

diarize boolean Default: false

🛝

Try this feature out in our API Playground!

Enable Feature

To enable Diarization, add the following parameter to the query string when you call Deepgram’s /listen endpoint:

diarize=true

To transcribe audio from a file on your computer, run the following cURL command in a terminal or your favorite API client.

curl \
  --request POST \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'Content-Type: audio/wav' \
  --data-binary @youraudio.wav \
  --url 'https://api.deepgram.com/v1/listen?diarize=true'

🚧

Replace YOUR_DEEPGRAM_API_KEY with your Deepgram API Key.
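The same request can also be made from a script. The following is a minimal sketch that uses only Python's standard library and mirrors the cURL command above. The helper names build_listen_request and transcribe are hypothetical and are not part of any Deepgram SDK; the endpoint, headers, and diarize=true parameter are taken from the cURL example.

```python
import json
import urllib.parse
import urllib.request

DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_listen_request(api_key, audio_bytes, content_type="audio/wav", **params):
    """Build a POST request to /listen with diarization enabled.

    Hypothetical helper for illustration. Extra keyword arguments are
    passed through as query-string parameters (for example punctuate="true").
    """
    query = {"diarize": "true", **params}
    url = f"{DEEPGRAM_LISTEN_URL}?{urllib.parse.urlencode(query)}"
    return urllib.request.Request(
        url,
        data=audio_bytes,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": content_type,
        },
    )


def transcribe(api_key, path):
    """Send the audio file at `path` and return the parsed JSON response."""
    with open(path, "rb") as audio:
        request = build_listen_request(api_key, audio.read())
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling transcribe("YOUR_DEEPGRAM_API_KEY", "youraudio.wav") would send the file and return the response as a Python dictionary. Building the request separately from sending it makes the URL and headers easy to inspect before any audio leaves your machine.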

Analyze Response

📘

For this example, we use an MP3 audio file that contains the beginning of a customer call with Premier Phone Services. If you would like to follow along, you can download it.

When the file is finished processing, you’ll receive a JSON response. Let’s look more closely at the words array within the alternatives array in this response.

Pre-Recorded

When using diarization for pre-recorded audio, both speaker and speaker_confidence values will be returned:

...
"alternatives":[
  {
    ...
    "words": [
      {
        "word":"hello",
        "start":15.259043,
        "end":15.338787,
        "confidence":0.9721591,
        "speaker":0,
        "speaker_confidence":0.5853265
      },
    ...
    ]
  }
]
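Because every word carries its own speaker value, a common next step is to merge consecutive words from the same speaker into turns. The sketch below assumes a words list shaped like the excerpt above; the function names and the 0.5 threshold are illustrative choices, not values recommended by Deepgram.

```python
def group_words_by_speaker(words):
    """Collapse consecutive words with the same speaker into turns."""
    turns = []
    for word in words:
        if turns and turns[-1]["speaker"] == word["speaker"]:
            # Same speaker as the previous word: extend the current turn.
            turns[-1]["words"].append(word["word"])
            turns[-1]["end"] = word["end"]
        else:
            # Speaker changed (or first word): start a new turn.
            turns.append({
                "speaker": word["speaker"],
                "start": word["start"],
                "end": word["end"],
                "words": [word["word"]],
            })
    for turn in turns:
        turn["text"] = " ".join(turn.pop("words"))
    return turns


def low_confidence_words(words, threshold=0.5):
    """Return words whose speaker_confidence is below `threshold`.

    The threshold is an arbitrary example value. Words without a
    speaker_confidence field are skipped rather than flagged.
    """
    return [
        word for word in words
        if "speaker_confidence" in word and word["speaker_confidence"] < threshold
    ]
```

Each returned turn is a dictionary with speaker, start, end, and text keys, which is convenient for rendering a transcript or for reviewing only the words where speaker assignment was uncertain.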

Live Streaming

When using diarization for live streaming audio, only the speaker value will be returned:

...
"alternatives":[
  {
    ...
    "words": [
      {
        "word":"hello",
        "start":15.259043,
        "end":15.338787,
        "confidence":0.9721591,
        "speaker":0
      },
    ...
    ]
  }
]
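Code that needs to handle both pre-recorded and live streaming results should not depend on speaker_confidence being present. As one possible sketch, the hypothetical helper below totals speaking time per speaker using only the speaker, start, and end fields, so it accepts words from either response shape. It assumes start and end are expressed in seconds.

```python
from collections import defaultdict


def speaking_time_by_speaker(words):
    """Sum word durations (end - start) for each speaker label.

    Reads only fields that appear in both pre-recorded and live
    streaming word objects, so speaker_confidence is never required.
    """
    totals = defaultdict(float)
    for word in words:
        totals[word["speaker"]] += word["end"] - word["start"]
    return dict(totals)
```

The result maps each speaker label to a total duration. Note that this counts only the time covered by words, not the silences between them.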

Use the API reference or the API Playground to view the detailed response.

Format Response

To improve readability, you can use a JSON processor to parse the JSON. In this example, we use jq and further improve readability by turning on Deepgram’s punctuation and utterances features:

curl \
  --request POST \
  --url 'https://api.deepgram.com/v1/listen?diarize=true&punctuate=true&utterances=true' \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'Content-Type: audio/mp3' \
  --data-binary @Premier_broken-phone_numbers.mp3 \
  | jq -r '.results.utterances[] | "[Speaker:\(.speaker)] \(.transcript)"'

🚧

Replace YOUR_DEEPGRAM_API_KEY with your Deepgram API Key.

When the file is finished processing, jq prints output like the following:

[Speaker:0] Hello, and thank you for calling premier phone service. Please be aware that this call may be recorded for quality and training purposes.
[Speaker:0] My name is Beth, and I will be assisting you today. How are you doing?
[Speaker:1] Not too bad. How are you today?
[Speaker:0] I'm doing well. Thank you. May I please have your name?
[Speaker:1] My name is Blake...
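If you are working with the response in code rather than on the command line, the same formatting can be done without jq. This is a minimal sketch that assumes the response has already been parsed into a Python dictionary and that utterances=true was set on the request; the function name format_utterances is hypothetical. It reads the same results.utterances path, speaker field, and transcript field as the jq filter above.

```python
def format_utterances(response):
    """Return one "[Speaker:N] transcript" line per utterance."""
    return [
        f"[Speaker:{utterance['speaker']}] {utterance['transcript']}"
        for utterance in response["results"]["utterances"]
    ]
```

Printing "\n".join(format_utterances(response)) produces the same layout as the jq output shown above.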

📘

To learn more about when to use Deepgram's Diarization or Multichannel feature, see When to Use the Multichannel and Diarization Features.