Diarization

Diarize recognizes speaker changes and assigns a speaker to each word in the transcript.

diarize: boolean. Default: false


Enable Feature

To enable Diarization, add the following parameter to the query string when you call Deepgram’s /listen endpoint:

diarize=true

To transcribe audio from a file on your computer, run the following cURL command in a terminal or your favorite API client.

cURL
curl \
  --request POST \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'Content-Type: audio/wav' \
  --data-binary @youraudio.wav \
  --url 'https://api.deepgram.com/v1/listen?diarize=true'

Replace YOUR_DEEPGRAM_API_KEY with your Deepgram API Key.
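
For readers who prefer a script to a terminal command, the same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_listen_request` and `transcribe_file` are hypothetical, and the file path and key are placeholders. The endpoint, headers, and `diarize=true` parameter are taken from the cURL command above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://api.deepgram.com/v1/listen"


def build_listen_request(audio: bytes, api_key: str,
                         content_type: str = "audio/wav") -> Request:
    """Build (but do not send) a POST request with diarization enabled.

    Hypothetical helper; it mirrors the headers and query string of the
    cURL command shown above.
    """
    query = urlencode({"diarize": "true"})
    return Request(
        url=f"{API_URL}?{query}",
        data=audio,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": content_type,
        },
    )


def transcribe_file(path: str, api_key: str) -> dict:
    """Read a local audio file, send it, and return the parsed JSON body."""
    with open(path, "rb") as audio_file:
        request = build_listen_request(audio_file.read(), api_key)
    with urlopen(request) as response:
        return json.load(response)
```

Calling `transcribe_file("youraudio.wav", "YOUR_DEEPGRAM_API_KEY")` would send the request and return the response as a dictionary. Keeping request construction separate from sending makes the query string easy to inspect before any audio is uploaded.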

Analyze Response

For this example, we use an MP3 audio file that contains the beginning of a customer call with Premier Phone Services. If you would like to follow along, you can download it.

When the file is finished processing, you’ll receive a JSON response. Let’s look more closely at the words array inside each entry of the alternatives array in this response.

Pre-Recorded

When using diarization for pre-recorded audio, both speaker and speaker_confidence values will be returned:

JSON
...
"alternatives": [
  {
    ...
    "words": [
      {
        "word": "hello",
        "start": 15.259043,
        "end": 15.338787,
        "confidence": 0.9721591,
        "speaker": 0,
        "speaker_confidence": 0.5853265
      },
      ...
    ]
  }
]
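
Because every word carries its own speaker value, consecutive words can be merged into speaker turns. The following is a rough sketch of that idea, assuming the words array has already been parsed into a list of dictionaries. The function name `speaker_turns` is hypothetical, and only the first sample word comes from the response above; the other two are invented for illustration.

```python
from itertools import groupby

# First entry is from the response above; the rest are invented sample data.
SAMPLE_WORDS = [
    {"word": "hello", "start": 15.259043, "end": 15.338787,
     "confidence": 0.9721591, "speaker": 0, "speaker_confidence": 0.5853265},
    {"word": "and", "start": 15.4, "end": 15.6,
     "confidence": 0.9, "speaker": 0, "speaker_confidence": 0.6},
    {"word": "hi", "start": 16.0, "end": 16.2,
     "confidence": 0.9, "speaker": 1, "speaker_confidence": 0.7},
]


def speaker_turns(words):
    """Merge runs of consecutive words that share a speaker into turns."""
    turns = []
    # groupby only groups adjacent items, which is what a "turn" requires.
    for speaker, run in groupby(words, key=lambda w: w["speaker"]):
        run = list(run)
        turns.append({
            "speaker": speaker,
            "start": run[0]["start"],
            "end": run[-1]["end"],
            "text": " ".join(w["word"] for w in run),
        })
    return turns
```

With the sample data, `speaker_turns(SAMPLE_WORDS)` yields one turn for speaker 0 covering "hello and" and one turn for speaker 1 covering "hi".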

Live Streaming

When using diarization for live streaming audio, only the speaker value will be returned:

JSON
...
"alternatives": [
  {
    ...
    "words": [
      {
        "word": "hello",
        "start": 15.259043,
        "end": 15.338787,
        "confidence": 0.9721591,
        "speaker": 0
      },
      ...
    ]
  }
]
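
Code that handles both pre-recorded and live streaming responses may want to treat speaker_confidence as optional, since only pre-recorded responses include it. A small sketch under that assumption follows; the function name `speaker_label` and the label format are hypothetical.

```python
from typing import Optional


def speaker_label(word: dict) -> str:
    """Describe a word's speaker, with confidence only when it is present."""
    speaker = word["speaker"]
    # Live streaming words have no "speaker_confidence" key, so use .get().
    confidence: Optional[float] = word.get("speaker_confidence")
    if confidence is None:
        return f"speaker {speaker}"
    return f"speaker {speaker} ({confidence:.2f})"
```

Reading the field with `.get()` avoids a `KeyError` on streaming results while still surfacing the value for pre-recorded audio.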

Use the API reference or the API Playground to view the detailed response.

Format Response

To improve readability, you can use a JSON processor to parse the JSON. In this example, we pipe the response through jq and further improve readability by turning on Deepgram’s punctuation and utterances features:

cURL
curl \
  --request POST \
  --url 'https://api.deepgram.com/v1/listen?diarize=true&punctuate=true&utterances=true' \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'content-type: audio/mp3' \
  --data-binary @Premier_broken-phone_numbers.mp3 | jq -r ".results.utterances[] | \"[Speaker:\(.speaker)] \(.transcript)\""

Replace YOUR_DEEPGRAM_API_KEY with your Deepgram API Key.

When the file is finished processing, jq prints one line per utterance, prefixed with its speaker:

[Speaker:0] Hello, and thank you for calling premier phone service. Please be aware that this call may be recorded for quality and training purposes.
[Speaker:0] My name is Beth, and I will be assisting you today. How are you doing?
[Speaker:1] Not too bad. How are you today?
[Speaker:0] I'm doing well. Thank you. May I please have your name?
[Speaker:1] My name is Blake...
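
If jq is not available, the same formatting can be approximated in Python. This sketch assumes the response has been parsed into a dictionary and, like the jq filter above, reads only the speaker and transcript fields of each entry in results.utterances. The function name `format_utterances` is hypothetical, and the sample is trimmed to two utterances from the output above.

```python
from typing import List

# Trimmed sample: only the fields the formatter reads are included.
SAMPLE_RESPONSE = {
    "results": {
        "utterances": [
            {"speaker": 0,
             "transcript": "My name is Beth, and I will be assisting you today. How are you doing?"},
            {"speaker": 1,
             "transcript": "Not too bad. How are you today?"},
        ]
    }
}


def format_utterances(response: dict) -> List[str]:
    """Return one '[Speaker:N] transcript' line per utterance."""
    return [
        f"[Speaker:{utterance['speaker']}] {utterance['transcript']}"
        for utterance in response["results"]["utterances"]
    ]
```

Printing each element of `format_utterances(SAMPLE_RESPONSE)` reproduces the second and third lines of the output shown above.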

To learn more about when to use Deepgram’s Diarization or Multichannel feature, see When to Use the Multichannel and Diarization Features.

