Speaker Diarization

Diarize recognizes speaker changes and assigns a speaker to each word in the transcript.

Pre-recorded | Streaming: Nova | Streaming: Flux | All available languages

Diarization Models

Deepgram offers versioned diarization models. Use the diarize_model parameter to select a specific version:

Value     Description
latest    Resolves to the latest GA batch diarizer (currently v2)
v2        Pins to the v2 diarizer
v1        Pins to the v1 diarizer

Specifying diarize_model both enables diarization and selects the model version. You do not need to also set diarize=true.

Choosing a Model

  • New integrations: Use diarize_model=latest to always get the newest available diarizer.
  • Pin a specific version: Use diarize_model=v1 or diarize_model=v2.
  • Streaming: diarize_model is not accepted on streaming. Use diarize=true for streaming diarization (see Streaming below).
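The selection rules above can be sketched as a small helper. This is a minimal illustration, not part of any Deepgram SDK: the function names `diarization_params` and `batch_listen_url` are hypothetical, and the endpoint is the one used in the cURL examples on this page.

```python
from urllib.parse import urlencode

# Batch endpoint as it appears in the cURL examples on this page.
BASE_URL = "https://api.deepgram.com/v1/listen"

# Values listed in the Diarization Models table.
ALLOWED_VERSIONS = {"latest", "v1", "v2"}


def diarization_params(streaming: bool, version: str = "latest") -> dict:
    """Return the single diarization query parameter for a request.

    Hypothetical helper: streaming requests get the legacy boolean,
    batch requests get diarize_model. The two are never combined.
    """
    if streaming:
        # diarize_model is not accepted on streaming requests.
        return {"diarize": "true"}
    if version not in ALLOWED_VERSIONS:
        raise ValueError(f"unknown diarize_model value: {version!r}")
    return {"diarize_model": version}


def batch_listen_url(version: str = "latest") -> str:
    """Build a batch request URL with diarization enabled."""
    return f"{BASE_URL}?{urlencode(diarization_params(False, version))}"
```

Keeping the choice in one function means a caller cannot accidentally send both parameters on the same request.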

Enable Feature

Using diarize_model (recommended)

Use the diarize_model parameter to enable diarization and select the model version in a single parameter:

cURL
curl \
  --request POST \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'Content-Type: audio/wav' \
  --data-binary @youraudio.wav \
  --url 'https://api.deepgram.com/v1/listen?diarize_model=latest'

Using diarize (legacy)

The boolean diarize parameter continues to work and always routes to the v1 diarizer:

diarize=true

cURL
curl \
  --request POST \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'Content-Type: audio/wav' \
  --data-binary @youraudio.wav \
  --url 'https://api.deepgram.com/v1/listen?diarize=true'

Replace YOUR_DEEPGRAM_API_KEY with your Deepgram API Key.

Self-hosted deployments: diarize=true is pinned to the v1 batch diarizer. New self-hosted deployments provisioned at the May 2026 release (release-260514) or later receive only the v2 batch diarizer model by default. On those deployments, diarize=true returns a successful response without speaker labels, consistent with Deepgram’s longstanding behavior when a requested diarizer model is not present. To produce diarized output on a fresh deployment, specify diarize_model=v2 or diarize_model=latest. See the Self-Hosted May 2026 release notes for details.
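Because a missing diarizer model yields a successful response rather than an error, a client may want to check for speaker labels explicitly. The following is a minimal sketch: `has_speaker_labels` is a hypothetical helper that takes the words list from a response alternative, as shown in the Analyze Response section.

```python
def has_speaker_labels(words: list) -> bool:
    """Return True if at least one word carries a speaker value.

    Hypothetical helper. An empty words list also returns False,
    so callers should treat "no words" and "no labels" separately
    if the distinction matters.
    """
    return any("speaker" in word for word in words)


# Illustrative inputs: the first mirrors a diarized word from this page,
# the second is the same word with the speaker field absent.
labeled = [{"word": "hello", "start": 15.259043, "end": 15.338787, "speaker": 0}]
unlabeled = [{"word": "hello", "start": 15.259043, "end": 15.338787}]
```

A check like this can be run once after provisioning a deployment to confirm that the requested diarizer is actually present.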

Versioning Behavior

To use the new versioned diarizer on batch requests, replace diarize=true with diarize_model (latest suits most cases). Do not set both diarize and diarize_model; requests that set both are rejected.
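As a rough illustration of the migration, the sketch below rewrites a batch request URL from the legacy parameter to diarize_model. The function name `migrate_batch_url` is hypothetical, and the client-side check for both parameters simply mirrors the rejection rule described above rather than reproducing the server's validation.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def migrate_batch_url(url: str, version: str = "latest") -> str:
    """Replace diarize=true with diarize_model=<version> in a batch URL.

    Hypothetical helper. Other query parameters keep their order and values.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    keys = {key for key, _ in query}

    # Requests that set both parameters are rejected, so fail early.
    if {"diarize", "diarize_model"} <= keys:
        raise ValueError("set either diarize or diarize_model, not both")

    migrated = [
        ("diarize_model", version)
        if key == "diarize" and value.lower() == "true"
        else (key, value)
        for key, value in query
    ]
    return urlunsplit(parts._replace(query=urlencode(migrated)))
```

Note that diarize=true routes to the v1 diarizer while latest currently resolves to v2, so speaker assignments may differ after migrating. Passing version="v1" keeps the diarizer version unchanged.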

Model Compatibility

Diarization is compatible with all Nova batch models (Nova-1, Nova-2, Nova-3) as well as enhanced and base. Whisper is not supported.

Streaming

diarize_model is not accepted on streaming requests and returns 400 regardless of value. For streaming diarization, use the legacy diarize=true parameter, which routes to the v1 streaming diarizer.

Analyze Response

For this example, we use an MP3 audio file that contains the beginning of a customer call with Premier Phone Services. If you would like to follow along, you can download it.

When the file is finished processing, you’ll receive a JSON response. Let’s look more closely at the words object within the alternatives object within this response.

Pre-Recorded

When using diarization for pre-recorded audio, both speaker and speaker_confidence values will be returned:

JSON
...
"alternatives": [
  {
    ...
    "words": [
      {
        "word": "hello",
        "start": 15.259043,
        "end": 15.338787,
        "confidence": 0.9721591,
        "speaker": 0,
        "speaker_confidence": 0.5853265
      },
      ...
    ]
  }
]
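Word-level speaker values are often more useful once grouped into turns. The sketch below is one way to do that, assuming the words list has already been extracted from the response. `speaker_turns` is a hypothetical helper, and every sample word after the first is invented for illustration.

```python
from itertools import groupby

words = [
    # First entry is taken from the response above.
    {"word": "hello", "start": 15.259043, "end": 15.338787,
     "confidence": 0.9721591, "speaker": 0, "speaker_confidence": 0.5853265},
    # The remaining entries are made-up sample data.
    {"word": "there", "start": 15.4, "end": 15.7,
     "confidence": 0.98, "speaker": 0, "speaker_confidence": 0.61},
    {"word": "hi", "start": 16.1, "end": 16.3,
     "confidence": 0.95, "speaker": 1, "speaker_confidence": 0.72},
]


def speaker_turns(words: list) -> list:
    """Merge consecutive words from the same speaker into one turn."""
    turns = []
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        group = list(group)
        turns.append({
            "speaker": speaker,
            "start": group[0]["start"],   # start of the first word in the turn
            "end": group[-1]["end"],      # end of the last word in the turn
            "text": " ".join(w["word"] for w in group),
        })
    return turns
```

`groupby` only merges adjacent items, which is the desired behavior here: a speaker who talks, is interrupted, and talks again produces two separate turns.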

Live Streaming

When using diarization for live streaming audio, only the speaker value will be returned:

JSON
...
"alternatives": [
  {
    ...
    "words": [
      {
        "word": "hello",
        "start": 15.259043,
        "end": 15.338787,
        "confidence": 0.9721591,
        "speaker": 0
      },
      ...
    ]
  }
]
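Code that handles both pre-recorded and streaming results needs to tolerate the missing speaker_confidence field. One possible approach, sketched here with a hypothetical `DiarizedWord` type and `parse_word` function, is to model the field as optional.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiarizedWord:
    """One entry of the words list (hypothetical type, not an SDK class)."""
    word: str
    start: float
    end: float
    confidence: float
    speaker: Optional[int] = None
    speaker_confidence: Optional[float] = None  # absent on streaming results


def parse_word(raw: dict) -> DiarizedWord:
    """Build a DiarizedWord, leaving absent diarization fields as None."""
    return DiarizedWord(
        word=raw["word"],
        start=raw["start"],
        end=raw["end"],
        confidence=raw["confidence"],
        speaker=raw.get("speaker"),
        speaker_confidence=raw.get("speaker_confidence"),
    )


# The streaming word shown above.
streaming_word = parse_word({
    "word": "hello", "start": 15.259043, "end": 15.338787,
    "confidence": 0.9721591, "speaker": 0,
})
```

Using None instead of a default such as 0.0 keeps "not reported" distinct from "reported with low confidence".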

Format Response

To improve readability, you can pipe the response through a JSON processor. In this example, we use jq and further improve readability by turning on Deepgram’s punctuation and utterances features:

cURL
curl \
  --request POST \
  --url 'https://api.deepgram.com/v1/listen?diarize_model=latest&punctuate=true&utterances=true' \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'content-type: audio/mp3' \
  --data-binary @Premier_broken-phone_numbers.mp3 | jq -r ".results.utterances[] | \"[Speaker:\(.speaker)] \(.transcript)\""

Replace YOUR_DEEPGRAM_API_KEY with your Deepgram API Key.

When the file is finished processing, you’ll receive the following response:

[Speaker:0] Hello, and thank you for calling premier phone service. Please be aware that this call may be recorded for quality and training purposes.
[Speaker:0] My name is Beth, and I will be assisting you today. How are you doing?
[Speaker:1] Not too bad. How are you today?
[Speaker:0] I'm doing well. Thank you. May I please have your name?
[Speaker:1] My name is Blake...
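For readers who prefer to post-process the response in code rather than with jq, the same formatting can be sketched in Python. `format_utterances` is a hypothetical helper. It reads the same `results.utterances` path, `speaker`, and `transcript` fields that the jq filter uses, and the sample response is trimmed to two utterances from the output above.

```python
from typing import List


def format_utterances(response: dict) -> List[str]:
    """Render each utterance as "[Speaker:N] transcript"."""
    return [
        f"[Speaker:{utt['speaker']}] {utt['transcript']}"
        for utt in response["results"]["utterances"]
    ]


# Trimmed sample containing only the fields the formatter reads.
sample_response = {
    "results": {
        "utterances": [
            {"speaker": 0,
             "transcript": "My name is Beth, and I will be assisting you today. How are you doing?"},
            {"speaker": 1,
             "transcript": "Not too bad. How are you today?"},
        ]
    }
}

lines = format_utterances(sample_response)
```

Each element of `lines` can then be printed or written to a file, one utterance per line.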

To learn more about when to use Deepgram’s Diarization or Multichannel feature, see When to Use the Multichannel and Diarization Features.


What’s Next

  • Understanding When to Use the Multichannel and Diarization Features