
Migrating from Nova-3 to Flux

Migrate from Nova-3 to Flux, Deepgram’s conversational speech recognition purpose-built for interactive voice agents.

Key Benefits of Flux

  • Model-integrated turn detection (StartOfTurn, EagerEndOfTurn, TurnResumed, EndOfTurn)
  • Ultra-low latency: ~260 ms end-of-turn detection (p50 at defaults)
  • EagerEndOfTurn events let you start LLM responses early
  • Turn-based transcripts for clean agent logic
  • Same transcription quality as Nova-3
  • Simplified development: one API with conversation-native events replaces complex STT + VAD + endpointing pipelines
  • High configurability: end-of-turn detection sensitivity, eager response thresholds, and turn-taking dynamics can all be tuned for your conversational flow

Audio Requirements

  • Encoding: See Audio Format Requirements table below
  • Sample rates: See Audio Format Requirements table below
  • Channels: Mono only
  • Chunk size: 80ms strongly recommended for optimal model performance and latency.
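
As a rough illustration of the 80 ms recommendation, the sketch below computes how many bytes one chunk holds for raw mono linear16 audio (16-bit samples, so 2 bytes each). The helper names `chunk_size_bytes` and `split_into_chunks` are hypothetical and not part of any Deepgram SDK.

```python
from __future__ import annotations


def chunk_size_bytes(sample_rate_hz: int, chunk_ms: int = 80, bytes_per_sample: int = 2) -> int:
    """Bytes in one mono chunk of raw PCM audio (linear16 uses 2 bytes per sample)."""
    samples = sample_rate_hz * chunk_ms // 1000
    return samples * bytes_per_sample


def split_into_chunks(audio: bytes, size: int) -> list[bytes]:
    """Split a raw audio buffer into fixed-size chunks; the last one may be shorter."""
    return [audio[i:i + size] for i in range(0, len(audio), size)]


# 16 kHz linear16 mono: 16000 samples/s * 0.08 s * 2 bytes = 2560 bytes per chunk
size_16k = chunk_size_bytes(16000)
chunks = split_into_chunks(b"\x00" * 6000, size_16k)
```

Each element of `chunks` would then be sent as one binary WebSocket message.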

Audio Format Requirements

| Audio Type | Encoding | Container | encoding param | sample_rate param | Supported Sample Rates |
| --- | --- | --- | --- | --- | --- |
| Raw | linear16, linear32, mulaw, alaw, opus, ogg-opus | None | Required | Required | 8000, 16000, 24000, 44100, 48000 |
| Containerized | linear16 | WAV | Omit | Omit | Auto-detected from container |
| Containerized | opus | Ogg | Omit | Omit | Auto-detected from container |
| Containerized | opus | WebM | Omit | Omit | Auto-detected from container |
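
The rule in the table (raw audio requires both parameters, containerized audio omits both) could be enforced on the client before connecting. This is a minimal sketch; `audio_query_params` is a hypothetical helper, and the allowed values are copied from the table above.

```python
from __future__ import annotations

from typing import Optional

# Values taken from the "Raw" row of the Audio Format Requirements table.
RAW_ENCODINGS = {"linear16", "linear32", "mulaw", "alaw", "opus", "ogg-opus"}
RAW_SAMPLE_RATES = {8000, 16000, 24000, 44100, 48000}


def audio_query_params(
    containerized: bool,
    encoding: Optional[str] = None,
    sample_rate: Optional[int] = None,
) -> dict[str, str]:
    """Return the audio-related query parameters for a Flux connection."""
    if containerized:
        # WAV / Ogg / WebM: format is auto-detected, so both params must be omitted.
        if encoding is not None or sample_rate is not None:
            raise ValueError("omit encoding and sample_rate for containerized audio")
        return {}
    # Raw audio: both params are required and must be supported values.
    if encoding not in RAW_ENCODINGS:
        raise ValueError(f"unsupported raw encoding: {encoding!r}")
    if sample_rate not in RAW_SAMPLE_RATES:
        raise ValueError(f"unsupported raw sample rate: {sample_rate!r}")
    return {"encoding": encoding, "sample_rate": str(sample_rate)}
```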

Migrating from Nova-3 to Flux

This guide helps you migrate from Nova-3 to Flux by highlighting key differences, setup changes, and implementation patterns.

Differences

| Nova-3 | Flux |
| --- | --- |
| Streams transcripts continuously | Emits structured turn events |
| Requires custom logic for barge-in and turn-taking | Has built-in turn state machine |
| Returns transcripts only | Returns conversation events and transcripts |
| Designed for general real-time transcription | Designed for conversational voice agents |
| Focuses on accuracy and speed | Focuses on accuracy and turn awareness |

Endpoint Usage

Nova-3:

Uses the listen v1 endpoint with the nova-3 model option.

wss://api.deepgram.com/v1/listen?model=nova-3

Flux:

Uses the listen v2 endpoint with the flux-general-en model option.

wss://api.deepgram.com/v2/listen?model=flux-general-en

Response Message Structure

Nova-3

{
  "type": "Results",
  "channel": "transcript",
  "alternatives": [...]
}

Flux

{
  "type": "TurnInfo",
  "request_id": "2ba892a1-6c0d-4d92-9b89-0000000000",
  "event": "Update",
  "turn_index": 0,
  "audio_window_start": 0,
  "audio_window_end": 0.47999996,
  "transcript": "",
  "words": [...],
  "end_of_turn_confidence": 0.0009,
  "sequence_id": 2
}

In addition to the transcript, Flux responses include:

  • event: the turn-state change this message reports
  • turn_index: identifies the turn, so you can follow its lifecycle
  • audio_window_start and audio_window_end: the audio window the message covers
  • end_of_turn_confidence: the model's confidence that the turn has ended
  • sequence_id: the message's position in the sequence of messages
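
To show how a client might pick these fields out of a message, here is a minimal sketch that parses the JSON text into a typed record. The `TurnInfo` dataclass and `parse_turn_info` function are hypothetical names; the field names come from the Flux sample above, and the `words` array is left out for brevity.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnInfo:
    event: str
    turn_index: int
    audio_window_start: float
    audio_window_end: float
    transcript: str
    end_of_turn_confidence: float
    sequence_id: int


def parse_turn_info(raw: str) -> Optional[TurnInfo]:
    """Parse one WebSocket text message; return None if it is not a TurnInfo message."""
    msg = json.loads(raw)
    if msg.get("type") != "TurnInfo":
        return None
    return TurnInfo(
        event=msg["event"],
        turn_index=msg["turn_index"],
        audio_window_start=float(msg["audio_window_start"]),
        audio_window_end=float(msg["audio_window_end"]),
        transcript=msg["transcript"],
        end_of_turn_confidence=float(msg["end_of_turn_confidence"]),
        sequence_id=msg["sequence_id"],
    )


# Sample message modeled on the Flux example above (words left empty).
sample = json.dumps({
    "type": "TurnInfo", "request_id": "2ba892a1-6c0d-4d92-9b89-0000000000",
    "event": "Update", "turn_index": 0, "audio_window_start": 0,
    "audio_window_end": 0.47999996, "transcript": "", "words": [],
    "end_of_turn_confidence": 0.0009, "sequence_id": 2,
})
info = parse_turn_info(sample)
```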

Implementation Pattern Changes

Nova-3 Approach

Requires custom logic for barge-in and turn-taking.

  • Send audio
  • Receive streaming partial transcripts
  • Decide when to interrupt your agent manually

Flux Approach

Listens for structured events and removes the need for custom VAD or barge-in logic.

  • StartOfTurn: Interrupt agent if it’s speaking
  • EagerEndOfTurn: Medium-confidence end → start LLM reply
  • TurnResumed: User kept talking → cancel reply
  • EndOfTurn: High-confidence end → send transcript to LLM

By default, Flux emits only Update, StartOfTurn, and EndOfTurn events. EagerEndOfTurn and TurnResumed are emitted only when eager_eot_threshold is set.
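
The event-to-action mapping above can be sketched as a small dispatcher. `TurnController` and its action strings are hypothetical stand-ins for your own agent code (stopping TTS playback, starting or cancelling an LLM request); a production agent might also reuse the draft reply at EndOfTurn instead of sending the transcript again.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TurnController:
    agent_speaking: bool = False
    draft_pending: bool = False
    actions: List[str] = field(default_factory=list)  # records what the agent would do

    def handle(self, event: str, transcript: str = "") -> None:
        if event == "StartOfTurn":
            # User started talking: interrupt the agent if it is speaking (barge-in).
            if self.agent_speaking:
                self.agent_speaking = False
                self.actions.append("interrupt_agent")
        elif event == "EagerEndOfTurn":
            # Medium-confidence end: start preparing a reply early.
            self.draft_pending = True
            self.actions.append(f"start_draft_reply:{transcript}")
        elif event == "TurnResumed":
            # User kept talking: discard the early reply.
            if self.draft_pending:
                self.draft_pending = False
                self.actions.append("cancel_draft_reply")
        elif event == "EndOfTurn":
            # High-confidence end: send the final transcript to the LLM.
            self.draft_pending = False
            self.actions.append(f"send_to_llm:{transcript}")
        # "Update" only refreshes the transcript, so no action is taken here.


controller = TurnController(agent_speaking=True)
for event, text in [
    ("StartOfTurn", ""),
    ("Update", "hi"),
    ("EagerEndOfTurn", "hi there"),
    ("TurnResumed", ""),
    ("EndOfTurn", "hi there friend"),
]:
    controller.handle(event, text)
```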

Simple Approach: Enabling End of Turn

For more information on using Flux with EndOfTurn only, see the Flux Getting Started Guide.

This is the simple approach, using only EndOfTurn (low latency, less complex, fewer LLM calls).

To tune EndOfTurn detection, set the eot_threshold parameter, which accepts a confidence between 0.5 and 0.9.

Example

wss://api.deepgram.com/v2/listen?model=flux-general-en&sample_rate=16000&encoding=linear16&eot_threshold=0.8

Optimized Approach: Enabling EagerEndOfTurn + EndOfTurn

This is the optimized approach, using both EagerEndOfTurn and EndOfTurn (lower latency, slightly more complex, more LLM calls).

To enable EagerEndOfTurn, set the eager_eot_threshold parameter, which accepts a confidence between 0.3 and 0.9. You can also set eot_threshold (0.5–0.9) to tune EndOfTurn events, and eot_timeout_ms (default 5000 ms) to force a timeout after the specified time.

Example

wss://api.deepgram.com/v2/listen?model=flux-general-en&sample_rate=16000&encoding=linear16&eager_eot_threshold=0.6&eot_threshold=0.8&eot_timeout_ms=7000
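
Rather than assembling the query string by hand, a client could build it with range checks on the thresholds. The sketch below uses only the Python standard library; `build_flux_url` is a hypothetical helper, and the ranges it enforces are the ones quoted in this guide.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "wss://api.deepgram.com/v2/listen"


def _check_range(name: str, value: float, low: float, high: float) -> None:
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def build_flux_url(
    sample_rate: int = 16000,
    encoding: str = "linear16",
    eager_eot_threshold: Optional[float] = None,
    eot_threshold: Optional[float] = None,
    eot_timeout_ms: Optional[int] = None,
) -> str:
    """Build a Flux WebSocket URL, validating thresholds against the documented ranges."""
    params = {"model": "flux-general-en", "sample_rate": sample_rate, "encoding": encoding}
    if eager_eot_threshold is not None:
        _check_range("eager_eot_threshold", eager_eot_threshold, 0.3, 0.9)
        params["eager_eot_threshold"] = eager_eot_threshold
    if eot_threshold is not None:
        _check_range("eot_threshold", eot_threshold, 0.5, 0.9)
        params["eot_threshold"] = eot_threshold
    if eot_timeout_ms is not None:
        params["eot_timeout_ms"] = eot_timeout_ms
    return f"{BASE_URL}?{urlencode(params)}"


url = build_flux_url(eager_eot_threshold=0.6, eot_threshold=0.8, eot_timeout_ms=7000)
```

Leaving `eager_eot_threshold` unset yields a URL for the simple EndOfTurn-only approach.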

Nova-3 Migration Checklist

  • Update WebSocket endpoint to /v2/listen
  • Set model=flux-general-en and, for raw audio, encoding and sample_rate (for example encoding=linear16 and sample_rate=16000)
  • Adjust client to parse TurnInfo messages
  • Implement turn event handling (StartOfTurn, EagerEndOfTurn, TurnResumed, EndOfTurn)
  • Tune eager_eot_threshold and eot_threshold for your use case
  • Remove custom VAD/barge-in logic (Flux handles this natively!)