Configure

Send a Configure message to update Flux stream settings in real time without reconnecting.

Introduction

Real conversations aren’t static. A call that starts with casual confirmation (“Can you verify your name?”) shifts to strict authentication (“Please say your 6-digit PIN”) and then to open-ended troubleshooting. Conversations evolve through discrete sections, intents, and steps—each with different demands on your speech recognition system.

The Configure control message enables you to adapt Flux’s behavior mid-stream as conversational context evolves, without disconnecting and reconnecting. This is essentially context injection for speech recognition: you inject the specific vocabulary, turn detection behavior, and timing parameters needed for each phase of the conversation.

Why This Matters for Voice Agents

The ASR behavior you want at minute one isn’t what you want at minute three. With dynamic configuration, you can:

Dynamically bias toward task-critical phrases. Collecting a customer’s name? Add it to keyterms right before you ask. Moving from appointment scheduling to pharmacy? Swap in medication names and medical terminology. Handling a product inquiry? Load the specific product names and feature terminology relevant to that conversation. You’re no longer stuck with a generic keyterm list that’s “good enough” for the whole call or loading hundreds of irrelevant terms upfront.

Adjust turn detection for critical flows. When you’re collecting a password, OTP, or account number, you don’t want Flux cutting off the user mid-utterance. Increase eot_timeout_ms and eot_threshold values for that segment to allow longer pauses and wait for higher confidence before detecting turn end, then decrease them when you’re back to natural conversation.

Reduce engineering complexity. Without dynamic configuration, changing ASR behavior mid-call meant reconnecting (dropping audio, managing state transitions) or worse, managing multiple concurrent streams and swapping between them. That’s a state machine you never wanted to build and definitely don’t want to maintain. Configure gives you one connection with dynamic behavior.

Configuration updates are processed in order with your audio stream and take effect immediately when processed. The stream continues uninterrupted, and you receive confirmation of successful updates via ConfigureSuccess messages.

Configurable Parameters

You can update the following parameters mid-stream:

| Parameter | Type | Range | Description |
| --- | --- | --- | --- |
| keyterms | array | Up to 100 terms | Custom vocabulary terms to boost recognition accuracy. Note: sending keyterms replaces the entire list rather than merging with it. |
| eot_threshold | number | 0.5-0.9 | Confidence threshold for standard turn detection. Higher values mean more confidence is required before detecting turn end. |
| eager_eot_threshold | number | 0.3-0.9 | Confidence threshold for eager turn detection. Must be ≤ eot_threshold. |
| eot_timeout_ms | number | 500-10000 | Maximum silence duration (in milliseconds) before forcing turn end. |

All parameters are optional in a Configure message. Omitted parameters retain their current values.

Message Structure

Configure Message

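The three turn detection settings (eot_threshold, eager_eot_threshold, and eot_timeout_ms) must be nested under a "thresholds" object; keyterms is a top-level field. You can send any subset of the three without including the others.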

{
  "type": "Configure",
  "thresholds": {
    "eot_threshold": 0.8,
    "eot_timeout_ms": 5000
  }
}
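As an illustration of the structure above, the following Python sketch assembles a Configure payload from optional arguments. The helper name `build_configure` is hypothetical and not part of any SDK; it only mirrors the nesting and omission rules described in this section.

```python
import json
from typing import Optional, Sequence


def build_configure(
    keyterms: Optional[Sequence[str]] = None,
    eot_threshold: Optional[float] = None,
    eager_eot_threshold: Optional[float] = None,
    eot_timeout_ms: Optional[int] = None,
) -> str:
    """Serialize a Configure message, leaving out anything the caller did not set.

    Hypothetical helper: the function name and signature are illustrative only.
    """
    message: dict = {"type": "Configure"}

    # Only the thresholds that were actually provided are nested under "thresholds".
    thresholds = {
        name: value
        for name, value in (
            ("eot_threshold", eot_threshold),
            ("eager_eot_threshold", eager_eot_threshold),
            ("eot_timeout_ms", eot_timeout_ms),
        )
        if value is not None
    }
    if thresholds:
        message["thresholds"] = thresholds

    # None means "omit the field"; an empty list is kept because it clears keyterms.
    if keyterms is not None:
        message["keyterms"] = list(keyterms)

    return json.dumps(message)
```

Calling `build_configure(eot_threshold=0.8, eot_timeout_ms=5000)` yields the same payload as the JSON example above; the resulting string would then be sent over the existing stream connection.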

Response Messages

ConfigureSuccess

Returned when a configuration update is successfully applied. Echoes back the updated configuration.

{
  "type": "ConfigureSuccess",
  "thresholds": {
    "eager_eot_threshold": 0.4,
    "eot_threshold": 0.7,
    "eot_timeout_ms": 6000
  },
  "keyterms": ["apple", "banana", "orange"]
}

ConfigureFailure

Returned when a configuration update fails validation. The stream continues with the previous configuration.

{
  "type": "ConfigureFailure",
  "sequence_id": 42,
  "code": "INVALID_THRESHOLD",
  "description": "eager_eot_threshold must be less than or equal to eot_threshold"
}
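One way a client might react to these two responses is sketched below in Python. The function `handle_configure_response` and the `active_config` dictionary are hypothetical names for a local mirror of the stream's settings. The sketch merges echoed fields over that mirror, so it works whether the echo contains the full configuration or only the fields that changed; which of the two the server sends is not specified here.

```python
import json
from typing import Any, Dict


def handle_configure_response(raw: str, active_config: Dict[str, Any]) -> Dict[str, Any]:
    """Return the locally mirrored configuration after one server message.

    Hypothetical helper: `active_config` is the client's own copy of the settings.
    """
    message = json.loads(raw)
    kind = message.get("type")

    if kind == "ConfigureSuccess":
        # Merge the echoed fields over the local mirror.
        updated = dict(active_config)
        if "thresholds" in message:
            updated["thresholds"] = {
                **active_config.get("thresholds", {}),
                **message["thresholds"],
            }
        if "keyterms" in message:
            updated["keyterms"] = list(message["keyterms"])
        return updated

    if kind == "ConfigureFailure":
        # The stream keeps its previous configuration, so the mirror is unchanged.
        print(f"Configure rejected [{message.get('code')}]: {message.get('description')}")
        return active_config

    # Any other message type is not a Configure response; leave the mirror alone.
    return active_config
```

Keeping the mirror in sync this way also gives you a source for the current keyterm list when you later need to extend it.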

Important Behaviors

Configuration Update Timing

Key timing behaviors:

  • Updates apply immediately when the Configure message is processed
  • Updates persist until the stream ends or another Configure message is sent
  • Turn boundaries do not affect when updates take effect
  • Already-transcribed audio is NOT reprocessed with new configuration
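Because updates are processed in order with the audio and earlier audio is not reprocessed, the Configure message has to go out before the audio it should affect. The Python sketch below illustrates that ordering with a generic `send` callable standing in for whatever transport you use. The function name, the `configure_before_chunk` parameter, and the choice of bytes for audio and a JSON string for the control message are assumptions made for illustration.

```python
import json
from typing import Callable, Dict, Sequence, Union

Frame = Union[bytes, str]


def stream_with_update(
    send: Callable[[Frame], None],
    audio_chunks: Sequence[bytes],
    configure_before_chunk: int,
    configure_message: Dict[str, object],
) -> None:
    """Send audio in order, slipping one Configure message in ahead of a chosen chunk.

    Hypothetical helper: `send` stands in for the connection's send function.
    """
    for index, chunk in enumerate(audio_chunks):
        if index == configure_before_chunk:
            # Chunks sent before this point are handled with the old settings;
            # chunks sent after it are handled with the new ones.
            send(json.dumps(configure_message))
        send(chunk)
```

In a voice agent, the natural place for the update is right before the prompt that changes what you expect the user to say, for example just before asking for a PIN.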

Keyterm Overwrite Behavior

Critical: When sending a Configure message with keyterms, the ENTIRE keyterms list is replaced, not merged. If you want to add terms, you must include both existing and new terms.

Example:

Initial keyterms: ["apple", "banana", "orange"]
Configure with: {"keyterms": ["grape", "kiwi"]}
Result: ["grape", "kiwi"]
// "apple", "banana", "orange" are REMOVED

To add terms while keeping existing ones, retrieve the current keyterms first (via application state tracking or the initial configuration), then send a Configure message with the combined list.
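The application state tracking mentioned above could look like the following Python sketch. `KeytermTracker` is a hypothetical class, not part of any SDK: it remembers the list the client believes is active and builds a Configure message containing the combined list.

```python
from typing import Iterable, List

MAX_KEYTERMS = 100  # limit taken from the parameters table


class KeytermTracker:
    """Keeps the client's view of the active keyterm list (hypothetical helper)."""

    def __init__(self, initial: Iterable[str] = ()) -> None:
        self.terms: List[str] = list(initial)

    def configure_adding(self, new_terms: Iterable[str]) -> dict:
        """Build a Configure message that keeps existing terms and appends new ones."""
        combined = list(self.terms)
        for term in new_terms:
            if term not in combined:  # skip duplicates, preserve order
                combined.append(term)
        if len(combined) > MAX_KEYTERMS:
            raise ValueError(f"{len(combined)} keyterms exceeds the limit of {MAX_KEYTERMS}")
        return {"type": "Configure", "keyterms": combined}

    def confirm(self, keyterms: Iterable[str]) -> None:
        """Record a list as active, for example after a ConfigureSuccess."""
        self.terms = list(keyterms)
```

With a tracker started from ["apple", "banana", "orange"], `configure_adding(["grape", "kiwi"])` produces a message carrying all five terms. The tracker only updates its own state in `confirm`, so a rejected update does not leave the local list out of step with the stream.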

Exclusion vs. Clearing

Different behaviors apply when you omit fields versus explicitly clearing them:

| Scenario | JSON Example | Behavior |
| --- | --- | --- |
| Omit keyterms | {"type": "Configure", "thresholds": {...}} | No change to keyterms |
| Empty keyterms array | {"type": "Configure", "keyterms": []} | Clears all keyterms |
| Omit threshold property | {"thresholds": {"eot_threshold": 0.8}} | No change to other thresholds |
| Omit entire thresholds object | {"type": "Configure", "keyterms": [...]} | No change to any thresholds |
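To make the difference between omitting and clearing concrete, here is a small Python model of the semantics in the table. It is a client-side simulation under the rules stated in this section, not the server's implementation, and `apply_configure` is a hypothetical name.

```python
from copy import deepcopy
from typing import Any, Dict


def apply_configure(current: Dict[str, Any], message: Dict[str, Any]) -> Dict[str, Any]:
    """Return the configuration expected after `message` (client-side model only)."""
    result = deepcopy(current)

    # Key absent: keyterms untouched. Key present, even as []: list replaced.
    if "keyterms" in message:
        result["keyterms"] = list(message["keyterms"])

    # Only the thresholds named in the message change; the rest keep their values.
    for name, value in message.get("thresholds", {}).items():
        result.setdefault("thresholds", {})[name] = value

    return result
```

The check `"keyterms" in message` is what separates the first two rows of the table: a missing key leaves the list alone, while an explicit empty array clears it.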

Validation Rules

Configure messages are validated using the same rules as initial connection parameters:

  • eager_eot_threshold must be ≤ eot_threshold (if both are specified in the message)
  • Threshold values must be within valid ranges
  • Keyterms array must contain ≤ 100 terms

Important: A failed Configure message (returning ConfigureFailure) does NOT affect the stream. The connection continues with the previous configuration unchanged.
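The rules above can also be pre-checked on the client before a message is sent, which avoids a round trip for obviously invalid updates. The following Python sketch is one possible pre-check. `validate_configure` is a hypothetical helper, the ranges are copied from the parameters table, and treating the range endpoints as inclusive is an assumption. The server remains the authority, so a ConfigureFailure should still be handled.

```python
from typing import Any, Dict, List

# Ranges copied from the parameters table; inclusive bounds are an assumption.
RANGES = {
    "eot_threshold": (0.5, 0.9),
    "eager_eot_threshold": (0.3, 0.9),
    "eot_timeout_ms": (500, 10000),
}
MAX_KEYTERMS = 100


def validate_configure(message: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a Configure message (empty if none).

    Hypothetical helper: assumes threshold values are numeric.
    """
    errors: List[str] = []
    thresholds = message.get("thresholds", {})

    # Range check for each threshold that is present in the message.
    for name, (low, high) in RANGES.items():
        if name in thresholds and not low <= thresholds[name] <= high:
            errors.append(f"{name} must be between {low} and {high}")

    # The ordering rule is only checked when both values are in the same message.
    eager = thresholds.get("eager_eot_threshold")
    eot = thresholds.get("eot_threshold")
    if eager is not None and eot is not None and eager > eot:
        errors.append("eager_eot_threshold must be less than or equal to eot_threshold")

    if len(message.get("keyterms", [])) > MAX_KEYTERMS:
        errors.append(f"keyterms must contain at most {MAX_KEYTERMS} terms")

    return errors
```

Note that this sketch cannot catch a message whose eager_eot_threshold conflicts with an eot_threshold already active on the stream unless you also compare against your locally tracked configuration.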