API Rate Limits

Understand the different service limits of Deepgram's APIs.

Pay as You Go

Limits to consider if you use the Pay as You Go plan with Deepgram.

Voice Agent

| API | Connection Limits |
| --- | --- |
| Voice Agent API | Up to 45 concurrent connections |

Speech to Text

If multiple services are used in one API call (e.g., Speech to Text + Sentiment Analysis), the lowest of the applicable rate limits is applied.

| Model | Service | Limit |
| --- | --- | --- |
| Flux | Streaming | Up to 150 concurrent requests |
| Nova-3 | Pre-Recorded | Up to 50 concurrent requests |
| Nova-3 | Streaming | Up to 150 concurrent requests |
| Nova-2 | Pre-Recorded | Up to 50 concurrent requests |
| Nova-2 | Streaming | Up to 150 concurrent requests |
| Nova | Pre-Recorded | Up to 50 concurrent requests |
| Nova | Streaming | Up to 150 concurrent requests |
| Enhanced | Pre-Recorded | Up to 50 concurrent requests |
| Enhanced | Streaming | Up to 150 concurrent requests |
| Base | Pre-Recorded | Up to 50 concurrent requests |
| Base | Streaming | Up to 150 concurrent requests |
| Whisper Cloud | Pre-Recorded | Up to 3 concurrent requests |
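The combined-services rule for Speech to Text can be sketched as a simple minimum over the limits involved. The values below are copied from the Pay as You Go tables on this page; the dictionary keys and the names `PAYG_LIMITS` and `effective_limit` are illustrative labels, not identifiers from any Deepgram SDK.

```python
# Illustrative only: a few Pay as You Go limits copied from the tables on this page.
# The keys are made-up labels, not Deepgram API parameters.
PAYG_LIMITS = {
    "nova-3:pre-recorded": 50,
    "nova-3:streaming": 150,
    "sentiment": 10,
    "entity-detection": 5,
}


def effective_limit(services, limits=PAYG_LIMITS):
    """Return the concurrency limit for a call that combines several services.

    When multiple services share one API call, the lowest limit applies.
    """
    if not services:
        raise ValueError("at least one service is required")
    return min(limits[name] for name in services)


# Speech to Text (Nova-3, pre-recorded) combined with Sentiment Analysis.
print(effective_limit(["nova-3:pre-recorded", "sentiment"]))  # 10
```

In this sketch, adding Sentiment Analysis to a pre-recorded Nova-3 request brings the usable concurrency down from 50 to 10.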

If you include Speaker Diarization features in requests to /listen, you will be subject to the service limits noted in the table below.

| Model | Service | Limit |
| --- | --- | --- |
| Speaker Diarization | Pre-Recorded | Up to 50 concurrent requests |
| Speaker Diarization | Streaming | Up to 50 concurrent requests |

Text to Speech REST

| Model | Service Limit |
| --- | --- |
| Aura | Up to 15 concurrent requests |
| Aura-2 | Up to 15 concurrent requests |
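One way to stay under a concurrency limit like this is to cap in-flight requests on the client side. The following is a minimal sketch using `asyncio.Semaphore`; `fake_tts_request` is a hypothetical stand-in for a real Text to Speech call (it only sleeps), and `run_limited` is an illustrative helper name.

```python
import asyncio

MAX_CONCURRENT = 15  # Pay as You Go limit for Text to Speech REST on this page


async def fake_tts_request(text):
    """Hypothetical stand-in for a real Text to Speech call; it only sleeps briefly."""
    await asyncio.sleep(0.01)
    return f"audio for {text!r}"


async def run_limited(texts, limit=MAX_CONCURRENT):
    """Run one request per text, never exceeding `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def worker(text):
        nonlocal in_flight, peak
        async with semaphore:  # waits here while `limit` requests are already running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fake_tts_request(text)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(worker(t) for t in texts))
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(run_limited([f"line {i}" for i in range(40)]))
    print(len(results), "requests finished; peak concurrency:", peak)
```

The semaphore only limits one process. If several workers share the same account, the cap would need to be divided among them or enforced centrally.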

Text to Speech Streaming

| Model | Service Limit |
| --- | --- |
| Aura | Up to 45 concurrent requests |
| Aura-2 | Up to 45 concurrent requests |

Audio Intelligence

If you include Audio Intelligence features in requests to /listen, you will be subject to the service limits noted in the table below.

| Model | Service Limit |
| --- | --- |
| Intent Recognition | Up to 10 concurrent requests |
| Entity Detection | Up to 5 concurrent requests |
| Sentiment Analysis | Up to 10 concurrent requests |
| Summarization | Up to 10 concurrent requests |
| Topic Detection | Up to 10 concurrent requests |

Text Intelligence

| Model | Service Limit |
| --- | --- |
| Intent Recognition | Up to 10 concurrent requests |
| Sentiment Analysis | Up to 10 concurrent requests |
| Summarization | Up to 10 concurrent requests |
| Topic Detection | Up to 10 concurrent requests |
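This page does not say how the API responds when a limit is exceeded. Assuming excess requests are rejected with a rate-limit error (for example HTTP 429 Too Many Requests), a client can retry with exponential backoff. In the sketch below, `RateLimited`, `send`, and `call_with_backoff` are hypothetical names chosen for illustration, and the delay values are arbitrary defaults.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by `send` when the API reports too many requests."""


def call_with_backoff(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call `send()` and retry with exponential backoff plus jitter on RateLimited."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Base delay doubles each attempt (0.5 s, 1 s, 2 s, ...),
            # plus up to 100 ms of random jitter.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

The jitter spreads retries out so that many clients hitting the limit at the same moment do not all retry in lockstep.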

Growth

Limits to consider if you use the Growth plan with Deepgram.

Voice Agent

| API | Connection Limits |
| --- | --- |
| Voice Agent API | Up to 60 concurrent connections |

Speech to Text

If multiple services are used in one API call (e.g., Speech to Text + Sentiment Analysis), the lowest of the applicable rate limits is applied.

| Model | Service | Limit |
| --- | --- | --- |
| Flux | Streaming | Up to 225 concurrent requests |
| Nova-3 | Pre-Recorded | Up to 50 concurrent requests |
| Nova-3 | Streaming | Up to 225 concurrent requests |
| Nova-2 | Pre-Recorded | Up to 50 concurrent requests |
| Nova-2 | Streaming | Up to 225 concurrent requests |
| Nova | Pre-Recorded | Up to 50 concurrent requests |
| Nova | Streaming | Up to 225 concurrent requests |
| Enhanced | Pre-Recorded | Up to 50 concurrent requests |
| Enhanced | Streaming | Up to 225 concurrent requests |
| Base | Pre-Recorded | Up to 50 concurrent requests |
| Base | Streaming | Up to 225 concurrent requests |
| Whisper Cloud | Pre-Recorded | Up to 3 concurrent requests |

If you include Speaker Diarization features in requests to /listen, you will be subject to the service limits noted in the table below.

| Model | Service | Limit |
| --- | --- | --- |
| Speaker Diarization | Pre-Recorded | Up to 50 concurrent requests |
| Speaker Diarization | Streaming | Up to 50 concurrent requests |

Text to Speech REST

| Model | Service Limit |
| --- | --- |
| Aura | Up to 15 concurrent requests |
| Aura-2 | Up to 15 concurrent requests |

Text to Speech Streaming

| Model | Service Limit |
| --- | --- |
| Aura | Up to 60 concurrent requests |
| Aura-2 | Up to 60 concurrent requests |

Audio Intelligence

If you include Audio Intelligence features in requests to /listen, you will be subject to the service limits noted in the table below.

| Model | Service Limit |
| --- | --- |
| Intent Recognition | Up to 10 concurrent requests |
| Entity Detection | Up to 5 concurrent requests |
| Sentiment Analysis | Up to 10 concurrent requests |
| Summarization | Up to 10 concurrent requests |
| Topic Detection | Up to 10 concurrent requests |

Text Intelligence

| Model | Service Limit |
| --- | --- |
| Intent Recognition | Up to 10 concurrent requests |
| Sentiment Analysis | Up to 10 concurrent requests |
| Summarization | Up to 10 concurrent requests |
| Topic Detection | Up to 10 concurrent requests |

Enterprise

Starting limits to consider if you have an Enterprise Contract with Deepgram.

New and existing Enterprise customers can request a Service Limit increase by discussing their needs with the Deepgram Sales Team.

Voice Agent

| API | Connection Limits |
| --- | --- |
| Voice Agent API | Starting at 100 concurrent connections |

Speech to Text

If multiple services are used in one API call (e.g., Speech to Text + Sentiment Analysis), the lowest of the applicable rate limits is applied.

| Model | Service | Limit |
| --- | --- | --- |
| Flux | Streaming | Up to 300 concurrent requests |
| Nova-3 | Pre-Recorded | Starting at 200 concurrent requests |
| Nova-3 | Streaming | Starting at 300 concurrent requests |
| Nova-2 | Pre-Recorded | Starting at 200 concurrent requests |
| Nova-2 | Streaming | Starting at 300 concurrent requests |
| Nova | Pre-Recorded | Starting at 200 concurrent requests |
| Nova | Streaming | Starting at 300 concurrent requests |
| Enhanced | Pre-Recorded | Starting at 200 concurrent requests |
| Enhanced | Streaming | Starting at 300 concurrent requests |
| Base | Pre-Recorded | Starting at 200 concurrent requests |
| Base | Streaming | Starting at 300 concurrent requests |
| Whisper Cloud | Pre-Recorded | Starting at 15 concurrent requests |

If you include Speaker Diarization features in requests to /listen, you will be subject to the service limits noted in the table below.

| Model | Service | Limit |
| --- | --- | --- |
| Speaker Diarization | Pre-Recorded | Up to 100 concurrent requests |
| Speaker Diarization | Streaming | Up to 100 concurrent requests |

Text to Speech REST

| Model | Service Limit |
| --- | --- |
| Aura | Starting at 25 concurrent requests |
| Aura-2 | Starting at 25 concurrent requests |

Text to Speech Streaming

| Model | Service Limit |
| --- | --- |
| Aura | Starting at 100 concurrent requests |
| Aura-2 | Starting at 100 concurrent requests |

Audio Intelligence

If you include Audio Intelligence features in requests to /listen, you will be subject to the service limits noted in the table below.

| Model | Service Limit |
| --- | --- |
| Intent Recognition | Starting at 10 concurrent requests |
| Entity Detection | Starting at 10 concurrent requests |
| Sentiment Analysis | Starting at 10 concurrent requests |
| Summarization | Starting at 20 concurrent requests |
| Topic Detection | Starting at 10 concurrent requests |

Text Intelligence

| Model | Service Limit |
| --- | --- |
| Intent Recognition | Starting at 10 concurrent requests |
| Sentiment Analysis | Starting at 10 concurrent requests |
| Summarization | Starting at 20 concurrent requests |
| Topic Detection | Starting at 10 concurrent requests |
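To compare plans against an expected load, the streaming Speech to Text limits listed on this page (for example, for Nova-3) can be put in a small lookup. This is only a sketch: `STREAMING_LIMITS` and `smallest_plan_for` are illustrative names, and the Enterprise value is a starting limit that can be raised by request.

```python
# Streaming Speech to Text limits per plan, copied from this page.
# The Enterprise value is a starting limit, not a fixed ceiling.
STREAMING_LIMITS = [
    ("Pay as You Go", 150),
    ("Growth", 225),
    ("Enterprise", 300),
]


def smallest_plan_for(concurrent_streams):
    """Return the first plan whose listed streaming limit covers the need, else None."""
    for plan, limit in STREAMING_LIMITS:
        if concurrent_streams <= limit:
            return plan
    # Above the Enterprise starting limit: a Service Limit increase would be needed.
    return None


print(smallest_plan_for(200))  # Growth
```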