API Rate Limits
Rate limits vary by region. North America and Europe limits are shown separately for each service.
- North America: api.deepgram.com
- Europe: api.eu.deepgram.com
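Because limits are tracked per region, a client usually needs to know which host it is talking to. A minimal sketch of selecting the base URL from the two hostnames listed above follows. The helper name `base_url` and the region keys `"na"` and `"eu"` are assumptions made for illustration and are not part of any Deepgram SDK.

```python
# Hostnames are taken from the list above. The dictionary keys and the
# helper name are hypothetical, chosen only for this example.
BASE_HOSTS = {
    "na": "api.deepgram.com",
    "eu": "api.eu.deepgram.com",
}


def base_url(region: str) -> str:
    """Return the HTTPS base URL for a region key ("na" or "eu")."""
    try:
        return f"https://{BASE_HOSTS[region.lower()]}"
    except KeyError:
        raise ValueError(f"unknown region: {region!r}") from None
```

Keeping the mapping in one place makes it harder to send traffic to a region whose limits you did not plan for.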
Pay as You Go
Limits to consider if you use the Pay as You Go plan with Deepgram.
Rate limits apply per project
Rate limits apply per project, not per account or API key. Creating additional projects under the same account will not grant you additional concurrency. Secondary projects created on a self-serve account are limited to a single concurrent stream by design. Bypassing rate limits by spreading traffic across multiple projects violates our Terms of Service.
If you need higher concurrency, contact sales about a Growth or Enterprise agreement.
Voice Agent
Speech to Text
If multiple services are used in one API call (e.g., Speech to Text + Sentiment Analysis), the lower of the rate limits is applied.
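As an illustration of the "lower limit applies" rule, the sketch below computes the effective limit for a combined call. The function name, the service keys, and the numbers are placeholders assumed for the example; they are not Deepgram's actual limits.

```python
def effective_limit(limits: dict[str, int], services_used: list[str]) -> int:
    """Return the lowest limit among the services used in one API call."""
    if not services_used:
        raise ValueError("at least one service is required")
    return min(limits[name] for name in services_used)


# Placeholder numbers for illustration only; not Deepgram's actual limits.
example_limits = {"speech_to_text": 100, "sentiment_analysis": 10}
print(effective_limit(example_limits, ["speech_to_text", "sentiment_analysis"]))  # prints 10
```

In this hypothetical case, a request that combines both services is governed by the sentiment analysis limit, even though speech to text alone would allow more.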
If you include Speaker Diarization features in requests to /listen, you will be subject to the service limits noted in the table below.
Text to Speech REST
Text to Speech Streaming
Audio Intelligence
If you include Audio Intelligence features in requests to /listen, you will be subject to the service limits noted in the table below.
Text Intelligence
Growth
Limits to consider if you use the Growth plan with Deepgram.
Voice Agent
Speech to Text
If multiple services are used in one API call (e.g., Speech to Text + Sentiment Analysis), the lower of the rate limits is applied.
If you include Speaker Diarization features in requests to /listen, you will be subject to the service limits noted in the table below.
Text to Speech REST
Text to Speech Streaming
Audio Intelligence
If you include Audio Intelligence features in requests to /listen, you will be subject to the service limits noted in the table below.
Text Intelligence
Enterprise
Starting limits to consider if you have an Enterprise Contract with Deepgram. Enterprise limits are the same across all regions.
New and existing Enterprise customers can request a Service Limit increase by discussing their needs with the Deepgram Sales Team.
Voice Agent
Speech to Text
If multiple services are used in one API call (e.g., Speech to Text + Sentiment Analysis), the lower of the rate limits is applied.
If you include Speaker Diarization features in requests to /listen, you will be subject to the service limits noted in the table below.
Text to Speech REST
Text to Speech Streaming
Audio Intelligence
If you include Audio Intelligence features in requests to /listen, you will be subject to the service limits noted in the table below.
Text Intelligence
Scaling beyond default limits
Rate limits are scoped to a project. One customer maps to one project for the purposes of these limits. Multiple API keys inside a single project all draw from that project’s concurrency pool. Adding more projects under the same account does not increase the concurrency available to you.
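Since every API key in a project shares one concurrency pool, it can help to enforce the cap on the client side with a single limiter shared by all callers. The sketch below uses `asyncio.Semaphore` from the Python standard library. The class name `ProjectLimiter`, the limit of 3, and the `fake_request` stand-in are assumptions for illustration; a real client would substitute its own request coroutine and its plan's actual limit.

```python
import asyncio


class ProjectLimiter:
    """Caps in-flight requests for one project, whichever API key is used."""

    def __init__(self, max_concurrent: int) -> None:
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self.in_flight = 0
        self.peak = 0  # highest number of simultaneous requests observed

    async def run(self, make_request):
        # Wait for a free slot, then run the request coroutine.
        async with self._semaphore:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                return await make_request()
            finally:
                self.in_flight -= 1


async def fake_request() -> str:
    # Hypothetical stand-in for a real API call.
    await asyncio.sleep(0.01)
    return "ok"


async def main() -> int:
    # The limit of 3 is a placeholder, not an actual plan limit.
    limiter = ProjectLimiter(max_concurrent=3)
    await asyncio.gather(*(limiter.run(fake_request) for _ in range(10)))
    return limiter.peak


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Sharing one limiter per project, rather than one per API key, mirrors how the limit is scoped: ten queued requests never exceed the configured number in flight at once.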
If you are hitting your limits
Consolidate your traffic into a single project, then contact sales about moving to a Growth or Enterprise agreement with a higher concurrency allocation.
What not to do
Do not create additional projects, additional accounts, or otherwise distribute traffic across projects to work around per-project limits. These setups are detected. Secondary projects on a self-serve account are restricted to 1 concurrent stream, and bypassing rate limits this way violates our Terms of Service.
REST API rate limits
The concurrency limits on this page apply to inference endpoints (Speech-to-Text, Text-to-Speech, Voice Agent, Intelligence). REST endpoints for project and API key management have their own separate rate limits. See Temporary API Key Limits for details.