Rate limits vary by region. North America and Europe limits are shown separately for each service.
api.deepgram.com
api.eu.deepgram.com
Limits to consider if you use the Pay as You Go plan with Deepgram.
Rate limits apply per project, not per account or API key. Creating additional projects under the same account will not grant you additional concurrency. Secondary projects created on a self-serve account are limited to a single concurrent stream by design. Bypassing rate limits by spreading traffic across multiple projects violates our Terms of Service.
If you need higher concurrency, contact sales about a Growth or Enterprise agreement.
If multiple services are used in one API call (e.g., Speech to Text + Sentiment Analysis), the lower of the rate limits is applied.
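As a rough illustration of the "lower limit wins" rule, the sketch below computes the effective limit for a request that combines several services. The function name, service keys, and numbers are hypothetical placeholders, not Deepgram's actual limits or API; substitute the values from the table for your plan and region.

```python
def effective_limit(service_limits: dict[str, int], services_used: list[str]) -> int:
    """Return the lowest concurrency limit among the services combined in one request."""
    if not services_used:
        raise ValueError("at least one service is required")
    return min(service_limits[name] for name in services_used)


# Placeholder numbers for illustration only; they are not Deepgram's actual limits.
example_limits = {"speech_to_text": 100, "sentiment_analysis": 10}

# A request that combines both services is held to the smaller of the two limits.
print(effective_limit(example_limits, ["speech_to_text", "sentiment_analysis"]))  # prints 10
```

When planning capacity, size your worker pool against this effective value rather than against the limit of the primary service alone.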
If you include Speaker Diarization features in requests to /listen, you will be subject to the service limits noted in the table below.
If you include Audio Intelligence features in requests to /listen, you will be subject to the service limits noted in the table below.
Limits to consider if you use the Growth plan with Deepgram.
If multiple services are used in one API call (e.g., Speech to Text + Sentiment Analysis), the lower of the rate limits is applied.
If you include Speaker Diarization features in requests to /listen, you will be subject to the service limits noted in the table below.
If you include Audio Intelligence features in requests to /listen, you will be subject to the service limits noted in the table below.
Starting limits to consider if you have an Enterprise Contract with Deepgram. Enterprise limits are the same across all regions.
New and existing Enterprise customers can request a Service Limit increase by discussing their needs with the Deepgram Sales Team.
If multiple services are used in one API call (e.g., Speech to Text + Sentiment Analysis), the lower of the rate limits is applied.
If you include Speaker Diarization features in requests to /listen, you will be subject to the service limits noted in the table below.
If you include Audio Intelligence features in requests to /listen, you will be subject to the service limits noted in the table below.
Rate limits are scoped to a project. One customer maps to one project for the purposes of these limits. Multiple API keys inside a single project all draw from that project’s concurrency pool. Adding more projects under the same account does not increase the concurrency available to you.
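Because every API key in a project draws from the same pool, one way to stay under the limit is to gate all outgoing requests through a single client-side semaphore per project, regardless of which key sends them. The following is a minimal sketch under that assumption: `ProjectConcurrencyGate`, `fake_request`, the key names, and the limit of 3 are all hypothetical and are not part of any Deepgram SDK.

```python
import asyncio


class ProjectConcurrencyGate:
    """Client-side cap on in-flight requests for one project, shared by all its API keys."""

    def __init__(self, max_concurrent: int) -> None:
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self.in_flight = 0
        self.peak = 0  # highest number of simultaneous requests observed

    async def run(self, api_key: str, send_request):
        # The key identifies the caller, but every key waits on the same project semaphore.
        async with self._semaphore:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                return await send_request(api_key)
            finally:
                self.in_flight -= 1


async def fake_request(api_key: str) -> str:
    # Stand-in for a real API call; replace with your own client code.
    await asyncio.sleep(0.01)
    return f"done with {api_key}"


async def demo() -> int:
    gate = ProjectConcurrencyGate(max_concurrent=3)  # placeholder limit, not a real plan value
    keys = ["key-a", "key-b"]  # two hypothetical keys in the same project
    await asyncio.gather(*(gate.run(keys[i % 2], fake_request) for i in range(10)))
    return gate.peak


print(asyncio.run(demo()))  # prints 3: ten requests over two keys never exceed the shared cap
```

The gate is deliberately keyed to the project rather than to the API key, mirroring how the limit is scoped on the server side.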
If you need more concurrency, consolidate your traffic into a single project, then contact sales about moving to a Growth or Enterprise agreement with a higher concurrency allocation.
Do not create additional projects, additional accounts, or otherwise distribute traffic across projects to work around per-project limits. These setups are detected. Secondary projects on a self-serve account are restricted to 1 concurrent stream, and bypassing rate limits this way violates our Terms of Service.
The concurrency limits on this page apply to inference endpoints (Speech-to-Text, Text-to-Speech, Voice Agent, Intelligence). REST endpoints for project and API key management have their own separate rate limits. See Temporary API Key Limits for details.