Deepgram can be deployed into your own Amazon Virtual Private Cloud (VPC) environment using Amazon SageMaker AI. Simply subscribe to the Deepgram product in the AWS Marketplace and then deploy a SageMaker Endpoint, using our pre-made SageMaker Model Package.
For an overview of running Deepgram on SageMaker, including benefits, tradeoffs, and pricing, see Amazon SageMaker.
Supported Products
Follow this AWS Marketplace link to see the Deepgram products that are supported on the SageMaker AI platform. No login to your AWS account is required to view this public AWS Marketplace website.
Each transcription language model is published as a separate product listing. Please subscribe to and deploy SageMaker Endpoints for each language model that you wish to utilize. Your application code will need to route to your SageMaker Endpoint for the language model you wish to run inference against.
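The per-language routing described above can be sketched as a small lookup table. This is a minimal illustration, not Deepgram's implementation: the endpoint names and the resolveEndpoint helper are hypothetical placeholders for whatever names you gave your own SageMaker Endpoints.

```typescript
// Hypothetical mapping from language code to the SageMaker Endpoint that
// hosts that language model. Replace the names with your own endpoints.
const endpointByLanguage = new Map<string, string>([
  ["en", "deepgram-stt-english"],
  ["es", "deepgram-stt-spanish"],
]);

// Returns the endpoint name for a language code, or throws if no endpoint
// has been deployed for that language.
function resolveEndpoint(language: string): string {
  const endpointName = endpointByLanguage.get(language.toLowerCase());
  if (endpointName === undefined) {
    throw new Error(`No SageMaker Endpoint configured for language "${language}"`);
  }
  return endpointName;
}
```

Failing loudly on an unknown language keeps a request from being silently sent to an endpoint that hosts a different language model.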
Language Requests: If there is a transcription language that is not currently listed on the AWS Marketplace, please work with your account manager to request additional language models to be added. For a full list of the Deepgram supported transcription languages, check out this document. You can also view the Changelog to see recent product announcements.
For Text-to-Speech (TTS) models, we also publish separate language synthesis models under separate AWS Marketplace product listings. You can choose which TTS languages you’d like to subscribe to and deploy as SageMaker Endpoints into your AWS account.
Limitations
When using Deepgram services in Amazon SageMaker, please be aware of the following limitations.
Deepgram cannot call Large Language Model (LLM) services
Passing a JSON payload for transcription (e.g., referencing a file stored in cloud storage via URL) is unsupported, as the SageMaker isolation model prevents the container from reaching out to external cloud storage
Deepgram custom metrics are not currently available through Amazon SageMaker Endpoints
The connection remains open until you explicitly close the input stream or the endpoint closes the connection, supporting up to 30 minutes of connection time.
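Because a connection is supported for up to 30 minutes, a long-running client may want to reconnect shortly before that limit. The sketch below shows one way to track this. The one-minute safety margin and the shouldReconnect helper are assumptions made for illustration, not part of the Deepgram or SageMaker APIs.

```typescript
// Maximum connection time stated in the limitations above.
const MAX_CONNECTION_MS = 30 * 60 * 1000;
// Hypothetical safety margin so the client reconnects before the limit.
const RECONNECT_MARGIN_MS = 60 * 1000;

// Returns true once the connection is within the safety margin of the limit.
function shouldReconnect(connectedAtMs: number, nowMs: number = Date.now()): boolean {
  return nowMs - connectedAtMs >= MAX_CONNECTION_MS - RECONNECT_MARGIN_MS;
}
```

A client could call shouldReconnect(connectedAt) between audio chunks and, when it returns true, send a CloseStream message and open a new stream.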
Before you can deploy Deepgram on Amazon SageMaker AI, you’ll need to subscribe to the product in the AWS Marketplace.
Keep in mind that you are not billed for the product until you deploy an Amazon SageMaker AI Endpoint resource.
Follow the AWS documentation to create an AWS Identity and Access Management (IAM) role that will be used to run SageMaker Model Endpoints.
You only need to create a single SageMaker execution role, and can reuse this IAM Role to deploy multiple SageMaker Endpoints.
Deploy Deepgram Model Package for SageMaker AI
Once you’ve subscribed to the Deepgram product on AWS Marketplace, you can deploy a SageMaker AI Endpoint.
The SageMaker “Endpoint” resource represents the compute instance that runs the Deepgram Voice AI services.
Once you initiate the resource creation, it will take several minutes to deploy a SageMaker Endpoint.
Click the Submit button to create the SageMaker AI Endpoint.
After following these steps, you should see a new Endpoint in your AWS account.
If you don’t see the Endpoint, ensure that you have selected the correct AWS region in the AWS Management Console.
It may take several minutes for the Endpoint to change to status InService.
Once the Endpoint status has changed to InService, you can monitor the Amazon CloudWatch Logs for the Endpoint to ensure normal operation of the Deepgram services.
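If you prefer to wait for the InService status from code rather than the console, a polling loop along these lines may help. This is a sketch under stated assumptions: getStatus is a hypothetical callback you supply (for example, one that wraps the SageMaker DescribeEndpoint API and returns its EndpointStatus string), and the interval and attempt count are arbitrary defaults.

```typescript
type PollDecision = "ready" | "failed" | "keep-waiting";

// Maps a SageMaker Endpoint status string to the next polling action.
function classifyEndpointStatus(status: string): PollDecision {
  if (status === "InService") return "ready";
  if (status === "Failed") return "failed";
  return "keep-waiting"; // for example "Creating" or "Updating"
}

// getStatus is a hypothetical callback that returns the current status.
async function waitForInService(
  getStatus: () => Promise<string>,
  intervalMs: number = 15000,
  maxAttempts: number = 80,
): Promise<void> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const decision = classifyEndpointStatus(await getStatus());
    if (decision === "ready") return;
    if (decision === "failed") throw new Error("Endpoint entered the Failed status");
    // Wait before asking for the status again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Timed out waiting for the Endpoint to reach InService");
}
```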
Inference
Speech-to-Text (STT) Streaming
Once you’ve deployed the Deepgram services as a SageMaker Endpoint, you can run streaming inference against the Endpoint using a supported AWS Software Development Kit (SDK).
This capability requires the InvokeEndpointWithBidirectionalStream API in the Amazon SageMaker AI service.
The Deepgram WebSocket payloads do not change in SageMaker; however, they are wrapped in an additional data structure required by the Amazon SageMaker API.
For example, to send an audio payload (an array of bytes), a WebSocket KeepAlive message, or a CloseStream message to the Deepgram STT streaming API, you would use the following payloads.
SageMaker Bidirectional Payload Examples
// Deepgram STT streaming on Amazon SageMaker audio chunk
PayloadPart: {
  Bytes: [0, 255, 128, 64, 250, 5],
  DataType: "BINARY",
},

// Deepgram STT streaming on Amazon SageMaker "KeepAlive" message
PayloadPart: {
  Bytes: new TextEncoder().encode(JSON.stringify({
    type: "KeepAlive",
  })),
  DataType: "UTF8",
},

// Deepgram STT streaming on Amazon SageMaker "CloseStream" message
PayloadPart: {
  Bytes: new TextEncoder().encode(JSON.stringify({
    type: "CloseStream",
  })),
  DataType: "UTF8",
},
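The three payload shapes above can be produced by a pair of small helpers. This is a minimal sketch: the PayloadPart and StreamEvent types and the audioChunk and controlMessage functions are local, hypothetical names that mirror the wrapper shown above. They are not types exported by the AWS SDK.

```typescript
// Local types mirroring the wrapper structure shown above (not SDK types).
type PayloadPart = { Bytes: Uint8Array; DataType: "BINARY" | "UTF8" };
type StreamEvent = { PayloadPart: PayloadPart };

const encoder = new TextEncoder();

// Wraps raw audio bytes as a binary payload.
function audioChunk(bytes: Uint8Array): StreamEvent {
  return { PayloadPart: { Bytes: bytes, DataType: "BINARY" } };
}

// Wraps a Deepgram control message as UTF-8 encoded JSON.
function controlMessage(type: "KeepAlive" | "CloseStream"): StreamEvent {
  return {
    PayloadPart: {
      Bytes: encoder.encode(JSON.stringify({ type })),
      DataType: "UTF8",
    },
  };
}
```

For example, controlMessage("KeepAlive") yields a UTF8 payload whose bytes decode to the JSON text {"type":"KeepAlive"}.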
Let’s walk through the code to run inference against the Deepgram Speech-to-Text (STT) streaming transcription endpoint in SageMaker AI.
First, import the necessary items from the Amazon SageMaker AI runtime API.
You’ll also need to implement the createWebSocketStream function, an async generator that continuously yields WebSocket client messages to the Deepgram API.
This function could read a live audio stream from a microphone, using the @mastra/node-audio package, or stream a local WAV file in chunks.
TypeScript
async function* createWebSocketStream() {
  // Read microphone or WAV file in chunks and continuously yield
  while (true) {
    yield {
      PayloadPart: {
        // The raw audio bytes should be provided as a Uint8Array
        Bytes: new Uint8Array([0,1,2,3,4,5]),
        DataType: "BINARY",
      },
    }
  }
}
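As a variation on the placeholder generator above, the following sketch splits an in-memory audio buffer into fixed-size chunks and ends with a CloseStream message. It assumes the audio has already been loaded into a Uint8Array. The chunkAudio and createBufferedStream names, the StreamEvent type, and the 8192-byte chunk size are illustrative choices, not values prescribed by Deepgram or AWS.

```typescript
// Local type mirroring the payload wrapper used in this guide (not an SDK type).
type StreamEvent = {
  PayloadPart: { Bytes: Uint8Array; DataType: "BINARY" | "UTF8" };
};

// Yields the audio in chunkSize-byte pieces, then a CloseStream message.
function* chunkAudio(audio: Uint8Array, chunkSize: number): Generator<StreamEvent> {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  for (let offset = 0; offset < audio.length; offset += chunkSize) {
    // subarray creates a view, so no bytes are copied.
    yield {
      PayloadPart: { Bytes: audio.subarray(offset, offset + chunkSize), DataType: "BINARY" },
    };
  }
  yield {
    PayloadPart: {
      Bytes: new TextEncoder().encode(JSON.stringify({ type: "CloseStream" })),
      DataType: "UTF8",
    },
  };
}

// Async wrapper with the same shape as createWebSocketStream above.
async function* createBufferedStream(audio: Uint8Array): AsyncGenerator<StreamEvent> {
  yield* chunkAudio(audio, 8192);
}
```

Keeping the chunking logic in a synchronous generator makes it easy to unit test, while the async wrapper matches the generator shape used elsewhere in this guide.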
Here is a complete example:
TypeScript Complete Example
// Sample script to capture microphone input and stream to Amazon SageMaker bidirectional
// streaming endpoint with Deepgram transcription Voice AI models.
// ...
    console.error('Deployment process failed:', error);
    throw error;
  }
}

declare const require: any;
declare const module: any;
declare const process: any;

if (typeof require !== 'undefined' && require.main === module) {
  main().catch(error => {
    console.error('Script execution failed:', error);
    if (typeof process !== 'undefined') {
      process.exit(1);
    }
  });
}

export {
  invokeEndpointWithBidirectionalStream,
  config,
  bidiEndpoint
};
Troubleshooting
If you’re experiencing any issues with your Deepgram deployment on Amazon SageMaker AI, you can obtain the Deepgram container logs from the Amazon CloudWatch service.
If you open the SageMaker AI Endpoint resource details, there will be a link to open the Amazon CloudWatch Log Group for that endpoint.
Within the CloudWatch Log Group, there should be a Log Stream that contains the Deepgram logs for all components.
You can use the Amazon CloudWatch Logs Live Tail feature to watch logs in near real time while you are sending requests to the Deepgram API via the SageMaker AI APIs.
To use the CloudWatch Logs Live Tail feature locally, you can use the aws logs start-live-tail command from the AWS CLI tool.
If you experience any issues using Deepgram services running on the Amazon SageMaker AI platform, please review this checklist before contacting Deepgram support.
Ensure that your application’s AWS IAM User or IAM Role has permission to call the InvokeEndpointWithBidirectionalStream SageMaker AI action.
Ensure your application is targeting the correct AWS account and region, where your SageMaker Endpoint exists.
Ensure the Deepgram product you’ve deployed from the AWS Marketplace (e.g., streaming Speech-to-Text) corresponds to the Deepgram API you’re calling.
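For the IAM permission item in the checklist above, a policy document along these lines may be a useful starting point. Treat it as a sketch: the sagemaker: action prefix combined with the InvokeEndpointWithBidirectionalStream action name is an assumption to verify against the AWS documentation, and the invokePolicy helper and example values are hypothetical.

```typescript
// Builds an IAM policy document allowing bidirectional streaming invocation
// of a single SageMaker Endpoint. The action string is an assumption; confirm
// it against the AWS IAM reference for SageMaker before use.
function invokePolicy(region: string, accountId: string, endpointName: string) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Action: "sagemaker:InvokeEndpointWithBidirectionalStream",
        Resource: `arn:aws:sagemaker:${region}:${accountId}:endpoint/${endpointName}`,
      },
    ],
  };
}

// Example with placeholder values.
const policy = invokePolicy("us-east-1", "123456789012", "deepgram-stt-english");
```

Scoping the Resource to one endpoint ARN, rather than a wildcard, limits the application to the endpoint it is meant to call.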
There is a known compatibility issue when using pre-Blackwell NVIDIA GPUs with the latest SageMaker-provided AMI, al2023-ami-sagemaker-inference-gpu-4-1, which includes NVIDIA driver version 580. When creating a SageMaker Endpoint Configuration resource that uses the g4dn, g5, g6, or g6e instance family, be sure to select an AMI version earlier than this one. You can also reference this AWS supported configurations table.
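A pre-deployment check for the combination described above could look like the following sketch. The hasKnownDriverIssue helper is hypothetical, and it assumes instance types follow the usual ml.family.size pattern and that the AMI is identified by the version string shown above.

```typescript
// AMI version and instance families named in the compatibility note above.
const AFFECTED_AMI = "al2023-ami-sagemaker-inference-gpu-4-1";
const AFFECTED_FAMILIES = new Set(["g4dn", "g5", "g6", "g6e"]);

// Extracts the family from an instance type such as "ml.g5.2xlarge".
function instanceFamily(instanceType: string): string {
  const [, family = ""] = instanceType.split(".");
  return family;
}

// Returns true when the instance family and AMI match the known issue.
function hasKnownDriverIssue(instanceType: string, inferenceAmiVersion: string): boolean {
  return (
    inferenceAmiVersion === AFFECTED_AMI &&
    AFFECTED_FAMILIES.has(instanceFamily(instanceType))
  );
}
```

Running this check before creating the Endpoint Configuration can catch an affected combination early, before the endpoint fails to start.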
If you have received a SageMaker private offer for a management account of an AWS organization, you may use AWS License Manager to grant usage of the SageMaker private offer to member accounts within your AWS organization as a Marketplace license entitlement.