Validate a Deepgram SageMaker Endpoint
Test a Deepgram SageMaker endpoint using the open-source client scripts in the dg-sagemaker repository, with one script per product and language.
Once your SageMaker endpoint reaches InService, run a client script against it to confirm that inference works end to end. Deepgram publishes a ready-made set of test clients in the dg-sagemaker repository. Each client wraps the SageMaker bidirectional streaming or batch (HTTP) invocation API with the payload shape that the target Deepgram model expects, so you can focus on verifying the endpoint rather than writing transport code.
The repository is organized by product and by language. Pick the script that matches the model you deployed (Flux, Nova-3, or Aura) and the language you want to work in (Python, Node.js, or Java). The scripts are load-testing clients first and functional smoke tests second — running any of them with a single connection is the fastest way to prove that your endpoint accepts traffic and returns results.
The scripts evolve independently. Flags, defaults, and input formats differ between products and between languages. Always read the README.md inside the subdirectory you intend to run before invoking a script.
Install the language runtime and package manager for the script you plan to run. See Required tooling.
Configure AWS credentials locally and ensure your shell targets the region where the SageMaker endpoint is deployed. See AWS credentials and region.
Identify the subdirectory for your Deepgram product and language. See Pick the right script.
A Deepgram SageMaker endpoint in InService status. See Deploy Deepgram on Amazon SageMaker.
An IAM identity with the following permissions:
sagemaker:InvokeEndpoint
sagemaker:InvokeEndpointWithResponseStream
sagemaker:InvokeEndpointWithBidirectionalStream
sagemaker:ListEndpoints and sagemaker:DescribeEndpoint (used by list-endpoints helpers)
git available on your local machine.
Each Deepgram product has a dedicated script because the SageMaker payload shape and the protocol on top of it differ per model. Run the script that matches the model you deployed.
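The IAM actions listed in the prerequisites can be collected into a single policy document. The sketch below is one possible layout, not an official Deepgram or AWS policy: the helper name `build_invoke_policy` and the example ARN are hypothetical, and scoping the invoke actions to one endpoint ARN while granting `sagemaker:ListEndpoints` on `*` is an assumption you should check against your own security requirements.

```python
import json


def build_invoke_policy(endpoint_arn: str) -> dict:
    """Return an IAM policy document covering the actions named in the prerequisites."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Actions that target one specific endpoint.
                "Sid": "InvokeDeepgramEndpoint",
                "Effect": "Allow",
                "Action": [
                    "sagemaker:InvokeEndpoint",
                    "sagemaker:InvokeEndpointWithResponseStream",
                    "sagemaker:InvokeEndpointWithBidirectionalStream",
                    "sagemaker:DescribeEndpoint",
                ],
                "Resource": endpoint_arn,
            },
            {
                # Listing is not tied to a single endpoint, so it is granted on "*".
                "Sid": "ListEndpoints",
                "Effect": "Allow",
                "Action": ["sagemaker:ListEndpoints"],
                "Resource": "*",
            },
        ],
    }


# Hypothetical ARN, for illustration only.
EXAMPLE_ARN = "arn:aws:sagemaker:us-east-1:123456789012:endpoint/your-endpoint-name"
policy = build_invoke_policy(EXAMPLE_ARN)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console or attached with your usual infrastructure tooling after replacing the placeholder ARN.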
Invocation flags, defaults, and required inputs vary per script. Consult the README.md in each subdirectory for the full command reference. For example, the Flux client uses the /v2/listen endpoint and a turn-based protocol, while the Nova-3 client uses /v1/listen with channel-based alternatives.
Install only the tools you need for the script you plan to run.
python-stt)
PortAudio (required for python-tts and for microphone modes of python-stt/python-flux):
macOS (Homebrew): brew install portaudio
Debian/Ubuntu: sudo apt-get install portaudio19-dev
Install dependencies inside the script’s subdirectory:
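The scripts use the standard AWS credential chain: environment variables, the shared credentials file, or an attached IAM role. Configure credentials with whichever mechanism you prefer, for example by running aws configure to populate the shared credentials file.
Or export them for the current shell as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (plus AWS_SESSION_TOKEN for temporary credentials).
Verify the identity the scripts will use by running aws sts get-caller-identity.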
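The identity check can also be done from Python. The sketch below assumes a boto3 STS client, whose `get_caller_identity()` call returns the account ID and ARN of the active credentials. The helper name `describe_caller` and the `FakeSTS` stand-in are hypothetical and exist only so the example runs without AWS access.

```python
def describe_caller(sts_client) -> str:
    """Return a one-line summary of the identity behind the given STS client."""
    identity = sts_client.get_caller_identity()
    return f"account={identity['Account']} arn={identity['Arn']}"


class FakeSTS:
    """Offline stand-in for boto3.client("sts"); values are illustrative only."""

    def get_caller_identity(self):
        return {
            "UserId": "AIDAEXAMPLE",
            "Account": "123456789012",
            "Arn": "arn:aws:iam::123456789012:user/example",
        }


print(describe_caller(FakeSTS()))

# With real credentials (assumes boto3 is installed):
#   import boto3
#   print(describe_caller(boto3.client("sts")))
```

If the ARN printed with a real client is not the identity you expected, the scripts will run with the same unexpected identity, since they resolve credentials through the same chain.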
Most scripts accept a --region flag and default to us-east-1 (or us-east-2 for python-tts). Make sure the region you pass matches the region where your SageMaker endpoint was deployed — an endpoint name is only resolvable within the region it was created in.
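To make the region rule concrete, here is a minimal sketch of how a region could be resolved before calling an endpoint. The precedence shown (explicit flag, then `AWS_REGION`, then `AWS_DEFAULT_REGION`, then a default) is an assumption for illustration. The actual scripts may resolve the region differently, so check each README.md. `resolve_region` is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional


def resolve_region(
    cli_region: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
    default: str = "us-east-1",
) -> str:
    """Pick a region: explicit flag first, then environment variables, then the default."""
    return (
        cli_region
        or env.get("AWS_REGION")
        or env.get("AWS_DEFAULT_REGION")
        or default
    )


# An explicit --region value wins over anything in the environment.
print(resolve_region("eu-west-1", {"AWS_REGION": "us-west-2"}))
# With nothing set, fall back to the default.
print(resolve_region(None, {}))
```

Passing the region explicitly on every invocation avoids surprises when a script's built-in default differs from the region where the endpoint lives.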
If you are not sure which endpoints exist in a region, use the Flux client’s helper subcommand to list them:
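Independently of the Flux helper, endpoints can also be listed directly through the SageMaker API. The sketch below is not the Flux client's implementation. It assumes a boto3 SageMaker client and uses its `list_endpoints` call with the `StatusEquals` filter and `NextToken` pagination. `in_service_endpoints` and `FakeSageMaker` are hypothetical names, and the fake client exists only so the example runs offline.

```python
from typing import Iterator


def in_service_endpoints(sm_client) -> Iterator[str]:
    """Yield the names of InService endpoints, following NextToken pagination."""
    kwargs = {"StatusEquals": "InService"}
    while True:
        page = sm_client.list_endpoints(**kwargs)
        for endpoint in page.get("Endpoints", []):
            yield endpoint["EndpointName"]
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token


class FakeSageMaker:
    """Offline stand-in that serves two pages of illustrative results."""

    def __init__(self):
        self._pages = {
            None: {"Endpoints": [{"EndpointName": "flux-demo"}], "NextToken": "page-2"},
            "page-2": {"Endpoints": [{"EndpointName": "nova-demo"}]},
        }

    def list_endpoints(self, **kwargs):
        return self._pages[kwargs.get("NextToken")]


print(list(in_service_endpoints(FakeSageMaker())))

# With real credentials (assumes boto3 is installed):
#   import boto3
#   client = boto3.client("sagemaker", region_name="us-east-1")
#   print(list(in_service_endpoints(client)))
```

Because the client is bound to one region, an endpoint deployed elsewhere will not appear in the output, which is a quick way to spot a region mismatch.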
Once tooling and credentials are in place, invoke the script for your product. The example below shows the simplest single-connection invocation for the Flux Python client. The Nova-3 and Aura clients follow a similar pattern but with different flags, defaults, and input formats — refer to the dg-sagemaker repository and each subdirectory’s README.md for the full command reference.
For best compatibility, use a 16-bit PCM WAV file. Convert other formats with:
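Before converting, it can help to check whether a file already is a 16-bit PCM WAV. The sketch below does this with Python's standard `wave` module, which only reads PCM WAV data and reports the sample width in bytes. The helper names `is_16bit_pcm_wav` and `make_wav` are hypothetical, and the 16 kHz mono test signal is an arbitrary choice for the demonstration, not a requirement stated by the scripts.

```python
import io
import wave


def is_16bit_pcm_wav(source) -> bool:
    """Return True if source (a path or binary file object) is a PCM WAV with 16-bit samples."""
    try:
        with wave.open(source, "rb") as wav:
            return wav.getsampwidth() == 2  # 2 bytes per sample == 16 bits
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE file, a non-PCM encoding, or a truncated file.
        return False


def make_wav(sample_width: int) -> io.BytesIO:
    """Build a tiny in-memory WAV of silence with the given sample width in bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width)
        wav.setframerate(16000)
        wav.writeframes(b"\x00" * sample_width * 160)
    buf.seek(0)
    return buf


print(is_16bit_pcm_wav(make_wav(2)))  # 16-bit file
print(is_16bit_pcm_wav(make_wav(1)))  # 8-bit file
```

In practice you would pass a file path, for example `is_16bit_pcm_wav("sample.wav")`, and convert the file only when the check returns False.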
A successful test produces:
TurnInfo events) are logged to the console.
If the script hangs, errors on connection, or fails to authenticate, see Troubleshooting.
ResourceNotFound or endpoint lookup failures: confirm the endpoint name and that your shell is targeting the correct region. Run aws sagemaker describe-endpoint --endpoint-name your-endpoint-name --region your-region to verify.
AccessDeniedException: confirm your IAM identity has sagemaker:InvokeEndpoint (and the bidirectional/streaming variants) for the endpoint’s ARN.
InService. Deployment typically takes several minutes.
ffmpeg command as shown above.
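The endpoint status check from the troubleshooting list can also be scripted. The sketch below polls `describe_endpoint` on a boto3 SageMaker client until the `EndpointStatus` field reads `InService`. The helper name `wait_until_in_service`, the attempt count, and the delay are assumptions for illustration, and `FakeEndpointClient` is a hypothetical stand-in so the example runs offline.

```python
import time


def wait_until_in_service(sm_client, endpoint_name, attempts=60, delay_s=15.0, sleep=time.sleep):
    """Poll the endpoint status; return True once InService, False if attempts run out."""
    for _ in range(attempts):
        response = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = response["EndpointStatus"]
        if status == "InService":
            return True
        if status == "Failed":
            raise RuntimeError(f"endpoint {endpoint_name} is in Failed state")
        sleep(delay_s)
    return False


class FakeEndpointClient:
    """Offline stand-in that replays a fixed sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)
        self.calls = 0

    def describe_endpoint(self, EndpointName):
        self.calls += 1
        return {"EndpointName": EndpointName, "EndpointStatus": next(self._statuses)}


client = FakeEndpointClient(["Creating", "Creating", "InService"])
# A no-op sleep keeps the demonstration instant.
ready = wait_until_in_service(client, "your-endpoint-name", sleep=lambda seconds: None)
print(ready, client.calls)

# With real credentials (assumes boto3 is installed):
#   import boto3
#   sm = boto3.client("sagemaker", region_name="us-east-1")
#   wait_until_in_service(sm, "your-endpoint-name")
```

boto3 also ships a built-in `endpoint_in_service` waiter that serves the same purpose. The manual loop is shown here only because it makes the status transitions explicit.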