Validate a Deepgram SageMaker Endpoint
Test a Deepgram SageMaker endpoint using the open-source client scripts in the dg-sagemaker repository, with one script per product and language.
Once your SageMaker endpoint reaches InService, run a client script against it to confirm that inference works end to end. Deepgram publishes a ready-made set of test clients in the dg-sagemaker repository. Each client wraps the SageMaker bidirectional streaming or batch (HTTP) invocation API with the payload shape that the target Deepgram model expects, so you can focus on verifying the endpoint rather than writing transport code.
The repository is organized by product and by language. Pick the script that matches the model you deployed (Flux, Nova-3, or Aura) and the language you want to work in (Python, Node.js, or Java). The scripts are load-testing clients first and functional smoke tests second, but running any of them with a single connection is the fastest way to prove that your endpoint accepts traffic and returns results.
The scripts evolve independently. Flags, defaults, and input formats differ between products and between languages. Always read the README.md inside the subdirectory you intend to run before invoking a script.
High-level process
Install the required tooling
Install the language runtime and package manager for the script you plan to run. See Required tooling.
Configure AWS credentials and region
Configure AWS credentials locally and ensure your shell targets the region where the SageMaker endpoint is deployed. See AWS credentials and region.
Choose the script that matches your product
Identify the subdirectory for your Deepgram product and language. See Pick the right script.
Prerequisites
- A Deepgram SageMaker endpoint in InService status. See Deploy Deepgram on Amazon SageMaker.
- AWS IAM permissions to invoke the endpoint and, optionally, to list endpoints in the target region:
  - sagemaker:InvokeEndpoint
  - sagemaker:InvokeEndpointWithResponseStream
  - sagemaker:InvokeEndpointWithBidirectionalStream
  - sagemaker:ListEndpoints and sagemaker:DescribeEndpoint (used by list-endpoints helpers)
- git available on your local machine.
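The permissions listed above can be expressed as an IAM policy document. The following is a minimal sketch, not a policy published by Deepgram: the function name, the statement IDs, and the choice to scope sagemaker:ListEndpoints to "*" (on the assumption that list actions are not restricted to a single endpoint ARN) are illustrative. If in doubt about the ARN, copy the exact value reported by describe-endpoint.

```python
import json


def invoke_policy(region: str, account_id: str, endpoint_name: str) -> dict:
    """Build an IAM policy document for testing one SageMaker endpoint.

    Hypothetical helper for illustration; adjust the actions to the
    invocation modes your chosen client script actually uses.
    """
    endpoint_arn = f"arn:aws:sagemaker:{region}:{account_id}:endpoint/{endpoint_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeAndDescribeEndpoint",
                "Effect": "Allow",
                "Action": [
                    "sagemaker:InvokeEndpoint",
                    "sagemaker:InvokeEndpointWithResponseStream",
                    "sagemaker:InvokeEndpointWithBidirectionalStream",
                    "sagemaker:DescribeEndpoint",
                ],
                "Resource": endpoint_arn,
            },
            {
                # Assumption: listing is granted across the account and region.
                "Sid": "ListEndpoints",
                "Effect": "Allow",
                "Action": "sagemaker:ListEndpoints",
                "Resource": "*",
            },
        ],
    }


# Render the policy as JSON, ready to paste into the IAM console.
policy_json = json.dumps(
    invoke_policy("us-east-1", "123456789012", "your-endpoint-name"), indent=2
)
```

The account ID and endpoint name here are placeholders. The optional second statement can be dropped if you never use the list-endpoints helpers.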
Pick the right script
Each Deepgram product has a dedicated script because the SageMaker payload shape and the protocol on top of it differ per model. Run the script that matches the model you deployed.
Invocation flags, defaults, and required inputs vary per script. Consult the README.md in each subdirectory for the full command reference. For example, the Flux client uses the /v2/listen endpoint and a turn-based protocol, while the Nova-3 client uses /v1/listen with channel-based alternatives.
Required tooling
Install only the tools you need for the script you plan to run.
Python
- Python 3.12+ (Python 3.14+ for python-stt)
- uv package manager
- PortAudio for microphone or audio playback (required for python-tts and for microphone modes of python-stt/python-flux):
  - macOS: brew install portaudio
  - Linux: sudo apt-get install portaudio19-dev
Install dependencies inside the script’s subdirectory:
AWS credentials and region
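The scripts use the standard AWS credential chain: environment variables, the shared credentials file, or an attached IAM role. Configure credentials with whichever mechanism you prefer, for example interactively with aws configure.
Or export them for the current shell by setting AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (plus AWS_SESSION_TOKEN for temporary credentials).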
Verify the identity the scripts will use:
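As a sketch, the identity check can be scripted around the standard AWS CLI command aws sts get-caller-identity. The wrapper below assumes the AWS CLI is installed and on your PATH. The function name and the injectable run parameter are illustrative and are not part of the dg-sagemaker repository.

```python
import json
import subprocess


def caller_identity(run=subprocess.run) -> dict:
    """Return the identity reported by `aws sts get-caller-identity`.

    `run` is injectable so the function can be exercised without the AWS CLI.
    """
    result = run(
        ["aws", "sts", "get-caller-identity", "--output", "json"],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if the CLI call fails
    )
    # The response contains the UserId, Account, and Arn fields.
    return json.loads(result.stdout)


# Usage (requires the AWS CLI and valid credentials):
#   identity = caller_identity()
#   print(identity["Account"], identity["Arn"])
```

The returned Arn is the principal that needs the SageMaker invoke permissions, so it is worth comparing against the IAM identity you expect before running a client.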
Most scripts accept a --region flag and default to us-east-1 (or us-east-2 for python-tts). Make sure the region you pass matches the region where your SageMaker endpoint was deployed — an endpoint name is only resolvable within the region it was created in.
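Because the region is encoded in the endpoint's ARN, a mismatch can be caught before running a client. The following is a minimal sketch with hypothetical helper names. It relies only on the general ARN layout (arn:partition:service:region:account-id:resource) and does not call AWS.

```python
def region_from_arn(arn: str) -> str:
    """Extract the region field from an ARN."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return parts[3]


def check_region(endpoint_arn: str, requested_region: str) -> None:
    """Raise if the region passed to a script differs from the endpoint's region."""
    actual = region_from_arn(endpoint_arn)
    if actual != requested_region:
        raise ValueError(
            f"endpoint lives in {actual}, but the client is targeting {requested_region}"
        )


# Example: this endpoint was created in us-east-2, so passing us-east-2 succeeds.
check_region("arn:aws:sagemaker:us-east-2:123456789012:endpoint/demo", "us-east-2")
```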
If you are not sure which endpoints exist in a region, use the Flux client’s helper subcommand to list them:
Run the script
Once tooling and credentials are in place, invoke the script for your product. The example below shows the simplest single-connection invocation for the Flux Python client. The Nova-3 and Aura clients follow a similar pattern but with different flags, defaults, and input formats — refer to the dg-sagemaker repository and each subdirectory’s README.md for the full command reference.
For best compatibility, use a 16-bit PCM WAV file. Convert other formats with:
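One way to handle the conversion is sketched below in Python around a generic ffmpeg invocation. The helper names, the 16 kHz sample rate, and the mono channel count are assumptions chosen for illustration, not values taken from the dg-sagemaker repository. Check the README.md of the script you run for the format it expects.

```python
import subprocess
import wave


def is_pcm16_wav(path: str) -> bool:
    """Return True if the file is a WAV container holding 16-bit samples."""
    try:
        with wave.open(path, "rb") as wav:
            return wav.getsampwidth() == 2  # 2 bytes per sample = 16-bit
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE file, or a WAV encoding the stdlib reader rejects.
        return False


def ffmpeg_command(src: str, dst: str, sample_rate: int = 16000, channels: int = 1) -> list[str]:
    """Build an ffmpeg command that writes signed 16-bit little-endian PCM WAV."""
    return [
        "ffmpeg",
        "-i", src,
        "-acodec", "pcm_s16le",    # 16-bit PCM
        "-ar", str(sample_rate),   # assumed sample rate
        "-ac", str(channels),      # assumed channel count
        dst,
    ]


def ensure_pcm16_wav(src: str, dst: str) -> str:
    """Return a path to 16-bit PCM WAV audio, converting with ffmpeg if needed."""
    if is_pcm16_wav(src):
        return src
    subprocess.run(ffmpeg_command(src, dst), check=True)  # requires ffmpeg on PATH
    return dst
```

Calling ensure_pcm16_wav("speech.mp3", "speech.wav") would shell out to ffmpeg, while a file that is already 16-bit WAV is returned unchanged.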
Verify the result
A successful test produces:
- Speech-to-text (Flux, Nova-3): transcript payloads streamed back to the client. Final transcripts and interim results (or Flux TurnInfo events) are logged to the console.
- Text-to-speech (Aura): audible speech playback from the selected connection and a steady stream of synthesized audio chunks logged to the console.
If the script hangs, errors on connection, or fails to authenticate, see Troubleshooting.
Troubleshooting
- ResourceNotFound or endpoint lookup failures: confirm the endpoint name and that your shell is targeting the correct region. Run aws sagemaker describe-endpoint --endpoint-name your-endpoint-name --region your-region to verify.
- AccessDeniedException: confirm your IAM identity has sagemaker:InvokeEndpoint (and the bidirectional/streaming variants) for the endpoint's ARN.
- Endpoint not yet in service: wait until the endpoint status is InService. Deployment typically takes several minutes.
- Audio format errors: convert the input audio to 16-bit PCM WAV rather than using a compressed format. Use the ffmpeg command as shown above.
- Server-side errors: check the Amazon CloudWatch Logs for the SageMaker endpoint to diagnose container-level issues.
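The describe-endpoint check and the wait for InService can be combined into a small polling loop. This is a sketch under stated assumptions: it shells out to the AWS CLI (which must be installed), the function names, timeout, and poll interval are illustrative, and only the Failed status is treated as terminal.

```python
import json
import subprocess
import time


def endpoint_status(name: str, region: str, run=subprocess.run) -> str:
    """Return the EndpointStatus field from `aws sagemaker describe-endpoint`."""
    result = run(
        ["aws", "sagemaker", "describe-endpoint",
         "--endpoint-name", name, "--region", region, "--output", "json"],
        capture_output=True,
        text=True,
        check=True,  # a wrong name or region surfaces as CalledProcessError
    )
    return json.loads(result.stdout)["EndpointStatus"]


def wait_until_in_service(name: str, region: str, timeout_s: float = 1800.0,
                          poll_s: float = 30.0, run=subprocess.run,
                          sleep=time.sleep) -> None:
    """Poll the endpoint until it reports InService, fails, or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = endpoint_status(name, region, run=run)
        if status == "InService":
            return
        if status == "Failed":
            raise RuntimeError(f"endpoint {name} is in Failed status; check CloudWatch Logs")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"endpoint {name} still {status} after {timeout_s:.0f}s")
        sleep(poll_s)


# Usage (requires the AWS CLI and valid credentials):
#   wait_until_in_service("your-endpoint-name", "your-region")
```

The run and sleep parameters exist only so the loop can be tested without AWS access. In normal use the defaults are sufficient.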