Deploy with Terraform
This guide provides a complete Terraform configuration for deploying Deepgram on Amazon SageMaker. The configuration creates an IAM execution role, a SageMaker Model from your AWS Marketplace subscription, an Endpoint Configuration, and a live Endpoint. An optional module adds auto-scaling based on the ConcurrentRequestsPerModel metric.
Before running Terraform, you must subscribe to a Deepgram product on the AWS Marketplace and note the Model Package ARN. See Subscribe to Deepgram Products for instructions.
Prerequisites
- Terraform 1.5 or later
- AWS credentials configured for the target account (via environment variables, shared credentials file, or an IAM role)
- An active AWS Marketplace subscription to a Deepgram SageMaker product
- The Model Package ARN for the subscribed product (found in the SageMaker console under Marketplace Model Packages → AWS Marketplace Subscriptions)
Project layout
The configuration in this guide is split across four files in a single project directory: variables.tf, main.tf, outputs.tf, and terraform.tfvars. The sections below create each file in turn.
Variables
Create variables.tf with the input variables the configuration needs. The only required value is the Model Package ARN from your Marketplace subscription.
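A minimal sketch of what variables.tf could look like follows. Only model_package_arn, deepgram_engine_env, and deepgram_api_env are named in this guide; the other variable names (aws_region, name_prefix, instance_type, instance_count) and all default values are assumptions chosen for illustration.

```hcl
# variables.tf (illustrative sketch)

# Required: the Model Package ARN from your AWS Marketplace subscription.
variable "model_package_arn" {
  description = "Model Package ARN of the subscribed Deepgram SageMaker product."
  type        = string
}

# Assumed name. The region must match the region in the Model Package ARN.
variable "aws_region" {
  description = "AWS region to deploy into."
  type        = string
  default     = "us-east-1"
}

# Assumed name. Used to build resource names.
variable "name_prefix" {
  description = "Prefix applied to the names of created resources."
  type        = string
  default     = "deepgram"
}

# Assumed name and default. Confirm the instance type against the
# hardware specifications for your Deepgram product.
variable "instance_type" {
  description = "GPU-accelerated SageMaker instance type for the endpoint."
  type        = string
  default     = "ml.g5.2xlarge"
}

# Assumed name.
variable "instance_count" {
  description = "Initial number of instances behind the endpoint."
  type        = number
  default     = 1
}

# Map of suffix => TOML expression, for example { "01" = "..." }.
variable "deepgram_engine_env" {
  description = "Deepgram Engine configuration overrides."
  type        = map(string)
  default     = {}
}

variable "deepgram_api_env" {
  description = "Deepgram API configuration overrides."
  type        = map(string)
  default     = {}
}
```

Because every variable except model_package_arn has a default, Terraform prompts only for the ARN if no value is supplied.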
Main configuration
Create main.tf with the provider, IAM role, and SageMaker resources. The configuration uses the Model Package ARN from your AWS Marketplace subscription to create the model without referencing a container image directly.
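The following is a sketch of a main.tf along these lines, not the definitive configuration. It assumes the hypothetical variables aws_region, name_prefix, instance_type, and instance_count are declared in variables.tf, and it attaches the broad AmazonSageMakerFullAccess managed policy for brevity. The resource labels (sagemaker_execution, deepgram) and the AllTraffic variant name are illustrative choices.

```hcl
# main.tf (illustrative sketch)

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0"
    }
  }
}

provider "aws" {
  region = var.aws_region
}

# Trust policy that lets SageMaker assume the execution role.
data "aws_iam_policy_document" "sagemaker_assume_role" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["sagemaker.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "sagemaker_execution" {
  name               = "${var.name_prefix}-sagemaker-execution"
  assume_role_policy = data.aws_iam_policy_document.sagemaker_assume_role.json
}

# Assumption: a broad managed policy, used here for brevity.
# Consider a narrower policy for production use.
resource "aws_iam_role_policy_attachment" "sagemaker_access" {
  role       = aws_iam_role.sagemaker_execution.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"
}

# The model references the Marketplace Model Package ARN instead of a
# container image.
resource "aws_sagemaker_model" "deepgram" {
  name               = "${var.name_prefix}-model"
  execution_role_arn = aws_iam_role.sagemaker_execution.arn

  # Assumption: AWS Marketplace model packages typically require network
  # isolation. Confirm this setting for your product.
  enable_network_isolation = true

  primary_container {
    model_package_name = var.model_package_arn
  }
}

resource "aws_sagemaker_endpoint_configuration" "deepgram" {
  name = "${var.name_prefix}-endpoint-config"

  production_variants {
    variant_name           = "AllTraffic"
    model_name             = aws_sagemaker_model.deepgram.name
    instance_type          = var.instance_type
    initial_instance_count = var.instance_count
  }
}

resource "aws_sagemaker_endpoint" "deepgram" {
  name                 = "${var.name_prefix}-endpoint"
  endpoint_config_name = aws_sagemaker_endpoint_configuration.deepgram.name
}
```

Terraform infers the creation order from the references between resources: role, then model, then endpoint configuration, then endpoint.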
Outputs
Create outputs.tf to surface the endpoint details after terraform apply completes.
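As a sketch, outputs.tf might expose the endpoint name and ARN along with the execution role ARN. The resource addresses below (aws_sagemaker_endpoint.deepgram and aws_iam_role.sagemaker_execution) are hypothetical and must match the labels used in your main.tf.

```hcl
# outputs.tf (illustrative sketch; resource labels are assumptions)

output "endpoint_name" {
  description = "Name of the SageMaker endpoint, used when invoking it."
  value       = aws_sagemaker_endpoint.deepgram.name
}

output "endpoint_arn" {
  description = "ARN of the SageMaker endpoint."
  value       = aws_sagemaker_endpoint.deepgram.arn
}

output "execution_role_arn" {
  description = "ARN of the IAM execution role assumed by SageMaker."
  value       = aws_iam_role.sagemaker_execution.arn
}
```

After a successful apply, running terraform output endpoint_name prints the value for use in the validation step.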
Example variable values
Create a terraform.tfvars file with your specific values. Replace the model_package_arn with the ARN from your AWS Marketplace subscription.
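An example terraform.tfvars could look like the sketch below. The ARN is a placeholder rather than a real Model Package ARN, and every variable other than model_package_arn is a hypothetical name that must match your variables.tf. The instance type is shown purely as an illustrative GPU instance; confirm it against the hardware specifications for your product.

```hcl
# terraform.tfvars (illustrative values only)

# Placeholder: replace with the ARN from your AWS Marketplace subscription.
model_package_arn = "arn:aws:sagemaker:us-east-1:111122223333:model-package/your-deepgram-model-package"

# Assumed variable names; the region must match the region in the ARN above.
aws_region     = "us-east-1"
name_prefix    = "deepgram"
instance_type  = "ml.g5.2xlarge"
instance_count = 1
```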
Do not commit terraform.tfvars to version control if it contains sensitive values. Add it to .gitignore or use environment variables instead.
Deploy
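Run terraform init to initialize the working directory and download the AWS provider, then run terraform plan to preview the resources Terraform will create.
Verify that the plan shows the expected resources: an IAM role, a SageMaker Model, an Endpoint Configuration, and an Endpoint. If the plan is correct, run terraform apply and confirm when prompted. The endpoint can take several minutes to reach InService.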
Validate the endpoint
After the endpoint reaches InService, run a test inference to confirm it returns results. See Validate a Deepgram SageMaker Endpoint for the full testing guide using the dg-sagemaker test clients.
Customize the deployment
Instance types
Choose an instance type based on the Deepgram product you are deploying. GPU-accelerated instances are required.
For a full list of compatible instances, see the Deployment Environments hardware specifications.
Environment variable overrides
Pass Deepgram configuration overrides through the deepgram_engine_env and deepgram_api_env variables. Each map key becomes the suffix of the generated environment variable name (for example, "01", "02"), and each value is a TOML expression. See Configure Amazon SageMaker Deployments for the full reference.
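One way to turn these maps into container environment variables is sketched below. The prefixes EXAMPLE_ENGINE_OVERRIDE_ and EXAMPLE_API_OVERRIDE_ are hypothetical placeholders, not Deepgram's actual variable names, and the TOML expression in the comment is likewise invented for illustration. Take the real names and keys from the configuration reference.

```hcl
# Illustrative sketch: build the container environment from the two maps.

locals {
  # Hypothetical prefixes. Replace with the names from the Deepgram
  # configuration reference.
  engine_env_prefix = "EXAMPLE_ENGINE_OVERRIDE_"
  api_env_prefix    = "EXAMPLE_API_OVERRIDE_"

  # Each map key ("01", "02", ...) is appended to the prefix, and the
  # TOML expression becomes the variable's value.
  container_env = merge(
    { for suffix, toml in var.deepgram_engine_env : "${local.engine_env_prefix}${suffix}" => toml },
    { for suffix, toml in var.deepgram_api_env : "${local.api_env_prefix}${suffix}" => toml }
  )
}

# In terraform.tfvars, an override would then be supplied as, for example:
#
#   deepgram_engine_env = {
#     "01" = "example_section.example_key = 4"   # invented TOML expression
#   }
#
# and wired into the model by setting, inside primary_container:
#
#   environment = local.container_env
```

With the example value above, the merged map would contain a single entry whose key is EXAMPLE_ENGINE_OVERRIDE_01.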
VPC configuration
To deploy the endpoint inside a VPC, add a vpc_config block to the aws_sagemaker_model resource:
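A sketch of that change is shown below. The variables vpc_subnet_ids and vpc_security_group_ids are hypothetical names introduced for this example, and the resource label deepgram is assumed to match your main.tf. The other arguments of the model resource are unchanged and omitted here.

```hcl
# Hypothetical variables holding the VPC networking details.
variable "vpc_subnet_ids" {
  description = "Subnet IDs the model containers attach to."
  type        = list(string)
}

variable "vpc_security_group_ids" {
  description = "Security group IDs applied to the model containers."
  type        = list(string)
}

resource "aws_sagemaker_model" "deepgram" {
  # ... existing arguments (name, execution_role_arn, primary_container) ...

  vpc_config {
    subnets            = var.vpc_subnet_ids
    security_group_ids = var.vpc_security_group_ids
  }
}
```

Changing vpc_config on an existing model typically forces Terraform to replace the model, so review the plan before applying this to a live endpoint.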
Tear down
To delete all resources created by this configuration, run terraform destroy from the project directory and confirm when prompted.
This removes the SageMaker Endpoint, Endpoint Configuration, Model, auto-scaling resources (if enabled), and the IAM execution role. You are no longer billed for SageMaker compute after the endpoint is deleted. Your AWS Marketplace subscription remains active.