Amazon Web Services
Deploying Deepgram on Amazon Web Services (AWS) requires some preparation. In this section, you will learn how to provision a managed Kubernetes Cluster where you will deploy Deepgram products. You will need to perform some of these steps in the AWS Management Console and some in your local terminal.
Prerequisites
Make sure you have completed the requirements in the Self-Hosted Introduction.
kubectl
The Kubernetes command-line tool, kubectl, allows you to run commands against Kubernetes clusters. You can use kubectl to deploy applications, inspect and manage cluster resources, and view logs.
Install kubectl locally using the official Kubernetes guides.
AWS CLI
The AWS CLI provides programmatic access to manage your AWS services. Certain steps in this guide are enabled by this tool, although many of the same actions can be performed manually in the AWS Console.
- Follow the installation guide to install the CLI locally.
- Once installed, follow the setup guide to configure the CLI with access to your AWS account. When configuring, set the default region to us-west-2.

Choosing a Region
The templates and steps in this guide provision resources in the AWS us-west-2 region. If you would like to deploy to a different region, make sure to specify your desired region when running aws configure, and adjust the templates and steps in this guide accordingly.
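As a minimal sketch of the region setup described above, the helper below sets the default region non-interactively and reads it back. The function name `set_default_region` and the variable `AWS_REGION_CHOICE` are hypothetical names used only for this example; `aws configure set` and `aws configure get` are the AWS CLI subcommands it relies on.

```shell
# Region used throughout this guide; change it if you deploy elsewhere.
# AWS_REGION_CHOICE is a hypothetical variable name for this sketch.
AWS_REGION_CHOICE="us-west-2"

# Hypothetical helper: store the default region, then print it to confirm.
set_default_region() {
  # Writes the region to the active AWS CLI profile.
  aws configure set region "$1"
  # Prints the region currently stored for the active profile.
  aws configure get region
}

# Usage (requires the AWS CLI to be installed and configured):
#   set_default_region "$AWS_REGION_CHOICE"
```

Running `aws configure` interactively achieves the same result; the non-interactive form is simply easier to script.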
Cluster Management with eksctl
eksctl is the official CLI for Amazon EKS. It simplifies creating and managing clusters by creating subnets, managed node groups, service accounts, and other resources to integrate with your cluster.
Certain steps in this guide are enabled by this tool, although many of the same actions can be performed manually in the AWS Console. See the installation guide for details on how to install the latest version locally.
Make sure to install the latest version of eksctl. Do not use the version available through your package manager (e.g. apt, dnf), which may be an older release that is missing features used in this guide.
Version >=0.192.0 is required to create EKS clusters with nodes using EKS accelerated AMIs.
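One way to check the minimum version requirement is a small version comparison, sketched below. The helper name `version_ge` is hypothetical, and the commented usage line assumes that `eksctl version` prints a bare version string such as `0.192.0`; if your build prints a suffix, trim it first.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), available in GNU coreutils and recent macOS/BSD sort.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed usage against a real installation:
#   version_ge "$(eksctl version)" "0.192.0" || echo "Please upgrade eksctl"

# Self-contained demonstration with literal version strings.
version_ge "0.200.0" "0.192.0" && echo "0.200.0 satisfies the minimum"
version_ge "0.191.0" "0.192.0" || echo "0.191.0 is too old"
```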
Kubernetes Packages with helm
Helm is the package manager for Kubernetes. A package in Kubernetes is defined by a Helm Chart, which helps you define, install, and upgrade even the most complex Kubernetes application.
We use Helm to install several components in this guide. See the installation guide for details on how to install locally.
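Before moving on, it can help to confirm that all four tools from this section are on your PATH. The sketch below is one way to do that; the function name `check_tools` is hypothetical, and it only checks that each command exists, not that its version is sufficient.

```shell
# Hypothetical helper: report whether each named CLI is on the PATH.
# Returns a non-zero status if at least one tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Check the tools this guide depends on.
check_tools kubectl aws eksctl helm || echo "Install the missing tools before continuing."
```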
Creating a Cluster
Amazon Elastic Kubernetes Service (Amazon EKS) is a managed Kubernetes service to run Kubernetes in the AWS cloud. In the cloud, Amazon EKS automatically manages the availability and scalability of the Kubernetes control plane nodes responsible for scheduling containers, managing application availability, storing cluster data, and other key tasks.
- Download a ClusterConfig template from Deepgram's self-hosted resources. For example, here is a template for a basic setup on AWS.
  - Set the cluster name (metadata.name) and region according to your needs.
  - Ensure that the IAM role name for Amazon EFS (iam.serviceAccounts[1].roleName) is unique in your AWS account.
  - Modify each managed node group's desiredCapacity according to your needs. You may wish to consult your Deepgram Account Representative when planning your cluster's capacity.
- Create a new Kubernetes cluster in Amazon EKS using the ClusterConfig manifest. eksctl will create several AWS CloudFormation stacks, which manage the interconnected creation of a cluster, dedicated VPC, dedicated IAM resources, node groups, and other necessary resources. When running the command, make sure to replace the PATH_TO_CLUSTER_CONFIG_YAML placeholder with the path to the template file you downloaded on your local machine.
- Record metadata from your new cluster in shell variables for use in future steps.
- Create (or retrieve an existing) Amazon Elastic File System (EFS) to store Deepgram model files and share them across multiple Deepgram Engine pods.
- Install the Amazon EFS CSI driver to allow nodes within your cluster to access the EFS you created. Use the service account role created via the ClusterConfig file, and wait until installation is complete.
- eksctl automatically creates several security groups when it provisions your cluster. One of these security groups facilitates communication between AWS-managed nodes and other AWS resources. Find this security group and record its ID for the next step.
- Create mount targets on the EFS with the proper security group. This gives all Deepgram Engine pods shared access to the EFS so they can read the model files that will be stored there.
- Record the Role ARN that will be used later to install the Kubernetes Autoscaler, a component that automatically adjusts the size of a Kubernetes cluster so that all pods have a place to run and there are no unneeded nodes.
- Create a dedicated namespace for Deepgram resources.
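The first and last steps above, plus one of the metadata lookups, can be sketched roughly as follows. This is an illustration under stated assumptions, not the guide's exact commands: the cluster name and the function names are hypothetical, and the sketch does not cover the EFS, CSI driver, security group, or mount target steps.

```shell
# Hypothetical values for illustration; substitute your own.
CLUSTER_NAME="deepgram-self-hosted-cluster"   # assumed; must match metadata.name in your ClusterConfig
CLUSTER_REGION="us-west-2"                    # region used throughout this guide
DG_NAMESPACE="dg-self-hosted"                 # namespace referenced later in this guide

# Create the cluster from the ClusterConfig file passed as $1.
create_cluster() {
  eksctl create cluster --config-file "$1"
}

# Print the ID of the VPC attached to the cluster.
cluster_vpc_id() {
  aws eks describe-cluster \
    --name "$CLUSTER_NAME" \
    --region "$CLUSTER_REGION" \
    --query "cluster.resourcesVpcConfig.vpcId" \
    --output text
}

# Create the namespace that will hold Deepgram resources.
create_namespace() {
  kubectl create namespace "$DG_NAMESPACE"
}

# Usage, in order (each call needs valid AWS credentials):
#   create_cluster PATH_TO_CLUSTER_CONFIG_YAML
#   cluster_vpc_id
#   create_namespace
```

Wrapping each command in a function keeps the values in one place, so a change of region or cluster name only needs to be made once.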
Configure Kubernetes Secrets
Deepgram strongly recommends following best practices for configuring Kubernetes Secrets. Please refer to Securing Your Cluster for more details.
The deepgram-self-hosted Helm chart takes two Secret references. One is a set of distribution credentials that allow the cluster to pull images from Deepgram's container image repository. The other is your self-hosted API key that licenses each Deepgram container that is created.
- Complete the Self Service Licensing & Credentials guide to generate distribution credentials and a self-hosted API key.
- If using an external Secret store provider, configure cluster access to these two Secrets, naming them dg-regcred (distribution credentials) and dg-self-hosted-api-key (self-hosted API key).
- If not using an external Secret store provider, create the Secrets manually in your cluster.
  - Using the distribution credentials username and password generated in the Deepgram Console, create a Kubernetes Secret named dg-regcred. Replace the placeholders QUAY_DG_USER and QUAY_DG_PASSWORD with the distribution credentials you generated in the Self Service Licensing & Credentials guide.
  - Create a Kubernetes Secret named dg-self-hosted-api-key to store your self-hosted API key. Replace the placeholder YOUR_API_KEY_HERE with the Deepgram API key you generated in the Self Service Licensing & Credentials guide.
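Creating the two Secrets manually might look roughly like the sketch below. Two details are assumptions rather than facts from this guide: the registry host `quay.io` (inferred from the QUAY_DG_USER placeholder name) and the Secret key name `DEEPGRAM_API_KEY`. The function names are hypothetical. Check both assumptions against the Helm chart's documentation before relying on them.

```shell
# Namespace referenced elsewhere in this guide.
DG_NAMESPACE="dg-self-hosted"

# Hypothetical helper: create the image pull Secret.
# $1 = distribution username (QUAY_DG_USER), $2 = distribution password (QUAY_DG_PASSWORD)
create_registry_secret() {
  kubectl create secret docker-registry dg-regcred \
    --namespace "$DG_NAMESPACE" \
    --docker-server=quay.io \
    --docker-username="$1" \
    --docker-password="$2"
}

# Hypothetical helper: create the API key Secret.
# $1 = self-hosted API key (YOUR_API_KEY_HERE)
# The key name DEEPGRAM_API_KEY is an assumption.
create_api_key_secret() {
  kubectl create secret generic dg-self-hosted-api-key \
    --namespace "$DG_NAMESPACE" \
    --from-literal=DEEPGRAM_API_KEY="$1"
}

# Usage:
#   create_registry_secret QUAY_DG_USER QUAY_DG_PASSWORD
#   create_api_key_secret YOUR_API_KEY_HERE
```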
Deploy Deepgram
Deepgram maintains the official deepgram-self-hosted Helm Chart. You can reference the source and Artifact Hub listing for more details. We'll use this Chart to facilitate deploying Deepgram services in your self-hosted environment.
-
Shell
- Download a values.yaml template from Deepgram's self-hosted resources. For example, here is a template for a basic setup on AWS.
- In your values.yaml file, modify the scaling.replicas.{api,engine} values to set the initial number of replicas when your cluster is created. The capacities were defined previously with desiredCapacity in your cluster-config.yaml file.
  If you want to enable pod autoscaling in your cluster, reach out to your Deepgram Account Representative to discuss whether soft or hard limits make sense for your use case, and what values to use for scaling your cluster based on traffic demands.
- In your values.yaml file, insert your Amazon EFS ID into the engine.modelManager.volumes.aws.efs.fileSystemId value. You can get the ID from the shell variable you created previously.
- Your Deepgram Account Representative will have provided you with a list of links to models for inference (file extension .dg). In your values.yaml file, insert each of these model links in the engine.modelManager.models.links list.
- In your values.yaml file, insert the AWS Role ARN to be used by the Cluster Autoscaler. If needed, adjust the cluster name and region as well.
- Install the Helm Chart with your values.yaml file.
  Resource limits, taints, and other constraints may limit Pod scheduling. If a Pod is not able to be scheduled, you can see its status and a list of associated events with kubectl describe pod <pod-name>.
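As an alternative to editing values.yaml by hand, the values named in the steps above can be kept in a small overrides file and passed to Helm alongside the template. The sketch below uses that swapped-in approach. The EFS ID, replica counts, and model link are hypothetical placeholders, the release name `deepgram` and repository alias `deepgram` are assumptions, and the fragment contains only the keys named in this guide, so it is not a complete values file.

```shell
# Hypothetical placeholder; use the EFS ID you recorded earlier.
EFS_FS_ID="fs-0123456789abcdef0"
OVERRIDES_FILE="${TMPDIR:-/tmp}/values-overrides.yaml"

# Write only the keys discussed in the steps above.
cat > "$OVERRIDES_FILE" <<EOF
scaling:
  replicas:
    api: 1
    engine: 1
engine:
  modelManager:
    volumes:
      aws:
        efs:
          fileSystemId: "${EFS_FS_ID}"
    models:
      links:
        - "https://example.com/path/to/model.dg"
EOF

# Hypothetical helper: install the chart with the template ($1) plus the overrides.
# Assumes the chart repository was already added under the alias "deepgram".
install_deepgram() {
  helm install deepgram deepgram/deepgram-self-hosted \
    --namespace dg-self-hosted \
    --values "$1" \
    --values "$OVERRIDES_FILE"
}

echo "Wrote $OVERRIDES_FILE"
# Usage: install_deepgram values.yaml
```

When several `--values` files are given, Helm gives precedence to the later ones, so the overrides file wins over the template for the keys it sets.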
Test Your Deepgram Setup with a Sample Request
Test your Deepgram deployment on Amazon EKS with an audio file.
- Launch an ephemeral pod to send your test request from.
- Inside the ephemeral pod, download a sample file from Deepgram (or supply your own file).
- Send your audio file to your local Deepgram setup for transcription. If needed, adjust pieces of the command:
  - the query parameters to match the directions from your Deepgram Account Representative
  - the service name deepgram-api-external
  - the namespace dg-self-hosted
You should receive a JSON response with the transcript and associated metadata. Congratulations - your self-hosted setup is working!
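The request in the last step targets the in-cluster DNS name of the Service, which follows the standard Kubernetes pattern `<service>.<namespace>.svc.cluster.local`. The sketch below builds that URL and wraps the request in a helper. The port `8080`, the query parameters, and the function names are assumptions for illustration; use the values given by your Deepgram Account Representative.

```shell
# Hypothetical helper: build the in-cluster transcription URL.
# $1 = service name, $2 = namespace, $3 = query string
# Port 8080 is an assumption; adjust it to match your Service.
listen_url() {
  printf 'http://%s.%s.svc.cluster.local:8080/v1/listen?%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: POST the audio file at path $1 to the API.
# The query parameters shown here are example values only.
send_audio() {
  curl -X POST --data-binary "@$1" \
    "$(listen_url deepgram-api-external dg-self-hosted "model=nova-2&smart_format=true")"
}

# Print the URL that send_audio would call.
listen_url deepgram-api-external dg-self-hosted "model=nova-2&smart_format=true"
# Usage from inside the ephemeral pod: send_audio your-audio-file.wav
```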
Next Steps
Your Deepgram services are accessible within your cluster via the deepgram-api-external Service that was created by the Helm Chart.
You may consider configuring additional ingress with an AWS Application Load Balancer to access your services. Note that your installation will automatically load balance any received requests within the cluster to distribute load evenly. The load balancer would primarily serve as the ingress endpoint into the cluster.
What's Next
Now that you have a basic Deepgram setup working, take some time to learn about building up to a production-level environment, as well as helpful Deepgram add-on services.