Amazon Web Services
With Kubernetes
Deploying Deepgram on Amazon Web Services (AWS) requires some preparation. In this section, you will learn how to provision a managed Kubernetes Cluster where you will deploy Deepgram products. You will need to perform some of these steps in the AWS Management Console and some in your local terminal.
Prerequisites
Make sure you have completed the requirements in the Self-Hosted Introduction.
kubectl
The Kubernetes command-line tool, kubectl, allows you to run commands against Kubernetes clusters. You can use kubectl to deploy applications, inspect and manage cluster resources, and view logs.
Install locally using the official Kubernetes guides.
AWS CLI
The AWS CLI provides programmatic access to manage your AWS services. Certain steps in this guide are enabled by this tool, although many of the same actions can be performed manually in the AWS Console.
- Follow the installation guide to install the CLI locally.
- Once installed, follow the setup guide to configure the CLI with access to your AWS account. When configuring, set the default region to us-west-2.
Choosing a Region
The templates and steps in this guide provision resources in the AWS us-west-2 region. If you would like to deploy to a different region, make sure to specify your desired region when running aws configure, and adjust the templates and steps in this guide accordingly.
Cluster Management with eksctl
eksctl is the official CLI for Amazon EKS. It simplifies creating and managing clusters by creating subnets, managed node groups, service accounts, and other resources to integrate with your cluster.
Certain steps in this guide are enabled by this tool, although many of the same actions can be performed manually in the AWS Console. See the installation guide for details on how to install the latest version locally.
Make sure to install the latest version of eksctl. Do not use the version available through your package manager (e.g. apt, dnf), which may be an older release that is missing features used in this guide. Version >=0.192.0 is required to create EKS clusters with nodes using EKS accelerated AMIs.
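One way to check the installed eksctl against the minimum version is a small version comparison. This is a minimal sketch, not part of the official guide: the helper name version_ge and the variable REQUIRED_EKSCTL are hypothetical, it relies on sort -V (GNU coreutils), and it assumes eksctl version prints a bare version string such as 0.192.0.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), which GNU coreutils provides.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

REQUIRED_EKSCTL="0.192.0"

if command -v eksctl >/dev/null 2>&1; then
  # Assumption: `eksctl version` prints a bare version such as 0.192.0.
  # Any build suffix after "-" or "+" is stripped before comparing.
  installed="$(eksctl version)"
  installed="${installed%%[-+]*}"
  if version_ge "$installed" "$REQUIRED_EKSCTL"; then
    echo "eksctl $installed meets the minimum version"
  else
    echo "eksctl $installed is older than $REQUIRED_EKSCTL; please upgrade"
  fi
else
  echo "eksctl is not installed"
fi
```

If your eksctl build prints its version in a different format, adjust the parsing before relying on the result.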
Kubernetes Packages with helm
Helm is the package manager for Kubernetes. A package in Kubernetes is defined by a Helm Chart, which helps you define, install, and upgrade even the most complex Kubernetes application.
We use Helm to install several components in this guide. See the installation guide for details on how to install locally.
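Before moving on, it can help to confirm that all four command-line tools from this section are on your PATH. The following is a minimal sketch; the function name missing_tools is hypothetical and only checks for presence, not for versions or credentials.

```shell
# Prints the name of each listed tool that is not found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing="$(missing_tools kubectl aws eksctl helm)"
if [ -n "$missing" ]; then
  echo "Install these tools before continuing:"
  echo "$missing"
else
  echo "All prerequisite tools are installed."
fi
```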
Creating a Cluster
Amazon Elastic Kubernetes Service (Amazon EKS) is a managed Kubernetes service to run Kubernetes in the AWS cloud. In the cloud, Amazon EKS automatically manages the availability and scalability of the Kubernetes control plane nodes responsible for scheduling containers, managing application availability, storing cluster data, and other key tasks.
- Download a ClusterConfig template from Deepgram’s self-hosted resources. For example, here is a template for a basic setup on AWS.
  - Modify the cluster name and region according to your needs.
  - Modify each managed node group’s desiredCapacity according to your needs. You may wish to consult your Deepgram Account Representative in planning your cluster’s capacity.
- Create a new Kubernetes cluster in Amazon EKS using the ClusterConfig manifest. The eksctl cluster-creation command creates several AWS CloudFormation Stacks, which manage the inter-connected creation of a cluster, dedicated VPC, dedicated IAM roles, node groups, and other necessary resources. When running it, make sure to replace the PATH_TO_CLUSTER_CONFIG_YAML placeholder with the path to the template file you downloaded on your local machine.
- Record metadata from your new cluster in shell variables for use in future steps.
- Create an Amazon Elastic File System (EFS) to store Deepgram model files and share them across multiple Deepgram Engine pods.
- Install the Amazon EFS CSI driver to allow nodes within your cluster to access the EFS you created. Use the service account role created via the ClusterConfig file, and wait until installation is complete.
- eksctl automatically creates several security groups when it provisions your cluster. One of these security groups facilitates communication between AWS-managed nodes and other AWS resources. Find this security group and record its ID for the next step.
- Create mount targets on the EFS with the proper security group. This gives all Deepgram Engine pods shared access to the EFS so they can read the model files that will be stored there.
- Record the Role ARN that will be used later to install the Cluster Autoscaler, a component that automatically adjusts the size of a Kubernetes cluster so that all pods have a place to run and there are no unneeded nodes.
- Create a dedicated namespace for Deepgram resources.
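Three of the steps above (creating the cluster, creating the EFS, and creating the namespace) can be sketched as small shell functions. This is an illustrative sketch rather than the guide's exact commands: the function names are hypothetical, the namespace dg-self-hosted is the one referenced later in this guide, and the EFS CSI driver, security group, mount target, and Role ARN steps are not covered here.

```shell
# Creates the EKS cluster from a ClusterConfig file, after checking the file exists.
create_deepgram_cluster() {
  config="$1"
  if [ ! -f "$config" ]; then
    echo "ClusterConfig file not found: $config" >&2
    return 1
  fi
  eksctl create cluster -f "$config"
}

# Creates an EFS file system in region $1 and prints its file system ID.
create_model_file_system() {
  aws efs create-file-system \
    --region "$1" \
    --query 'FileSystemId' \
    --output text
}

# Creates the dedicated namespace for Deepgram resources.
create_deepgram_namespace() {
  kubectl create namespace dg-self-hosted
}

# Example usage (uncomment after replacing the placeholder path):
# create_deepgram_cluster PATH_TO_CLUSTER_CONFIG_YAML
# EFS_ID="$(create_model_file_system us-west-2)"
# create_deepgram_namespace
```

Capturing the file system ID in a shell variable, as in the commented usage, keeps it available for the later values.yaml step.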
Configure Kubernetes Secrets
Deepgram strongly recommends following best practices for configuring Kubernetes Secrets. Please refer to Securing Your Cluster for more details.
The deepgram-self-hosted Helm chart takes two Secret references. One is a set of distribution credentials that allow the cluster to pull images from Deepgram’s container image repository. The other is your self-hosted API key that licenses each Deepgram container that is created.
- Complete the Self Service Licensing & Credentials guide to generate distribution credentials and a self-hosted API key.
- If using an external Secret store provider, configure cluster access to these two Secrets, naming them dg-regcred (distribution credentials) and dg-self-hosted-api-key (self-hosted API key).
- If not using an external Secret store provider, create the Secrets manually in your cluster.
  - Using the distribution credentials username and password generated in the Deepgram Console, create a Kubernetes Secret named dg-regcred. Replace the placeholders QUAY_DG_USER and QUAY_DG_PASSWORD with the distribution credentials you generated in the Self Service Licensing & Credentials guide.
  - Create a Kubernetes Secret named dg-self-hosted-api-key to store your self-hosted API key. Replace the placeholder YOUR_API_KEY_HERE with the Deepgram API key you generated in the Self Service Licensing & Credentials guide.
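Creating the two Secrets manually could look like the sketch below. Several details are assumptions rather than facts from this guide: the registry server quay.io is inferred from the QUAY_ placeholder names, the Secret key DEEPGRAM_API_KEY is a guess at what the chart expects, and the function names are hypothetical. Check the chart documentation before using it.

```shell
# Succeeds only when the value is non-empty and is not one of the guide's placeholders.
is_filled_in() {
  case "$1" in
    ""|QUAY_DG_USER|QUAY_DG_PASSWORD|YOUR_API_KEY_HERE) return 1 ;;
    *) return 0 ;;
  esac
}

# $1 = distribution username, $2 = distribution password, $3 = self-hosted API key
create_deepgram_secrets() {
  for value in "$1" "$2" "$3"; do
    if ! is_filled_in "$value"; then
      echo "Replace the placeholder values before creating Secrets" >&2
      return 1
    fi
  done
  # Assumption: the registry server is quay.io.
  kubectl create secret docker-registry dg-regcred \
    --namespace dg-self-hosted \
    --docker-server=quay.io \
    --docker-username="$1" \
    --docker-password="$2" &&
  # Assumption: the chart reads the key from an entry named DEEPGRAM_API_KEY.
  kubectl create secret generic dg-self-hosted-api-key \
    --namespace dg-self-hosted \
    --from-literal=DEEPGRAM_API_KEY="$3"
}

# Example usage (uncomment after substituting real credentials):
# create_deepgram_secrets 'QUAY_DG_USER' 'QUAY_DG_PASSWORD' 'YOUR_API_KEY_HERE'
```

The placeholder check is there so that an unedited copy of the command fails early instead of storing the literal placeholder text in the cluster.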
Deploy Deepgram
Deepgram maintains the official deepgram-self-hosted Helm Chart. You can reference the source and Artifact Hub listing for more details. We’ll use this Chart to facilitate deploying Deepgram services in your self-hosted environment.
- Download a values.yaml template from Deepgram’s self-hosted resources. For example, here is a template for a basic setup on AWS.
- In your values.yaml file, modify the scaling.replicas.{api,engine} values to set the initial number of replicas when your cluster is created. The node capacities were defined previously with desiredCapacity in your cluster-config.yaml file.
  If you want to enable pod autoscaling in your cluster, reach out to your Deepgram Account Representative to discuss whether soft or hard limits make sense for your use case, and what values to use for scaling your cluster based on traffic demands.
- In your values.yaml file, insert your Amazon EFS ID into the engine.modelManager.volumes.aws.efs.fileSystemId value. You can get the ID from the shell variable you created previously.
- Your Deepgram Account Representative will have provided you with a list of links to models for inference (file extension .dg). In your values.yaml file, insert each of these model links in the engine.modelManager.models.links list.
- In your values.yaml file, insert the AWS Role ARN to be used by the Cluster Autoscaler. If needed, adjust the cluster name and region as well.
- Install the Helm Chart with your values.yaml file.
  Resource limits, taints, and other constraints may limit Pod scheduling. If a Pod cannot be scheduled, you can see its status and a list of associated events with kubectl describe pod <pod-name>.
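Adding the chart repository and installing the chart with your edited values file might be sketched as follows. The repository URL, the repository alias deepgram, the release name deepgram, and the function name are all assumptions for illustration; confirm the repository location from the chart's Artifact Hub listing.

```shell
# Installs the deepgram-self-hosted chart using the values file at $1.
deploy_deepgram() {
  values_file="$1"
  if [ ! -f "$values_file" ]; then
    echo "values file not found: $values_file" >&2
    return 1
  fi
  # Assumption: the chart repository is published at this URL.
  helm repo add deepgram https://deepgram.github.io/self-hosted-resources &&
  helm repo update &&
  # --wait blocks until resources are ready or the timeout elapses.
  helm install deepgram deepgram/deepgram-self-hosted \
    --namespace dg-self-hosted \
    --values "$values_file" \
    --wait \
    --timeout 60m
}

# Example usage (uncomment once values.yaml has been edited):
# deploy_deepgram ./values.yaml
```

The long timeout is a guess that allows for image pulls and model downloads on first install; shorten it if your environment starts faster.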
Test Your Deepgram Setup with a Sample Request
Test your environment and container setup with a local file.
- Get the name of one of the Deepgram API Pods.
- Launch an ephemeral container to send your test request from.
- Inside the ephemeral container, download a sample file from Deepgram (or supply your own file).
- Send your audio file to your local Deepgram setup for transcription. If needed, adjust pieces of the request command:
  - the query parameters to match the directions from your Deepgram Account Representative
  - the service name deepgram-api-external
  - the namespace dg-self-hosted
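The request in the final step targets the API Service through its in-cluster DNS name, which is built from the service name and namespace. The sketch below shows how those pieces fit together. It assumes the default cluster domain cluster.local, port 8080, and the /v1/listen path, none of which are stated in this guide; the function names and the example query string are hypothetical.

```shell
# Prints the in-cluster URL for a Service: $1 = service, $2 = namespace, $3 = port.
# Assumptions: default cluster domain, port 8080, and the /v1/listen path.
deepgram_listen_url() {
  service="$1"
  namespace="$2"
  port="${3:-8080}"
  printf 'http://%s.%s.svc.cluster.local:%s/v1/listen\n' "$service" "$namespace" "$port"
}

# Posts the audio file at $1 for transcription, using the query string in $2.
send_test_request() {
  audio_file="$1"
  query="$2"
  if [ ! -f "$audio_file" ]; then
    echo "audio file not found: $audio_file" >&2
    return 1
  fi
  curl --silent --show-error -X POST \
    --data-binary @"$audio_file" \
    "$(deepgram_listen_url deepgram-api-external dg-self-hosted)?$query"
}

# Example usage from inside the ephemeral container (hypothetical file and query):
# send_test_request ./sample-audio.wav 'model=nova-2&smart_format=true'
```

Changing the service name or namespace only requires different arguments to deepgram_listen_url.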
You should receive a JSON response with the transcript and associated metadata. Congratulations - your self-hosted setup is working!
Next Steps
Your Deepgram services are accessible within your cluster via the deepgram-api-external Service that was created by the Helm Chart.
You may consider configuring additional ingress with an AWS Application Load Balancer to access your services. Note that your installation will automatically load balance any received requests within the cluster to distribute load evenly. The load balancer would primarily serve as the ingress endpoint into the cluster.
What’s Next
Now that you have a basic Deepgram setup working, take some time to learn about building up to a production-level environment, as well as helpful Deepgram add-on services.