Deploy Deepgram Services

With Docker/Podman

Deepgram services are distributed as container images, as described in the Deployment Environments overview. We need to download and deploy these images from a container image repository. We also need to download configuration files and AI models that will be given to you directly by Deepgram.

Prerequisites

Before you begin, you will need to complete the Deployment Environments guide, as well as all sub-guides to complete your environment configuration.

You will also need to complete the Self Service Licensing & Credentials guide to authenticate your products with Deepgram's licensing servers and pull Deepgram container images from Quay.

Get Deepgram Products

Cache Container Image Repository Credentials

Use the image repository credentials you generated in the self-service licensing and credentials guide to login to Quay on your deployment environment. Once your credentials are cached locally, you should not have to log in again (until after you manually log out).

# Complete with login information generated in Deepgram Console
docker login quay.io

Choosing A Deployment Type

For customers deploying Deepgram’s self-hosted solution in highly available production environments, Deepgram recommends the License Proxy, which is a caching proxy that communicates with the Deepgram-hosted license server to ensure uptime and simplify network security. If you aren't certain which products your contract includes or if you are interested in adding the License Proxy to your self-hosted deployment, please consult your Deepgram Account Representative.

If your project is authorized to use the License Proxy, in the following section you may choose to use either a standard deployment or a deployment with the License Proxy.

See the License Proxy guide for more details on the benefits and setup.

Import Your Docker Compose, Container Configuration, and Model Files

Before you can run your self-hosted deployment, you must configure the required components. To do this, you will need to customize your configuration files and create a directory to house models that have been encrypted for use in your requests.

  1. In your home directory, create the following directories. This is where you will save your Docker Compose files, Deepgram configuration files, and Deepgram model files.

    mkdir config
    mkdir models
    
  2. The following sub-steps will download Deepgram configuration template files:

    1. Choose to use either a standard deployment, or a deployment with the License Proxy if it is enabled for your project. Execute one of the following lines accordingly:

      # If using a deployment with the License Proxy, execute this line in your shell
      DEPLOY_TYPE="license-proxy"
      
      # If using a standard deployment, execute this line in your shell
      DEPLOY_TYPE="standard"
      
    2. Download the appropriate files from the self-hosted resources repository.

      BASE_URL="https://raw.githubusercontent.com/deepgram/self-hosted-resources/refs/heads/main"
      COMPOSE_FILES=("docker/docker-compose" "podman/podman-compose")
      CONFIG_FILES=("api.toml" "engine.toml")
      [ "$DEPLOY_TYPE" = "license-proxy" ] && CONFIG_FILES+=("license-proxy.toml")
      
      for file in "${COMPOSE_FILES[@]}"; do
      	filename=$(basename "$file")
      	curl -sSL "$BASE_URL/$file.$DEPLOY_TYPE.yml" -o "config/$filename.yml"
      done
      
      DEPLOY_DIR="${DEPLOY_TYPE//-/_}_deploy"
      for file in "${CONFIG_FILES[@]}"; do
      	curl -sSL "$BASE_URL/common/$DEPLOY_DIR/$file" \
      		-o "config/$file"
      done
      
    3. Modify the Docker/Podman templates to point to the correct paths.

      sed -i 's#/path/to/\(api\|engine\|license-proxy\).toml#./\1.toml#' config/*.yml
      
  3. Download the Deepgram model files (file extension .dg) that have been provided to you by your Deepgram Account Representative.

    1. Create a fresh text file and copy over the list of links to the provided models.

      touch model_links.txt
      # Edit this file with `vim`, `nano` or the editor of your choice
      

      After editing, your model_links.txt file should look like this:

      https://LINK_TO_MODEL_1.dg
      https://LINK_TO_MODEL_2.dg
      https://LINK_TO_MODEL_3.dg
      ...
      https://LINK_TO_MODEL_N.dg
      
    2. Download all the models specified in your model_links.txt file.

      wget --directory-prefix models --input-file model_links.txt
      
    3. Modify the Compose file templates to point to your models directory.

      sed -i 's#/path/to/models#../models#' config/*.yml
      

See the Model Maintenance guide for more details on how Deepgram models power voice AI inference in your self-hosted deployment.

Customize Your Configuration

Once you have downloaded all provided files to your deployment machine, you need to update your configuration for your specific deployment environment.

Credentials

You will need to have an environment variable DEEPGRAM_API_KEY exported with your self-hosted API key secret. See our Self Service Licensing & Credentials guide for instructions on generating a self-hosted API key for use in this section.

🚧

Per the link above, you will create you self-hosted API key in the API Key tab of Deepgram Console. These are not created in the "Self-Hosted" tab, which is reserved for creating distribution credentials.

Configuration Files

Compose File

The Docker Compose or Podman Compose configuration file makes it possible to spin up the containers using a single command. This makes spinning up a standard POC deployment quick and easy.

Make sure to export your self-hosted API key secret in your deployment environment.

export DEEPGRAM_API_KEY=API_KEY_SECRET

api.toml , engine.toml, and license-proxy.toml

The API and Engine containers, and the optional License Proxy container, are configured with TOML configuration files. The templates provided by Deepgram in the deepgram-self-hosted repository contain sane defaults that will work well for most use cases; these need to be mounted to the containers (see the volumes section in the Compose file).

There are header comments describing each config value available in both of these files. If you have any questions about modifying these files, refer to those comments or reach out to your Deepgram Account Representative.

Testing Your Containers

To make sure your Deepgram self-hosted deployment is properly configured and running, you will want to run the containers and make a sample request.

Start the Deepgram Containers

Now that you have your configuration files and AI models set up and in the correct location to be used by the container, use Docker Compose to run the container:

cd config

# Running without elevated privileges
docker compose up -d
# or `podman-compose up -d`

# Running with elevated privileges
sudo --preserve-env=DEEPGRAM_API_KEY docker compose up -d
# or `sudo --preserve-env=DEEPGRAM_API_KEY podman-compose up -d`

📘

If you get an error similar to the following, you may not have the minimum NVIDIA driver version required for Deepgram services to run properly. Please see Drivers and Containerization Platforms for instructions on installing/upgrading to the latest driver version.

ERROR: for engine  Cannot start service engine: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'

nvidia-container-cli: initialization error: driver rpc error: timed out: unknown

ERROR: Encountered errors while bringing up the project.

You can then view the running containers with the container process status command, and optionally view the logs of each container to verify their status.

docker ps
# Take note of the "Container ID" for each Deepgram container
docker logs CONTAINER_ID

🖥️

Replace the placeholder CONTAINER_ID with the Container ID of each container whose logs you would like to inspect more completely.

Networking Considerations

If you are running your API and Engine nodes on separate instances, you may need to add an inbound rule for port 8080 to the API instances' security group, so that port 8080 is reachable from where you are initiating your requests.

Unless you have HTTPS or TLS running on your API instance, construct your Deepgram API endpoint with http://, not https://, and ws://, not wss:// (for instance, http://localhost:8080/v1/listen).

Test Your Deepgram Setup with a Sample Request

Test your environment and container setup with a local file.

  1. Download a sample file from Deepgram (or supply your own file).
    wget https://dpgr.am/bueller.wav
    
  2. Send your audio file to your local Deepgram setup for transcription.
    # If needed, adjust the query parameters to match the directions from your Deepgram Account Representative
    curl -X POST --data-binary @bueller.wav "http://localhost:8080/v1/listen?model=nova-2&smart_format=true"
    

📘

If you're using your own file, make sure to replace bueller.wav with the name of your audio file.

You should receive a JSON response with the transcription and associated metadata. Congratulations - your self-hosted setup is working!


What’s Next

Now that you have a basic Deepgram setup working, take some time to learn about building up to a production-level environment, as well as helpful Deepgram add-on services.