Deploy Deepgram Services

With Docker/Podman

Deepgram services are distributed as container images, as described in the Deployment Environments overview. You will download and deploy these images from a container image repository, along with configuration files and AI models provided to you directly by Deepgram.

Prerequisites

Before you begin, you will need to complete the Deployment Environments guide, as well as all sub-guides to complete your environment configuration.

You will also need to complete the Self Service Licensing & Credentials guide to authenticate your products with Deepgram's licensing servers and pull Deepgram container images from Quay.

Get Deepgram Products

Cache Container Image Repository Credentials

Use the image repository credentials you generated in the Self Service Licensing & Credentials guide to log in to Quay from your deployment environment. Once your credentials are cached locally, you should not need to log in again unless you manually log out.

# Complete with login information generated in Deepgram Console
docker login quay.io

Pull Deepgram container images

  1. Identify the latest self-hosted release in the Deepgram Changelog. Filter by "Self-Hosted", and select the latest release. You can use either the container image tag or the release tag listed for all images referenced in this documentation.

  2. Pull the latest Deepgram Engine image:

    docker pull quay.io/deepgram/self-hosted-engine:IMAGE_OR_RELEASE_TAG
    
  3. Pull the latest Deepgram API image:

    docker pull quay.io/deepgram/self-hosted-api:IMAGE_OR_RELEASE_TAG
    

🖥️

Be sure to replace the IMAGE_OR_RELEASE_TAG placeholder value with the appropriate tag identified in step 1. Use this tag in all related configuration files.
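To keep the tag consistent across both images, it can help to set it once in a shell variable. A minimal sketch, where release-123456 is a hypothetical placeholder for the tag you identified in step 1:

```shell
# RELEASE_TAG is a placeholder; substitute the tag from the Deepgram Changelog
RELEASE_TAG="release-123456"

for image in self-hosted-engine self-hosted-api; do
    ref="quay.io/deepgram/${image}:${RELEASE_TAG}"
    echo "pulling ${ref}"
    # docker pull "${ref}"   # uncomment to actually pull the image
done
```

Reusing the same variable in your configuration files later makes it harder for the API and Engine versions to drift apart.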

Choosing A Deployment Type

For customers deploying Deepgram's self-hosted solution in highly available production environments, Deepgram recommends the License Proxy, a caching proxy that communicates with the Deepgram-hosted license server to ensure uptime and simplify network security. If you aren't certain which products your contract includes, or if you are interested in adding the License Proxy to your self-hosted deployment, please consult your Deepgram Account Representative.

If your project is authorized to use the License Proxy, you may choose in the following section between a standard deployment and a deployment with the License Proxy.

See the License Proxy guide for more details on the benefits and setup.

Import Your Docker Compose, Container Configuration, and Model Files

Before you can run your self-hosted deployment, you must configure the required components. To do this, you will need to customize your configuration files and create a directory to house models that have been encrypted for use in your requests.

🖥️

In the following steps, make sure to replace placeholder links with the actual links provided in the text file given to you by your Deepgram Account Representative.

  1. To house your configuration files, in your home directory, create a directory named config. This is where you will save your Docker Compose file and various Deepgram configuration files.

    mkdir config
    
  2. Download template files from the Deepgram self-hosted-resources repo. There are separate sections for Docker and Podman.

    1. Choose a Compose file template for either a standard deployment, or a deployment with the License Proxy if it is enabled for your project. Once viewing your desired template, click the Raw button and copy the URL. Use the URL to download the Compose file template into your self-hosted environment.

      cd config
      # LINK_TO_COMPOSE_FILE_TEMPLATE should look like the following:
      #   https://raw.githubusercontent.com/deepgram/self-hosted-resources/main/...
      wget LINK_TO_COMPOSE_FILE_TEMPLATE
      
    2. In the "Contents" section where you originally selected a Compose file template (Docker section, Podman section), there are links to associated Deepgram configuration files. Click on the appropriate link.

      1. These will take you to a directory with api.toml and engine.toml files, as well as a license-proxy.toml file if you are doing a deployment with the Deepgram License Proxy.
    3. Select each file, click the Raw button, and copy the URL. Download each Deepgram configuration file from its link.

      # Still in the ./config directory
      # You should have multiple links to download, and each link should have the format:
      #   https://raw.githubusercontent.com/deepgram/self-hosted-resources/main/common/**/*.toml
      wget LINK_TO_TOML_CONFIGURATION_FILE_PROVIDED_BY_DEEPGRAM
      
  3. To house models that have been encrypted for use in your requests, in your home directory, create a directory named models:

    # Go back to the parent directory
    cd ..
    # Make the top-level models directory
    mkdir models
    
  4. Your Deepgram Account Representative will provide you with a text file containing a list of download links for Deepgram models (file extension .dg). In your new directory, create a fresh text file and copy over the list of links to the provided models.

    cd models
    touch model_links.txt
    # Edit this file with `vim`, `nano` or the editor of your choice
    

    After editing, your model_links.txt file should look like this:

    https://LINK_TO_MODEL_1.dg
    https://LINK_TO_MODEL_2.dg
    https://LINK_TO_MODEL_3.dg
    ...
    https://LINK_TO_MODEL_N.dg
    
  5. Download all the models specified in your model_links.txt file.

    wget --input-file model_links.txt
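After the downloads finish, you can confirm that every model listed in model_links.txt actually landed on disk. A minimal sketch, run from the models directory (the check_models helper is illustrative, not part of the Deepgram tooling):

```shell
# Sketch: confirm every model listed in the links file was downloaded.
check_models() {
    local links_file="$1" missing=0
    while IFS= read -r url; do
        [ -z "$url" ] && continue
        # The file name is the last path segment of the URL
        local file="${url##*/}"
        if [ ! -f "$file" ]; then
            echo "missing: $file"
            missing=$((missing + 1))
        fi
    done < "$links_file"
    echo "${missing} model(s) missing"
}

# Usage: check_models model_links.txt
```

If any models are reported missing, re-run wget against the corresponding links before continuing.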
    

Selecting Models

Your Deepgram Account Representative will provide you with a list of several different models depending on your use cases. When selecting models to download, consider:

  • Language(s) you will transcribe. Each model file name contains a language code, such as en (English) or es (Spanish).
  • Pre-recorded (batch) or streaming transcription. Look for batch or streaming in the model file name.
  • Whether you prefer smart-formatted ("human-readable", with grammar and punctuation) or non-formatted transcription output. Look for formatted or non-formatted in the model file name.
  • Model architecture(s) you prefer, such as nova-2-general or enhanced.
  • Domain models, such as nova-2-phonecall.
  • Additional features you plan to use, including diarization, keywords, redaction, audio intelligence, etc.
    • Keywords requires the phoneme and g2p models.
    • Sentiment analysis, intent recognition, and topic detection require the sit model.
    • Each of the language detection (language-detector), diarization (diarizer), profanity filtering, summarization, and search features requires its own model.

Putting it all together, the model nova-2-general.en.formatted.streaming.a12b345.dg will transcribe English with streaming audio input, using the Nova-2 General model, with smart-formatted output.
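The fields in a model file name are dot-separated, so they can be pulled apart mechanically. A minimal sketch in bash, using the example name above (a12b345 is a hypothetical build identifier):

```shell
# Sketch: split a model file name into its dot-separated fields
name="nova-2-general.en.formatted.streaming.a12b345.dg"
IFS='.' read -r architecture language formatting mode build extension <<< "$name"
echo "architecture=$architecture language=$language formatting=$formatting mode=$mode"
```

This can be handy for auditing a directory of downloaded .dg files against the use cases listed above.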

📘

Legacy Model File Naming

Models delivered prior to May 2024 may have different naming conventions for the model architecture. For example, they may read as 2-general-nova instead of nova-2-general in the filename.

The underlying model is identical, and both can be called with model=nova-2-general in your API calls.

Customize Your Configuration

Once you have downloaded all provided files to your deployment machine, you need to update your configuration for your specific deployment environment.

Credentials

You will need to have an environment variable DEEPGRAM_API_KEY exported with your self-hosted API key secret. See our Self Service Licensing & Credentials guide for instructions on generating a self-hosted API key for use in this section.

🚧

Per the link above, you will create your self-hosted API key in the API Key tab of Deepgram Console. These keys are not created in the "On-Prem" tab, which is reserved for creating distribution credentials.

Configuration Files

Compose File

The Docker Compose or Podman Compose configuration file makes it possible to spin up the containers using a single command. This makes spinning up a standard POC deployment quick and easy.

Make sure to export your self-hosted API key secret in your deployment environment.

export DEEPGRAM_API_KEY=API_KEY_SECRET

The Compose file will have several placeholder paths that you will need to replace. Paths such as /path/to/models or /path/to/api.toml appear under the volumes section of each container specification. Make sure to update these values to point to the directories and files prepared in the Import section above.
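The placeholders can be edited by hand, or substituted with sed. A minimal sketch, assuming the models and config directories were created under your home directory as in the Import section (the patch_compose helper and the exact placeholder paths in your template may differ, so check the file before and after):

```shell
# Sketch: substitute the placeholder volume paths in a Compose file
patch_compose() {
    sed -i \
        -e "s|/path/to/models|${HOME}/models|g" \
        -e "s|/path/to/api.toml|${HOME}/config/api.toml|g" \
        -e "s|/path/to/engine.toml|${HOME}/config/engine.toml|g" \
        "$1"
}

# Usage: patch_compose docker-compose.yml
```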

api.toml, engine.toml, and license-proxy.toml

The API and Engine containers, and the optional License Proxy container, are configured with TOML configuration files. The templates provided by Deepgram in the deepgram-self-hosted repository contain sane defaults that will work well for most use cases; these need to be mounted to the containers (see the volumes section in the Compose file).

Each configuration value available in these files is described by a header comment. If you have any questions about modifying these files, refer to those comments or reach out to your Deepgram Account Representative.

Testing Your Containers

To make sure your Deepgram self-hosted deployment is properly configured and running, you will want to run the containers and make a sample request.

Start the Deepgram Containers

Now that your configuration files and AI models are set up and in the correct locations, use Docker Compose or Podman Compose to run the containers:

cd config

# Running without elevated privileges
docker compose up -d
# or `podman-compose up -d`

# Running with elevated privileges
sudo --preserve-env=DEEPGRAM_API_KEY docker compose up -d
# or `sudo --preserve-env=DEEPGRAM_API_KEY podman-compose up -d`

📘

If you get an error similar to the following, you may not have the minimum NVIDIA driver version required for Deepgram services to run properly. Please see Drivers and Containerization Platforms for instructions on installing/upgrading to the latest driver version.

ERROR: for engine  Cannot start service engine: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'

nvidia-container-cli: initialization error: driver rpc error: timed out: unknown

ERROR: Encountered errors while bringing up the project.

You can then view the running containers with the container process status command, and optionally view the logs of each container to verify their status.

docker ps
# Take note of the "Container ID" for each Deepgram container
docker logs CONTAINER_ID

🖥️

Replace the placeholder CONTAINER_ID with the Container ID of each container whose logs you would like to inspect more completely.
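On a busy host it can help to narrow the `docker ps` output to just the Deepgram containers before grabbing IDs. A minimal sketch (the filter_deepgram helper is illustrative; it matches on the image names pulled earlier, while your container names depend on your Compose project):

```shell
# Sketch: print only the container ID and name of Deepgram containers
filter_deepgram() {
    awk '/deepgram/ {print $1, $NF}'
}

# Usage: docker ps | filter_deepgram
```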

Networking Considerations

If you are running your API and Engine nodes on separate instances, you may need to add an inbound rule for port 8080 to the API instances' security group, so that port 8080 is reachable from where you are initiating your requests.

Unless you have HTTPS or TLS running on your API instance, construct your Deepgram API endpoint with http://, not https://, and ws://, not wss:// (for instance, http://localhost:8080/v1/listen).

Test Your Deepgram Setup with a Sample Request

Test your environment and container setup with a local file.

  1. Download a sample file from Deepgram (or supply your own file).
    wget https://dpgr.am/bueller.wav
    
  2. Send your audio file to your local Deepgram setup for transcription.
    # If needed, adjust the query parameters to match the directions from your Deepgram Account Representative
    curl -X POST --data-binary @bueller.wav "http://localhost:8080/v1/listen?model=nova-2&smart_format=true"
    

📘

If you're using your own file, make sure to replace bueller.wav with the name of your audio file.

You should receive a JSON response with the transcription and associated metadata. Congratulations - your self-hosted setup is working!
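If you saved the curl output to a file (for example by adding `-o response.json` to the request above), the transcript itself can be pulled out of the JSON with jq. A minimal sketch (the extract_transcript helper and the response.json filename are assumptions for illustration):

```shell
# Sketch: extract the transcript field from a saved Deepgram response
extract_transcript() {
    jq -r '.results.channels[0].alternatives[0].transcript' "$1"
}

# Usage: extract_transcript response.json
```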


What’s Next

Now that you have a basic Deepgram setup working, take some time to learn about building up to a production-level environment, as well as helpful Deepgram add-on services.