Deploy Deepgram Services

With Docker/Podman

Deepgram services are distributed as container images, as described in the Deployment Environments overview. You will pull these images from a container image repository and deploy them. You will also download the configuration files and AI models that Deepgram provides to you directly.

Prerequisites

Before you begin, you will need to complete the Deployment Environments guide, as well as all sub-guides to complete your environment configuration.

You will also need to complete the Self Service Licensing & Credentials guide to authenticate your products with Deepgram's licensing servers and pull Deepgram container images from Quay.

Get Deepgram Products

Cache Container Image Repository Credentials

Use the image repository credentials you generated in the Self Service Licensing & Credentials guide to log in to Quay on your deployment environment. Once your credentials are cached locally, you should not have to log in again until you manually log out.

docker login quay.io
# Complete with login information generated in Deepgram Console
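
For scripted or unattended setups, the same login can be done without an interactive prompt. The sketch below is one way to do it, assuming the credentials are held in two environment variables; the names QUAY_USERNAME and QUAY_PASSWORD are hypothetical and not part of any Deepgram convention.

```shell
# Log in to quay.io without an interactive prompt.
# QUAY_USERNAME and QUAY_PASSWORD are hypothetical variable names; set them
# to the distribution credentials generated in Deepgram Console.
quay_login() {
    # --password-stdin keeps the secret out of the process list and shell history
    printf '%s' "$QUAY_PASSWORD" |
        docker login --username "$QUAY_USERNAME" --password-stdin quay.io
}

# Usage (after exporting both variables):
#   quay_login
```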

Pull Deepgram container images

  1. Identify the latest on-prem release in the Deepgram Changelog. Filter by "On-Prem", and select the latest release. You can use either the container image tag or the release tag listed for all images referenced in this documentation.

  2. Download the latest Deepgram Engine image from Quay:

    docker pull quay.io/deepgram/onprem-engine:IMAGE_OR_RELEASE_TAG
    
  3. Download the latest Deepgram API image from Quay:

    docker pull quay.io/deepgram/onprem-api:IMAGE_OR_RELEASE_TAG
    

🖥️

Be sure to replace the IMAGE_OR_RELEASE_TAG placeholder value with the appropriate tag identified in step 1. Use this tag in all related configuration files.

📘

The Deepgram Changelog may not have a domain prefix for the container images. Ensure that each image you pull has a quay.io domain prefix, as demonstrated in the commands above.
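
The two pulls and the quay.io prefix check can be combined into a small helper. This is a minimal sketch; the function names qualify_image and pull_deepgram_images are illustrative, and the image names are the ones used in the commands above.

```shell
# Prefix an image reference with quay.io/ unless it already has that prefix.
qualify_image() {
    case "$1" in
        quay.io/*) printf '%s\n' "$1" ;;
        *)         printf 'quay.io/%s\n' "$1" ;;
    esac
}

# Pull the Engine and API images for a single tag; stop at the first failure.
pull_deepgram_images() {
    tag="$1"
    for image in deepgram/onprem-engine deepgram/onprem-api; do
        docker pull "$(qualify_image "$image:$tag")" || return 1
    done
}

# Usage:
#   pull_deepgram_images IMAGE_OR_RELEASE_TAG
```

Using one tag variable for both images helps keep the Engine and API on the same release.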

Import Your Docker Compose, Container Configuration, and Model Files

Before you can run your on-prem deployment, you must configure the required components. To do this, you will need to customize your configuration files and create a directory to house models that have been encrypted for use in your requests.

For your deployment, we provide models and configuration files to you via Amazon S3 buckets, so you can download directly to your deployment machine. If you aren't sure what files are available to you, please ask your Deepgram Account Representative.

🖥️

In the following steps, make sure to replace placeholder links with the actual links provided in the text file given to you by your Deepgram Account Representative.

  1. To house your configuration files, in your home directory, create a directory named config. This is where you will save your Docker Compose file and various Deepgram configuration files.

    mkdir config
    
  2. Download the docker-compose.yml file listed in the text file given to you by Deepgram:

    cd config
    wget LINK_TO_YAML_FILE_PROVIDED_BY_DEEPGRAM
    
  3. Download each of the custom Deepgram configuration files linked in the same text file:

    # Still in the ./config directory
    wget LINK_TO_TOML_CONFIGURATION_FILE_PROVIDED_BY_DEEPGRAM
    
  4. To house models that have been encrypted for use in your requests, create a directory named models in your home directory, next to the config directory:

    # Go back to the parent directory
    cd ..
    # Make the top-level models directory
    mkdir models
    
  5. In your new directory, create a fresh text file with a list of links to the provided models (file extension .dg).

    cd models
    touch model_links.txt
    # Edit this file with `vim`, `nano` or the editor of your choice
    

    After editing, your model_links.txt file should look like this:

    https://LINK_TO_MODEL_1.dg
    https://LINK_TO_MODEL_2.dg
    https://LINK_TO_MODEL_3.dg
    ...
    https://LINK_TO_MODEL_N.dg
    
  6. Download all the models specified in your model_links.txt file.

    wget --input-file model_links.txt
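
After the download finishes, it can be useful to confirm that every listed model actually arrived. The sketch below compares model_links.txt against the files on disk. It assumes each file was saved under the last path segment of its URL and that the links carry no query string; the function name missing_models is illustrative.

```shell
# Print the name of every model listed in a links file that is not present
# in the given directory. Returns 1 if any model is missing, 0 otherwise.
# Assumption: each download is saved under the last path segment of its URL.
missing_models() {
    links_file="$1"
    models_dir="$2"
    status=0
    while IFS= read -r url || [ -n "$url" ]; do
        [ -n "$url" ] || continue      # skip blank lines
        name="${url##*/}"              # keep only the last path segment
        if [ ! -f "$models_dir/$name" ]; then
            printf '%s\n' "$name"
            status=1
        fi
    done < "$links_file"
    return "$status"
}

# Usage, from inside the models directory:
#   missing_models model_links.txt .
```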
    

Selecting Models

Your Deepgram Account Representative provides you with a list of models suited to your use cases. When selecting models to download, consider:

  • Language(s) you will transcribe. Each model file name contains a language code, such as en (English) or es (Spanish).
  • Pre-recorded (batch) or streaming transcription. Look for batch or streaming in the model file name.
  • Whether you prefer smart-formatted ("human-readable", with grammar and punctuation) or non-formatted transcription output. Look for formatted or non-formatted in the model file name.
  • Model architecture(s) you prefer, such as nova-2-general or enhanced.
  • Domain models, such as nova-2-phonecall.
  • Additional features you plan to use, including diarization, keywords, redaction, audio intelligence, etc.
    • Keywords requires the phoneme and g2p models.
    • Sentiment analysis, intent recognition, and topic detection require the sit model.
    • Language detection, diarization, profanity filtering, summarization, and search each require their own model.

Putting it all together, the model nova-2-general.en.formatted.streaming.a12b345.dg will transcribe English with streaming audio input, using the Nova-2 General model, with smart-formatted output.
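
To read a directory of model files at a glance, the dot-separated name can be split into labeled fields. This is a sketch that assumes the file follows the same five-field pattern as the example above; other models may be named differently. The label build_id is a hypothetical name for the segment before .dg, which this guide does not describe.

```shell
# Split a model file name into its dot-separated fields and label them.
# Assumes the pattern ARCHITECTURE.LANGUAGE.FORMATTING.MODE.ID.dg;
# "build_id" is a hypothetical label for the fifth field.
describe_model() (
    set -f             # disable globbing while the name is split
    IFS=.
    set -- $1          # intentionally unquoted: split on "."
    printf 'architecture=%s language=%s formatting=%s mode=%s build_id=%s\n' \
        "$1" "$2" "$3" "$4" "$5"
)

# Usage:
#   describe_model nova-2-general.en.formatted.streaming.a12b345.dg
```

The function body runs in a subshell, so the changes to IFS and globbing do not leak into the calling shell.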

📘

Legacy Model File Naming

Models delivered prior to May 2024 may have different naming conventions for the model architecture. For example, they may read as 2-general-nova instead of nova-2-general in the filename.

The underlying model is identical, and both can be called with model=nova-2-general in your API calls.

Customize Your Configuration

Once you have downloaded all provided files to your deployment machine, you need to update your configuration for your specific deployment environment.

Credentials

You will need to have an environment variable DEEPGRAM_API_KEY exported with your on-prem API key secret. See our Self Service Licensing & Credentials guide for instructions on generating an on-prem API key for use in this section.

🚧

Per the link above, you will create your on-prem API key in the "API Key" tab of Deepgram Console. On-prem API keys are not created in the "On-Prem" tab, which is reserved for creating distribution credentials.

Configuration Files

docker-compose.yml

The Docker Compose configuration file makes it possible to spin up the containers using a single command. This makes spinning up a standard POC deployment quick and easy.

Make sure to export your on-prem API key secret in your deployment environment.

export DEEPGRAM_API_KEY=API_KEY_SECRET
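
A variable that is set but not exported is invisible to Docker Compose, which runs as a child process. The following sketch checks for that case before you start the containers; the function name check_api_key is illustrative.

```shell
# Fail early if DEEPGRAM_API_KEY is not visible to child processes.
# `sh -c` starts a child process, so the check passes only when the variable
# is exported and non-empty, not merely set in the current shell.
check_api_key() {
    if sh -c '[ -n "${DEEPGRAM_API_KEY:-}" ]'; then
        echo "DEEPGRAM_API_KEY is exported"
    else
        echo "DEEPGRAM_API_KEY is missing; run: export DEEPGRAM_API_KEY=API_KEY_SECRET" >&2
        return 1
    fi
}
```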

The Docker Compose file will have several placeholder paths that you will need to replace. Paths such as /path/to/models or /path/to/api.toml exist under the volumes section of each container specification. Make sure to update these values to point to the directories and files prepared in the above Import section.
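
A quick search can confirm that no placeholder path was overlooked. This sketch assumes every placeholder begins with /path/to/, as in the examples above; the function name remaining_placeholders is illustrative.

```shell
# List lines in a Compose file that still contain a /path/to/ placeholder.
# Prints "line-number:content" for each match.
# Returns 0 when none remain, 1 when placeholders are found, 2 if the file is missing.
remaining_placeholders() {
    if [ ! -f "$1" ]; then
        echo "file not found: $1" >&2
        return 2
    fi
    if grep -n '/path/to/' "$1"; then
        return 1
    fi
    return 0
}

# Usage, from inside the config directory:
#   remaining_placeholders docker-compose.yml
```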

api.toml and engine.toml

The API and Engine images are configured with TOML configuration files. The files provided to you by Deepgram contain sane defaults that will work well for most use cases; these need to be mounted to the containers (see the volumes section in the docker-compose.yml file).

There are header comments describing each config value available in both of these files. If you have any questions about modifying these files, refer to those comments or reach out to your Deepgram Account Representative.

Adding the License Proxy

For customers deploying Deepgram’s on-premises solution in highly available production environments, Deepgram recommends the License Proxy, which is a caching proxy that communicates with the Deepgram-hosted license server to ensure uptime and simplify network security. See the License Proxy guide for more details on the benefits and setup.

📘

If you aren't certain which products your contract includes or if you are interested in adding the License Proxy to your on-premises deployment, please consult your Deepgram Account Representative.

Testing Your Containers

To make sure your Deepgram on-prem deployment is properly configured and running, you will want to run the containers and make a sample request.

Start the Deepgram Containers

Now that your configuration files and AI models are set up and in the locations the containers expect, use Docker Compose to run the containers:

cd config
# Running without elevated privileges
docker compose up -d
# Running with elevated privileges
sudo --preserve-env=DEEPGRAM_API_KEY docker compose up -d

📘

If you get an error similar to the following, you may not have the minimum NVIDIA driver version required for Deepgram services to run properly. Please see Drivers and Containerization Platforms for instructions on installing/upgrading to the latest driver version.

ERROR: for engine  Cannot start service engine: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'

nvidia-container-cli: initialization error: driver rpc error: timed out: unknown

ERROR: Encountered errors while bringing up the project.

You can then view the running containers with the container process status command, and optionally view the logs of each container to verify their status.

docker ps
# Take note of the "Container ID" for each Deepgram container
docker logs CONTAINER_ID

🖥️

Replace the placeholder CONTAINER_ID with the Container ID of each container whose logs you would like to inspect more completely.
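
As an alternative to looking up container IDs, Docker Compose can report status and logs for every service defined in the Compose file. A minimal sketch, to be run from the directory that contains docker-compose.yml; the function name and the default of 50 lines are arbitrary choices.

```shell
# Show container status, then the most recent log lines of each service.
#   $1: number of log lines per service (default 50)
show_deepgram_status() {
    docker compose ps
    docker compose logs --tail "${1:-50}"
}

# Usage:
#   cd config
#   show_deepgram_status 20
```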

Networking Considerations

If you are running your API and Engine nodes on separate instances, you may need to add an inbound rule for port 8080 to the API instances' security group, so that port 8080 is reachable from where you are initiating your requests.

Unless you have HTTPS or TLS running on your API instance, construct your Deepgram API endpoint with http://, not https://, and ws://, not wss:// (for instance, http://localhost:8080/v1/listen).
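
The scheme rules above can be captured in a small helper so that scripts do not mix http:// with wss:// by accident. This is a sketch; the function name and its arguments are illustrative, and it assumes streaming requests use the same /v1/listen path as pre-recorded requests.

```shell
# Build a Deepgram endpoint URL with the scheme that matches the request
# type and whether TLS is enabled on the API instance.
#   $1: "rest" or "stream"   $2: "plain" or "tls"   $3: HOST:PORT
deepgram_endpoint() {
    case "$1:$2" in
        rest:plain)   scheme=http ;;
        rest:tls)     scheme=https ;;
        stream:plain) scheme=ws ;;
        stream:tls)   scheme=wss ;;
        *)
            echo "usage: deepgram_endpoint rest|stream plain|tls HOST:PORT" >&2
            return 2
            ;;
    esac
    printf '%s://%s/v1/listen\n' "$scheme" "$3"
}

# Usage:
#   deepgram_endpoint rest plain localhost:8080
```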

Test Your Deepgram Setup with a Sample Request

Test your environment and container setup with a local file.

  1. Download a sample file from Deepgram (or supply your own file).
    wget https://dpgr.am/bueller.wav
    
  2. Send your audio file to your local Deepgram setup for transcription.
    # If needed, adjust the query parameters to match the directions from your Deepgram Account Representative
    curl -X POST --data-binary @bueller.wav "http://localhost:8080/v1/listen?model=nova-2&smart_format=true"
    

📘

If you're using your own file, make sure to replace bueller.wav with the name of your audio file.

You should receive a JSON response with the transcription and associated metadata. Congratulations - your on-premises setup is working!
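
If the request is sent immediately after the containers start, the API may not be ready to answer yet. The sketch below retries the same sample request until it returns HTTP 200. The retry count, the delay, and the function name are arbitrary choices, and the URL is the one from step 2.

```shell
# Retry the sample request until the API answers with HTTP 200.
#   $1: audio file   $2: max attempts (default 10)   $3: seconds between attempts (default 5)
wait_for_transcription() {
    file="$1"
    attempts="${2:-10}"
    delay="${3:-5}"
    url="http://localhost:8080/v1/listen?model=nova-2&smart_format=true"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # --write-out prints only the status code; the response body is discarded
        code=$(curl --silent --output /dev/null --write-out '%{http_code}' \
            -X POST --data-binary "@$file" "$url") || code=000
        if [ "$code" = "200" ]; then
            echo "attempt $i: HTTP 200"
            return 0
        fi
        echo "attempt $i: HTTP $code, retrying in ${delay}s" >&2
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Usage:
#   wait_for_transcription bueller.wav
```

Once the function succeeds, rerun the curl command from step 2 to see the full JSON response.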


What’s Next

Now that you have a basic Deepgram setup working, take some time to learn about building up to a production-level environment, as well as helpful Deepgram add-on services.