Deepgram services are distributed as container images, as described in the Deployment Environments overview. You need to download these images from a container image repository and deploy them. You also need to download the configuration files and AI models that Deepgram provides to you directly.
Before you begin, complete the Deployment Environments guide and all of its sub-guides so that your environment is fully configured.
Use the image repository credentials you generated in the self-service on-prem licensing and credentials guide to log in to Quay in your deployment environment. Once your credentials are cached locally, you should not have to log in again (until after you manually log out).
docker login quay.io
# Complete with login information generated in Deepgram Console
Identify the latest on-prem release in the Deepgram Changelog. Filter by "On-Prem", and select the latest release. You can use either the container image tag or the release tag listed for all images referenced in this documentation.
Download the latest Deepgram Engine image from Quay:
docker pull quay.io/deepgram/onprem-engine:IMAGE_OR_RELEASE_TAG
Download the latest Deepgram API image from Quay:
docker pull quay.io/deepgram/onprem-api:IMAGE_OR_RELEASE_TAG
Be sure to replace the IMAGE_OR_RELEASE_TAG placeholder value with the appropriate tag identified in step 1. Use this tag in all related configuration files.
The Deepgram Changelog may not have a domain prefix for the container images. Ensure that each image you pull has a quay.io domain prefix, as demonstrated in the commands above.
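As a rough sketch of the prefix check described above, a small shell helper can add the quay.io domain to an image name copied from the Changelog before pulling it. The function name qualify_image is hypothetical and is not part of any Deepgram tooling.

```shell
# Hypothetical helper: make sure an image reference carries the quay.io prefix.
qualify_image() {
  case "$1" in
    quay.io/*) printf '%s\n' "$1" ;;          # already fully qualified
    *)         printf 'quay.io/%s\n' "$1" ;;  # add the missing domain prefix
  esac
}

# Example usage (replace the placeholder tag first):
# docker pull "$(qualify_image deepgram/onprem-engine:IMAGE_OR_RELEASE_TAG)"
```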
Before you can run your on-prem deployment, you must configure the required components. To do this, you will need to customize your configuration files and create a directory to house models that have been encrypted for use in your requests.
For your deployment, we provide models and configuration files to you via Amazon S3 buckets, so you can download directly to your deployment machine. If you aren't sure what files are available to you, please ask your Deepgram Account Representative.
In the following steps, make sure to replace placeholder links with the actual links provided in the text file given to you by your Deepgram Account Representative.
To house your configuration files, in your home directory, create a directory named config. This is where you will save your Docker Compose file and various Deepgram configuration files.
Download the docker-compose.yml file listed in the text file given to you by Deepgram:
cd config
wget LINK_TO_YAML_FILE_PROVIDED_BY_DEEPGRAM
Download each of the custom Deepgram configuration files linked in the same text file:
# Still in the ./config directory
wget LINK_TO_TOML_CONFIGURATION_FILE_PROVIDED_BY_DEEPGRAM
To house models that have been encrypted for use in your requests, in your root directory, create a directory named models:
# Go back to the parent directory
cd ..
# Make the top-level models directory
mkdir models
In your new directory, download each model using the links in the text file provided by Deepgram:
cd models
# Do for each model provided
wget LINK_TO_MODEL_PROVIDED_BY_DEEPGRAM
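The download steps above can be sketched as two small shell functions. This is only an illustration: the function names prepare_layout and download_links are hypothetical, and the sketch assumes you have copied the relevant links from the Deepgram text file into your own file, one link per line.

```shell
# Hypothetical helper: create the config/ and models/ directories side by side.
prepare_layout() {
  base="${1:-$HOME}"
  mkdir -p "$base/config" "$base/models"
}

# Hypothetical helper: download every link listed (one per line) in a file.
# Assumes you created that file yourself from the links Deepgram gave you.
download_links() {
  list_file="$1"
  dest_dir="$2"
  while IFS= read -r url || [ -n "$url" ]; do
    [ -n "$url" ] || continue      # skip blank lines
    wget -P "$dest_dir" "$url"     # -P sets the directory the file is saved to
  done < "$list_file"
}

# Example usage:
# prepare_layout "$HOME"
# download_links model-links.txt "$HOME/models"
```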
Once you have downloaded all provided files to your deployment machine, you need to update your configuration for your specific deployment environment.
You will need to have an environment variable DEEPGRAM_API_KEY exported with your on-prem API key secret. See our Self Service Licensing & Credentials guide for instructions on generating an on-prem API key for use in this section.
Per the link above, you will create your on-prem API key in the "API Key" tab of Deepgram Console. API keys are not created in the "On-Prem" tab, which is reserved for creating distribution credentials.
The Docker Compose configuration file makes it possible to spin up the containers using a single command. This makes spinning up a standard POC deployment quick and easy.
Make sure to export your on-prem API key secret in your deployment environment.
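One way to confirm the key is actually exported, rather than merely set in the current shell, is a small pre-flight check like the sketch below. The function name require_api_key and the placeholder value are hypothetical.

```shell
# Hypothetical check: fail early if DEEPGRAM_API_KEY is not exported.
require_api_key() {
  # A child shell only sees exported variables, so this also catches the
  # case where the variable is set but not exported.
  if ! sh -c '[ -n "${DEEPGRAM_API_KEY:-}" ]'; then
    echo "DEEPGRAM_API_KEY is not exported" >&2
    return 1
  fi
}

# Example usage (replace the placeholder with your own secret):
# export DEEPGRAM_API_KEY="YOUR_ON_PREM_API_KEY"
# require_api_key && docker compose up -d
```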
The Docker Compose file will have several placeholder paths that you will need to replace. Paths such as /path/to/api.toml exist under the volumes section of each container specification. Make sure to update these values to point to the directories and files prepared in the above Import section.
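To catch placeholders that were missed, a quick search of the Compose file can help. The sketch below assumes every placeholder shares the /path/to/ prefix shown above; the function name check_placeholders is hypothetical.

```shell
# Hypothetical check: print the lines of a Compose file that still contain
# the /path/to/ placeholder prefix. Returns non-zero if any are found.
check_placeholders() {
  if grep -n '/path/to/' "$1"; then
    echo "Replace the placeholder paths listed above" >&2
    return 1
  fi
}

# Example usage:
# check_placeholders config/docker-compose.yml
```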
The API and Engine images are configured with TOML configuration files. The files provided to you by Deepgram contain sane defaults that will work well for most use cases; these need to be mounted to the containers (see the volumes section in the Docker Compose file).
There are header comments describing each config value available in both of these files. If you have any questions about modifying these files, refer to those comments or reach out to your Deepgram Account Representative.
For customers deploying Deepgram’s on-premises solution in highly available production environments, Deepgram recommends the License Proxy, which is a caching proxy that communicates with the Deepgram-hosted license server to ensure uptime and simplify network security. See the License Proxy guide for more details on the benefits and setup.
If you aren't certain which products your contract includes or if you are interested in adding the License Proxy to your on-premises deployment, please consult your Deepgram Account Representative.
To make sure your Deepgram on-prem deployment is properly configured and running, you will want to run the containers and make a sample request.
Now that you have your configuration files and AI models set up and in the correct location to be used by the containers, use Docker Compose to run the containers:
# Running without elevated privileges
docker compose up -d
# Running with elevated privileges
sudo --preserve-env=DEEPGRAM_API_KEY docker compose up -d
If you get an error similar to the following, you may not have the minimum NVIDIA driver version required for Deepgram services to run properly. Please see Drivers and Containerization Platforms for instructions on installing/upgrading to the latest driver version.
ERROR: for engine Cannot start service engine: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: initialization error: driver rpc error: timed out: unknown
ERROR: Encountered errors while bringing up the project.
You can then view the running containers with the container process status command, and optionally view the logs of each container to verify their status.
docker ps
# Take note of the "Container ID" for each Deepgram container
docker logs CONTAINER_ID
Replace the placeholder CONTAINER_ID with the Container ID of each container whose logs you would like to inspect more completely.
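If you would rather not copy each Container ID by hand, the loop below is one possible shortcut. It is a sketch only: show_recent_logs is a hypothetical name, and the loop covers every running container on the machine, not only the Deepgram ones.

```shell
# Hypothetical helper: print the last few log lines of every running container.
show_recent_logs() {
  lines="${1:-20}"
  for id in $(docker ps -q); do               # -q prints only container IDs
    echo "== container $id =="
    docker logs --tail "$lines" "$id" 2>&1    # container logs may be on stderr
  done
}

# Example usage:
# show_recent_logs 50
```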
Test your environment and container setup with a local file.
- Download a sample file from Deepgram (or supply your own file).
- Send your audio file to your local Deepgram setup for transcription.
# If needed, adjust the query parameters to match the directions from your Deepgram Account Representative
curl -X POST --data-binary @bueller.wav "http://localhost:8080/v1/listen?model=nova-2&smart_format=true"
If you're using your own file, make sure to replace bueller.wav with the name of your audio file.
You should receive a JSON response with the transcription and associated metadata. Congratulations - your on-premises setup is working!
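If you expect to repeat the test against a different host or model, the request URL can be assembled by a small helper. This is a sketch: listen_url is a hypothetical name, and its defaults simply mirror the host, model, and smart_format value used in the sample request above.

```shell
# Hypothetical helper: build the /v1/listen URL for a local deployment.
listen_url() {
  host="${1:-localhost:8080}"
  model="${2:-nova-2}"
  printf 'http://%s/v1/listen?model=%s&smart_format=true\n' "$host" "$model"
}

# Example usage:
# curl -X POST --data-binary @bueller.wav "$(listen_url)"
```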