Deploy Deepgram Services
With Docker/Podman
Deepgram services are distributed as container images, as described in the Deployment Environments overview. We need to download and deploy these images from a container image repository. We also need to download configuration files and AI models that will be given to you directly by Deepgram.
Prerequisites
Before you begin, you will need to complete the Deployment Environments guide, as well as all sub-guides to complete your environment configuration.
You will also need to complete the Self Service Licensing & Credentials guide to authenticate your products with Deepgram's licensing servers and pull Deepgram container images from Quay.
Get Deepgram Products
Cache Container Image Repository Credentials
Use the image repository credentials you generated in the Self Service Licensing & Credentials guide to log in to Quay on your deployment environment. Once your credentials are cached locally, you should not have to log in again unless you manually log out.
# Complete with login information generated in Deepgram Console
docker login quay.io
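For scripted setups, the interactive prompt can be avoided. The following is a minimal sketch, assuming the distribution credentials are held in two environment variables named QUAY_USERNAME and QUAY_PASSWORD (hypothetical names chosen for this example). Passing the password on stdin keeps it out of the command line.

```shell
# Sketch only: QUAY_USERNAME and QUAY_PASSWORD are hypothetical variable names
# holding the distribution credentials generated in Deepgram Console.
quay_login() {
  # Stop with a clear message if either variable is unset or empty.
  : "${QUAY_USERNAME:?set QUAY_USERNAME first}"
  : "${QUAY_PASSWORD:?set QUAY_PASSWORD first}"
  # Pipe the password on stdin so it never appears as a command-line argument.
  printf '%s' "$QUAY_PASSWORD" | docker login quay.io --username "$QUAY_USERNAME" --password-stdin
}
```

Call `quay_login` once per deployment machine after setting both variables.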
Pull Deepgram container images
1. Identify the latest self-hosted release in the Deepgram Changelog. Filter by "Self-Hosted", and select the latest release. You can use either the container image tag or the release tag listed for all images referenced in this documentation.
2. Pull the latest Deepgram Engine image:
   docker pull quay.io/deepgram/self-hosted-engine:IMAGE_OR_RELEASE_TAG
3. Pull the latest Deepgram API image:
   docker pull quay.io/deepgram/self-hosted-api:IMAGE_OR_RELEASE_TAG
Be sure to replace the IMAGE_OR_RELEASE_TAG placeholder value with the appropriate tag identified in step 1. Use this tag in all related configuration files.
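Because both images should use the same tag, it can help to pull them through one small helper. This is a sketch rather than an official script; the function name `pull_deepgram_images` is made up for this example, and the image names are the two shown above.

```shell
# Sketch only: pull the Engine and API images with a single tag so they stay in sync.
pull_deepgram_images() {
  tag="${1:?usage: pull_deepgram_images IMAGE_OR_RELEASE_TAG}"
  for image in self-hosted-engine self-hosted-api; do
    # Stop at the first failed pull instead of continuing with a partial set.
    docker pull "quay.io/deepgram/${image}:${tag}" || return 1
  done
}
```

Usage: `pull_deepgram_images IMAGE_OR_RELEASE_TAG`, with the placeholder replaced by the tag from the Changelog.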
Choosing A Deployment Type
For customers deploying Deepgram’s self-hosted solution in highly available production environments, Deepgram recommends the License Proxy, which is a caching proxy that communicates with the Deepgram-hosted license server to ensure uptime and simplify network security. If you aren't certain which products your contract includes or if you are interested in adding the License Proxy to your self-hosted deployment, please consult your Deepgram Account Representative.
If your project is authorized to use the License Proxy, in the following section you may choose to use either a standard deployment or a deployment with the License Proxy.
See the License Proxy guide for more details on the benefits and setup.
Import Your Docker Compose, Container Configuration, and Model Files
Before you can run your self-hosted deployment, you must configure the required components. To do this, you will need to customize your configuration files and create a directory to house models that have been encrypted for use in your requests.
In the following steps, make sure to replace placeholder links with the actual links provided in the text file given to you by your Deepgram Account Representative.
1. To house your configuration files, create a directory named config in your home directory. This is where you will save your Docker Compose file and various Deepgram configuration files.
   mkdir config
2. Download template files from the Deepgram self-hosted-resources repo. There are separate sections for Docker and Podman.
   - Choose a Compose file template for either a standard deployment, or a deployment with the License Proxy if it is enabled for your project. While viewing your desired template, click the Raw button and copy the URL. Use the URL to download the Compose file template into your self-hosted environment.
     cd config
     # LINK_TO_COMPOSE_FILE_TEMPLATE should look like the following:
     # https://raw.githubusercontent.com/deepgram/self-hosted-resources/main/...
     wget LINK_TO_COMPOSE_FILE_TEMPLATE
   - In the "Contents" section where you originally selected a Compose file template (Docker section, Podman section), there are links to associated Deepgram configuration files. Click on the appropriate link. It will take you to a directory with api.toml and engine.toml files, as well as a license-proxy.toml file if you are doing a deployment with the Deepgram License Proxy.
   - Select each file, click the Raw button, and copy the URL. Download each of the Deepgram configuration files.
     # Still in the ./config directory
     # You should have multiple links to download, and each link should have the format:
     # https://raw.githubusercontent.com/deepgram/self-hosted-resources/main/common/**/*.toml
     wget LINK_TO_TOML_CONFIGURATION_FILE_PROVIDED_BY_DEEPGRAM
3. To house models that have been encrypted for use in your requests, create a directory named models in the parent directory, alongside config:
   # Go back to the parent directory
   cd ..
   # Make the top-level models directory
   mkdir models
4. Your Deepgram Account Representative will provide you with a text file containing a list of download links for Deepgram models (file extension .dg). In your new directory, create a fresh text file and copy over the list of links to the provided models.
   cd models
   touch model_links.txt
   # Edit this file with `vim`, `nano` or the editor of your choice
   After editing, your model_links.txt file should look like this:
   https://LINK_TO_MODEL_1.dg
   https://LINK_TO_MODEL_2.dg
   https://LINK_TO_MODEL_3.dg
   ...
   https://LINK_TO_MODEL_N.dg
5. Download all the models specified in your model_links.txt file.
   wget --input-file model_links.txt
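After a long download it is worth confirming that every link produced a file. The helper below is a minimal sketch, assuming each model is saved under the last path component of its URL (wget's default naming); `missing_models` is a hypothetical name chosen for this example.

```shell
# Sketch only: print the file name of every link in the list that has no
# matching non-empty file in the given directory.
missing_models() {
  links_file="$1"
  models_dir="${2:-.}"
  while IFS= read -r url || [ -n "$url" ]; do
    # Skip blank lines.
    [ -z "$url" ] && continue
    # Assumption: the saved file is named after the last path component of the URL.
    name="${url##*/}"
    [ -s "$models_dir/$name" ] || printf '%s\n' "$name"
  done < "$links_file"
}
```

Run `missing_models model_links.txt` from inside the models directory; no output means every listed model is present.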
Selecting Models
Your Deepgram Account Representative provides you with a list of models suited to your use cases. When selecting models to download, consider:
- Language(s) you will transcribe. Each model file name contains a language code, such as en (English) or es (Spanish).
- Pre-recorded (batch) or streaming transcription. Look for batch or streaming in the model file name.
- Whether you prefer smart-formatted ("human-readable", with grammar and punctuation) or non-formatted transcription output. Look for formatted or non-formatted in the model file name.
- Model architecture(s) you prefer, such as nova-2-general or enhanced.
- Domain models, such as nova-2-phonecall.
- Additional features you plan to use, including diarization, keywords, redaction, audio intelligence, etc.
  - Keywords requires the phoneme and g2p models.
  - Sentiment analysis, intent recognition, and topic detection require the sit model.
  - Each of the language detection (language-detector), diarization (diarizer), profanity filtering, summarization, and search features requires its own model.
Putting it all together, the model nova-2-general.en.formatted.streaming.a12b345.dg will transcribe English with streaming audio input, using the Nova-2 General model, with smart-formatted output.
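To check a downloaded file against these criteria, the name can be split on its dots. This is a sketch under one assumption: the field order (architecture, language, formatting, mode, build identifier, extension) is inferred from the single example above, and other model files may not follow it. The function name `describe_model` is hypothetical.

```shell
# Sketch only: split a model file name into its dot-separated fields.
# The field order is an assumption inferred from one example file name.
describe_model() {
  IFS=. read -r arch lang format mode build ext <<EOF
$1
EOF
  printf 'architecture=%s language=%s output=%s mode=%s\n' "$arch" "$lang" "$format" "$mode"
}
```

For the example file name, `describe_model nova-2-general.en.formatted.streaming.a12b345.dg` prints `architecture=nova-2-general language=en output=formatted mode=streaming`.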
Legacy Model File Naming
Models delivered prior to May 2024 may have different naming conventions for the model architecture. For example, they may read as 2-general-nova instead of nova-2-general in the filename. The underlying model is identical, and both can be called with model=nova-2-general in your API calls.
Customize Your Configuration
Once you have downloaded all provided files to your deployment machine, you need to update your configuration for your specific deployment environment.
Credentials
You will need to export the DEEPGRAM_API_KEY environment variable, set to your self-hosted API key secret. See our Self Service Licensing & Credentials guide for instructions on generating a self-hosted API key for use in this section.
Per the guide above, you will create your self-hosted API key in the API Key tab of Deepgram Console. Self-hosted API keys are not created in the "On-Prem" tab, which is reserved for creating distribution credentials.
Configuration Files
Compose File
The Docker Compose or Podman Compose configuration file makes it possible to spin up the containers using a single command. This makes spinning up a standard POC deployment quick and easy.
Make sure to export your self-hosted API key secret in your deployment environment.
export DEEPGRAM_API_KEY=API_KEY_SECRET
The Compose file will have several placeholder paths that you will need to replace. Paths such as /path/to/models or /path/to/api.toml exist under the volumes section of each container specification. Make sure to update these values to point to the directories and files prepared in the above Import section.
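A quick check before starting the containers can catch a forgotten placeholder or a missing key. The following is a minimal sketch; `compose_preflight` is a hypothetical helper name, and the Compose file name is passed as an argument because it depends on the template you chose.

```shell
# Sketch only: report leftover template placeholders and a missing API key.
compose_preflight() {
  compose_file="${1:?usage: compose_preflight COMPOSE_FILE}"
  status=0
  # Any remaining "/path/to/" string is a placeholder that was not replaced.
  if grep -n '/path/to/' "$compose_file"; then
    echo "Replace the placeholder paths listed above." >&2
    status=1
  fi
  # Checks that the key is set and non-empty (not that it was exported).
  if [ -z "${DEEPGRAM_API_KEY:-}" ]; then
    echo "DEEPGRAM_API_KEY is not set." >&2
    status=1
  fi
  return "$status"
}
```

The function returns 0 only when no placeholder remains and the key is set, so it can gate a later `docker compose up -d`.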
api.toml, engine.toml, and license-proxy.toml
The API and Engine containers, and the optional License Proxy container, are configured with TOML configuration files. The templates provided by Deepgram in the self-hosted-resources repository contain sane defaults that will work well for most use cases; these need to be mounted to the containers (see the volumes section in the Compose file).
There are header comments describing each config value available in each of these files. If you have any questions about modifying these files, refer to those comments or reach out to your Deepgram Account Representative.
Testing Your Containers
To make sure your Deepgram self-hosted deployment is properly configured and running, you will want to run the containers and make a sample request.
Start the Deepgram Containers
Now that you have your configuration files and AI models set up and in the correct location to be used by the containers, use Docker Compose to run the containers:
cd config
# Running without elevated privileges
docker compose up -d
# or `podman-compose up -d`
# Running with elevated privileges
sudo --preserve-env=DEEPGRAM_API_KEY docker compose up -d
# or `sudo --preserve-env=DEEPGRAM_API_KEY podman-compose up -d`
If you get an error similar to the following, you may not have the minimum NVIDIA driver version required for Deepgram services to run properly. Please see Drivers and Containerization Platforms for instructions on installing/upgrading to the latest driver version.
ERROR: for engine Cannot start service engine: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: driver rpc error: timed out: unknown
ERROR: Encountered errors while bringing up the project.
You can then view the running containers with the container process status command, and optionally view the logs of each container to verify their status.
docker ps
# Take note of the "Container ID" for each Deepgram container
docker logs CONTAINER_ID
Replace the placeholder CONTAINER_ID with the Container ID of each container whose logs you would like to inspect more completely.
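To avoid copying each Container ID by hand, the two commands above can be combined in a loop. This is a sketch only; `tail_all_logs` is a hypothetical helper, and it covers every running container on the host, so add a filter if the machine runs containers other than Deepgram's.

```shell
# Sketch only: print the last few log lines of every running container.
tail_all_logs() {
  lines="${1:-20}"
  docker ps --format '{{.ID}} {{.Names}}' | while read -r id name; do
    echo "== $name ($id) =="
    # Redirect stdin so the log command cannot consume the container list.
    docker logs --tail "$lines" "$id" 2>&1 </dev/null
  done
}
```

Calling `tail_all_logs 50` shows the last 50 lines per container; the default is 20.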
Networking Considerations
If you are running your API and Engine nodes on separate instances, you may need to add an inbound rule for port 8080 to the API instances' security group, so that port 8080 is reachable from where you are initiating your requests.
Unless you have HTTPS or TLS running on your API instance, construct your Deepgram API endpoint with http://, not https://, and ws://, not wss:// (for instance, http://localhost:8080/v1/listen).
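The scheme rule above can be captured in a small helper so that scripts do not hard-code the wrong prefix. This is a sketch; the function name and its `rest`/`stream` and `plain`/`tls` arguments are hypothetical, and the /v1/listen path is taken from the example above.

```shell
# Sketch only: choose the URL scheme from the request type and whether TLS
# is enabled on the API instance.
listen_endpoint() {
  host="$1"; kind="$2"; security="${3:-plain}"
  case "$kind:$security" in
    rest:plain)   scheme=http ;;
    rest:tls)     scheme=https ;;
    stream:plain) scheme=ws ;;
    stream:tls)   scheme=wss ;;
    *) echo "usage: listen_endpoint HOST rest|stream [plain|tls]" >&2; return 2 ;;
  esac
  printf '%s://%s/v1/listen\n' "$scheme" "$host"
}
```

For a default local deployment without TLS, `listen_endpoint localhost:8080 rest` prints `http://localhost:8080/v1/listen`.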
Test Your Deepgram Setup with a Sample Request
Test your environment and container setup with a local file.
- Download a sample file from Deepgram (or supply your own file).
wget https://dpgr.am/bueller.wav
- Send your audio file to your local Deepgram setup for transcription.
# If needed, adjust the query parameters to match the directions from your Deepgram Account Representative
curl -X POST --data-binary @bueller.wav "http://localhost:8080/v1/listen?model=nova-2&smart_format=true"
If you're using your own file, make sure to replace bueller.wav with the name of your audio file.
You should receive a JSON response with the transcription and associated metadata. Congratulations - your self-hosted setup is working!
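When the request is part of a script, checking the HTTP status is more reliable than reading the output by eye. The following is a minimal sketch that wraps the same curl request; `transcribe_check` and the response.json output file are names chosen for this example, not part of Deepgram's tooling.

```shell
# Sketch only: send an audio file and report the HTTP status code.
# The response body is saved to response.json in the current directory.
transcribe_check() {
  audio="${1:-bueller.wav}"
  url="${2:-http://localhost:8080/v1/listen?model=nova-2&smart_format=true}"
  code=$(curl --silent --show-error --output response.json \
    --write-out '%{http_code}' -X POST --data-binary "@$audio" "$url") || return 1
  echo "HTTP $code"
  # Succeed only when the API answered with 200.
  [ "$code" = "200" ]
}
```

Run `transcribe_check` with no arguments to repeat the sample request above, then inspect response.json for the transcription.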
Now that you have a basic Deepgram setup working, take some time to learn about building up to a production-level environment, as well as helpful Deepgram add-on services.