Deploy Deepgram Services

Deepgram services are distributed as container images, as described in the Deployment Environments overview. You will need to download and deploy these images from a container image repository, along with the configuration files and AI models that Deepgram provides to you directly.

Prerequisites

Before you begin, you will need to complete the Deployment Environments guide, as well as all of its sub-guides, to finish configuring your environment.

You will also need to complete the Self Service Licensing & Credentials guide to authenticate your products with Deepgram's licensing servers and pull Deepgram container images from Quay.

Get Deepgram Products

Cache Container Image Repository Credentials

Use the image repository credentials you generated in the Self Service Licensing & Credentials guide to log in to Quay on your deployment environment. Once your credentials are cached locally, you should not have to log in again until you manually log out.

Pull Deepgram container images

  1. Identify the latest on-prem release in the Deepgram Changelog. Filter by "On-Prem", and select the latest release. You can use either the container image tag or the release tag listed for all images referenced in this documentation.

  2. Pull the latest Deepgram Engine image from Quay:

    docker pull quay.io/deepgram/onprem-engine:IMAGE_OR_RELEASE_TAG
    
  3. Pull the latest Deepgram API image from Quay:

    docker pull quay.io/deepgram/onprem-api:IMAGE_OR_RELEASE_TAG
    

🖥️

Be sure to replace the IMAGE_OR_RELEASE_TAG placeholder value with the appropriate tag identified in step 1. Use this tag in all related configuration files.

📘

The Deepgram Changelog may not have a domain prefix for the container images. Ensure that each image you pull has a quay.io domain prefix, as demonstrated in the commands above.
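One way to keep both pulls (and your later configuration files) on the same version is to set the tag once in a shell variable. The tag value below is a placeholder for illustration; use the tag you identified in the Changelog:

```shell
# Hypothetical tag shown for illustration only; substitute the container
# image tag or release tag from the Deepgram Changelog.
DEEPGRAM_TAG="release-YYMMDD-placeholder"

# Reusing one variable keeps the API and Engine images on matching
# versions and guarantees the quay.io domain prefix is present:
echo "quay.io/deepgram/onprem-api:${DEEPGRAM_TAG}"
echo "quay.io/deepgram/onprem-engine:${DEEPGRAM_TAG}"
```

You can then run `docker pull` against each echoed image reference.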

Import Your Docker Compose, Container Configuration, and Model Files

Before you can run your on-prem deployment, you must configure the required components. To do this, you will need to customize your configuration files and create a directory to house models that have been encrypted for use in your requests.

For your deployment, we provide models and configuration files to you via Amazon S3 buckets, so you can download them directly to your deployment machine. If you don't have customized configuration files, you can create configuration files using the examples included in Customize Your Configuration.

  1. To house your configuration files, in your root directory, create a directory named config. This is where you will save your Docker Compose file and various Deepgram configuration files.

    mkdir config
    
  2. Download the docker-compose.yml file provided by Deepgram:

    cd config
    wget LINK_TO_YAML_FILE_PROVIDED_BY_DEEPGRAM
    

    🖥️

    Be sure to replace the LINK_TO_YAML_FILE_PROVIDED_BY_DEEPGRAM placeholder value with the URL to the docker-compose.yml file provided by your Deepgram Account Representative.

  3. If you have been provided with custom Deepgram configuration file links, download each:

    cd config
    wget LINK_TO_TOML_CONFIGURATION_FILE_PROVIDED_BY_DEEPGRAM
    

    🖥️

    Be sure to replace each LINK_TO_TOML_CONFIGURATION_FILE_PROVIDED_BY_DEEPGRAM placeholder value with the URL to the TOML configuration file provided by your Deepgram Account Representative.

  4. To house models that have been encrypted for use in your requests, in your root directory, create a directory named models:

    mkdir models
    

    ℹ️

    You can name this directory whatever you like, as long as you update the docker-compose.yml and engine.toml files accordingly.

  5. In your new directory, download each model using the links provided by Deepgram:

    cd models
    wget LINK_TO_MODEL_PROVIDED_BY_DEEPGRAM
    

    🖥️

    Be sure to replace each LINK_TO_MODEL_PROVIDED_BY_DEEPGRAM placeholder value with the URL to the model provided by your Deepgram Account Representative.
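Put together, steps 1–5 above leave a directory layout like the following, using the default directory names from this guide:

```shell
# Create both directories from your root (working) directory.
mkdir -p config models

# config/ holds docker-compose.yml and any TOML configuration files;
# models/ holds the encrypted model files downloaded from Deepgram.
ls -d config models
```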

Customize Your Configuration

Once you have downloaded all provided files to your deployment machine, you can update your configuration to customize it for your use case.

Credentials

You will need to have an environment variable DEEPGRAM_API_KEY exported with your on-prem API key secret. See our Self Service Licensing & Credentials guide for instructions on generating an on-prem API key for use in this section.

🚧

Per the link above, you will create your on-prem API key in the "API Key" tab of Deepgram Console. These keys are not created in the "On-Prem" tab, which is reserved for creating distribution credentials.
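Before bringing the services up, you can verify that the DEEPGRAM_API_KEY variable is actually exported in your current shell. The key value below is a placeholder, not a real secret:

```shell
# Placeholder secret for illustration only; use the on-prem API key
# secret you generated in Deepgram Console.
export DEEPGRAM_API_KEY="placeholder_api_key_secret"

# Fail fast if the variable is missing or empty before running
# `docker compose up`:
[ -n "${DEEPGRAM_API_KEY}" ] && echo "DEEPGRAM_API_KEY is set"
```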

Configuration Files

📘

Your Deepgram Account Representative should provide you with download links to customized configuration files. Unless further customized, the example files included in this section are basic and should be used only to spin up a standard proof of concept (POC) deployment or to test your system.

docker-compose.yml

The Docker Compose configuration file makes it possible to spin up the containers using a single command. This makes spinning up a standard POC deployment quick and easy.

Make sure to export your on-prem API key secret in your deployment environment.

export DEEPGRAM_API_KEY=API_KEY_SECRET

version: "3.7"

services:

  # The speech API service.
  api:
    image: quay.io/deepgram/onprem-api:IMAGE_OR_RELEASE_TAG

    # Here we expose the API port to the host machine. The container port
    # (right-hand side) must match the port that the API service is listening
    # on (from its configuration file).
    ports:
      - "8080:8080"

    # Make sure you `export` your on-prem API key secret in your local environment
    environment:  
      DEEPGRAM_API_KEY: "${DEEPGRAM_API_KEY}"

    # Uncomment the following two lines if you are specifying a custom API configuration.
    # volumes:
    # - "/path/to/api.toml:/api.toml:ro,Z"
    # The path on the left of the colon ':' should point to files/directories on the host machine.
    # The path on the right of the colon ':' is an in-container path. It must match the path
    #     specified in the `command` header below.

    # Invoke the API server
    command: -v serve /api.toml
  
  # The speech engine service.
  engine:
    image: quay.io/deepgram/onprem-engine:IMAGE_OR_RELEASE_TAG

    # Change the default runtime.
    runtime: nvidia

    ports:
      - "9991:9991"

    # Make sure you `export` your on-prem API key secret in your local environment
    environment:  
      DEEPGRAM_API_KEY: "${DEEPGRAM_API_KEY}"

    # The path on the left of the colon ':' should point to files/directories on the host machine.
    # The path on the right of the colon ':' is an in-container path.
    volumes:
    # In-container path below must match the one specified in the Engine configuration file. The default location is "/models"
      - "/path/to/models:/models:ro,Z"
    # Uncomment the following line if you are specifying a custom Engine configuration.
      # - "/path/to/engine.toml:/engine.toml:ro,Z"
    # In-container path above must match the path specified in the `command` header below.

    # Invoke the Engine service
    command: -v serve /engine.toml

🖥️

Be sure to replace the IMAGE_OR_RELEASE_TAG placeholder value with the appropriate tag you downloaded in the Pull Deepgram Container Images step.

api.toml and engine.toml

The API and Engine images are configured with TOML configuration files. Versions of the files with sane defaults are bundled with the API and Engine Docker images. If you want to specify custom configurations, please contact your Deepgram Account Representative.
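For orientation only, an Engine configuration is a TOML file along these lines. The section and key names below are illustrative assumptions, not the authoritative schema; rely on the files provided by your Deepgram Account Representative. The point to note is that the models path configured here must match the in-container side of the models volume mount in docker-compose.yml:

```toml
# Illustrative sketch only -- section and key names are assumptions,
# not the real schema shipped with the Engine image.
[model_manager]
# Must match the in-container path of the models volume in
# docker-compose.yml (the default used in this guide is /models).
search_path = "/models"
```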

Adding the License Proxy

For customers deploying Deepgram's on-premises solution in highly available production environments, Deepgram recommends the License Proxy, which is a caching proxy that communicates with the Deepgram-hosted license server to ensure uptime and simplify network security. See the License Proxy guide for more details on the benefits and setup.

📘

If you aren't certain which products your contract includes or if you are interested in adding the License Proxy to your on-premises deployment, please consult a Deepgram Account Representative. To reach one, contact us!

Testing Your Containers

To make sure your Deepgram on-prem deployment is properly configured and running, you will want to run the containers and make a sample request to Deepgram.

Start the Deepgram Containers

Now that your configuration files and AI models are set up in the correct locations for the containers to use, run the containers with Docker Compose:

cd config
docker compose up -d

📘

If you get an error similar to the following, you may not have the minimum NVIDIA driver version required for Deepgram services to run properly. Please see Drivers and Containerization Platforms for instructions on installing/upgrading to the latest driver version.

ERROR: for engine  Cannot start service engine: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'

nvidia-container-cli: initialization error: driver rpc error: timed out: unknown

ERROR: Encountered errors while bringing up the project.

You can then list the running containers with docker ps, and optionally inspect each container's logs to verify its status.

docker ps
# Take note of the "Container ID" for each Deepgram container
docker logs CONTAINER_ID

🖥️

Replace the placeholder CONTAINER_ID with the Container ID of each container whose logs you would like to inspect.

Test Your Deepgram Setup with a Sample Request

Test your environment and container setup with a local file.

  1. Download a sample file from Deepgram (or supply your own file).
    wget https://dpgr.am/bueller.wav
    
  2. Send your audio file to your local Deepgram setup for transcription.
    curl -X POST --data-binary @bueller.wav "http://localhost:8080/v1/listen"
    

📘

If you're using your own file, make sure to replace bueller.wav with the name of your audio file.

You should receive a JSON response with the transcription and associated metadata. Congratulations - your on-premises setup is working!
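To pull just the transcript out of that JSON response, you can save the response to a file (for example, by adding -o response.json to the curl command) and extract the nested field. The trimmed-down response below sketches the shape of Deepgram's standard /v1/listen output; the snippet assumes python3 is available on the deployment machine (with jq installed, `jq -r '.results.channels[0].alternatives[0].transcript' response.json` does the same):

```shell
# A trimmed-down example response in the shape /v1/listen returns
# (a real response also includes metadata, words, and timings).
cat > response.json <<'EOF'
{"results": {"channels": [{"alternatives": [{"transcript": "hello world", "confidence": 0.99}]}]}}
EOF

# Extract results.channels[0].alternatives[0].transcript:
python3 -c 'import json; print(json.load(open("response.json"))["results"]["channels"][0]["alternatives"][0]["transcript"])'
# -> hello world
```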


What's Next

Now that you have a basic Deepgram setup working, take some time to learn about building up to a production-level environment, as well as helpful Deepgram add-on services.