System Maintenance

Periodically, you will update configuration files and models, install product updates for your self-hosted deployment, and rotate licensing credentials. A regularly updated system will be both more secure and more performant than a lagging one.

Updating Models

Deepgram model updates include releases of new architectures, features, and language support, as well as incremental improvements to existing models. Please contact your Deepgram Account Representative to find out whether there are updates available for the models you use.

Docker/Podman

You can download new models into your models/ directory to expose them to your system. Look for a log line like the following in your Engine logs to confirm that a new model has been loaded:

2024-02-26T15:41:40.770498319Z  INFO load_model{path=/models/general.tar}: impeller::model_suppliers::autoload: Inserting model key=AsrKey { name: "general", version: "2023-02-22.3", languages: List(["en", ...]), aliases: {}, tags: [], uuid: 96a295ec-6336-43d5-b1cb-1e48b5e6d9a4, formatted: false, mode: All, architecture: None }

You may also verify that a new model UUID is serving your requests by inspecting the models or model_info fields in the request response metadata.
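For example, after saving a response to disk, you could pull the models field out with a quick grep. The metadata below is a stand-in written to a temp file so the snippet is self-contained; your real responses will contain your own model UUIDs, and the exact response shape may differ slightly:

```shell
# Illustrative response metadata saved from a request; your responses will
# contain your own model UUIDs, and the shape may differ slightly.
cat > /tmp/response.json <<'EOF'
{"metadata":{"models":["96a295ec-6336-43d5-b1cb-1e48b5e6d9a4"]}}
EOF

# Extract the serving model UUIDs to confirm the expected version is in use.
grep -o '"models":\[[^]]*\]' /tmp/response.json
```

In practice you would capture the JSON response from an actual request to your self-hosted API rather than a hand-written file.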

Kubernetes

Model files are typically exposed to Engine Pods via PersistentVolumeClaim backed by a PersistentVolume. See the deepgram-self-hosted Helm chart documentation on Persistent Storage Options for details on updating models in your cluster.
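For reference, such a claim might look like the following sketch. The name, access mode, storage class, and size are all placeholders; the Helm chart's Persistent Storage Options documentation is authoritative for your cluster:

```yaml
# Illustrative PersistentVolumeClaim for sharing model files across Engine
# Pods. Name, storage class, and size are placeholders, not real values.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: deepgram-models
spec:
  accessModes:
    - ReadOnlyMany
  storageClassName: nfs-client
  resources:
    requests:
      storage: 40Gi
```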

Host-Attached Storage

If your storage is attached to your host and supports inotify, no further action is required after a model is downloaded to your models directory. The Engine container detects and auto-loads new models promptly.

This applies both to new model names (e.g. the first time you upgrade to Nova-2) and to new versions of existing models (e.g. updating from version n to version n + 1 of Nova-2). Your requests are automatically served by the newest instance of each model present in your system, unless you override this behavior with the Version query parameter.

NFS

The Engine container relies on inotify events to detect new models. Because NFS does not support inotify, Engine containers whose model directories reside on network-attached storage will not detect new models automatically. You will need to restart your Engine container(s) to load the new models.
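To check whether a models directory sits on NFS, you can inspect its filesystem type. A quick sketch (the MODELS_DIR default below is only a stand-in so the snippet runs anywhere; point it at your actual models directory):

```shell
# Print the filesystem type of the directory holding your models.
# MODELS_DIR is a placeholder default; substitute your real models directory.
MODELS_DIR="${MODELS_DIR:-/tmp}"
stat -f -c %T "$MODELS_DIR"
```

If this prints nfs, autoload will not trigger; after downloading new models, restart Engine with something like `docker compose -f /path/to/docker-compose.yml restart engine` (the `engine` service name is an assumption about your Compose file).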

Installing Product Updates

Deepgram regularly updates the container images for its self-hosted offering. Some updates are recommended, while others are mandatory and, if not installed, will result in Deepgram products ceasing to function.

You can identify the latest self-hosted release in the Deepgram Changelog. Filter by "Self-Hosted", and select the latest release. You can use either the SemVer version tag or the release tag (release-XXXXXX) for each container image.

Docker/Podman

  1. Update the image field for all services in your docker-compose.yml or podman-compose.yml file with the desired tag.

  2. Refresh your container image repository credentials:

    docker login quay.io
    
  3. Restart existing containers. The new image tags in your Compose file will be automatically detected and the necessary container images will be downloaded and deployed.

    # Docker
    docker compose -f /path/to/docker-compose.yml up -d
    # Podman
    podman-compose -f /path/to/podman-compose.yml up -d
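Step 1 might look like the following Compose file excerpt. The image repository paths are assumptions about your deployment, and release-XXXXXX is a placeholder for the tag you take from the Changelog:

```yaml
# docker-compose.yml excerpt: pin each service to the desired release tag.
# Repository paths and the tag below are placeholders, not real values.
services:
  api:
    image: quay.io/deepgram/self-hosted-api:release-XXXXXX
  engine:
    image: quay.io/deepgram/self-hosted-engine:release-XXXXXX
```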
    

Kubernetes

In the deepgram-self-hosted Helm chart, you can specify the container image tag to use with the {api,engine,license-proxy}.image.tag values. See the values documentation for more details.

You can then run helm upgrade to roll these changes out to your cluster.
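For example, a values.yaml excerpt pinning all three images might look like this (the tag is a placeholder, and the key names follow the `{api,engine,license-proxy}.image.tag` values mentioned above; confirm against the chart's values documentation):

```yaml
# values.yaml excerpt: pin API, Engine, and License Proxy image tags.
# "release-XXXXXX" is a placeholder for the tag from the Changelog.
api:
  image:
    tag: release-XXXXXX
engine:
  image:
    tag: release-XXXXXX
license-proxy:
  image:
    tag: release-XXXXXX
```

You could then apply it with something like `helm upgrade <release-name> deepgram/deepgram-self-hosted -f values.yaml`, where the release name and chart reference are assumptions about your installation.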

Updating Configuration Files

Besides the container image tags, there may be other updates to configuration files that are available. Deepgram's self-hosted-resources repository contains up-to-date configuration files for Docker and Podman, as well as the deepgram-self-hosted Helm chart. Consult this repository regularly to see if updates are available.

Docker/Podman

Updated files may include the Compose file and the API/Engine/License Proxy toml configuration files. If you modify your configuration files, you will need to restart your existing containers for changes to take effect.

# Docker
docker compose -f /path/to/docker-compose.yml up -d --force-recreate
# Podman
podman-compose -f /path/to/podman-compose.yml up -d --force-recreate

Kubernetes

The deepgram-self-hosted Helm chart has regular releases, and you should regularly update your Helm installation to use the latest version. The chart Changelog describes changes between versions, and the chart README contains upgrade instructions, as well as migration notes if a specific version contains breaking changes.

Existing Pods will need to be restarted when underlying ConfigMaps are changed. This is handled automatically in version >=0.2.1 of the deepgram-self-hosted Helm chart.
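The usual Helm technique for this is to hash the rendered ConfigMap into a Pod annotation, so that any config change alters the Pod template and triggers a rolling restart. An illustrative template excerpt of that pattern (not the chart's actual source):

```yaml
# Deployment template excerpt (illustrative): hashing the rendered ConfigMap
# into a Pod annotation forces a rolling restart whenever the config changes.
spec:
  template:
    metadata:
      annotations:
        checksum/config: '{{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}'
```

On chart versions older than 0.2.1, you can achieve the same effect manually with `kubectl rollout restart deployment <deployment-name>` after changing configuration (the deployment name depends on your release).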

Managing Deepgram Licenses

You may manage your own Deepgram licensing per the Self Service Licensing & Credentials guide, including rotation and expiration of credentials.

If you have ever been issued a license file for specialized, offline/air-gapped deployments, please contact your Deepgram Account Representative for additional documentation about maintaining your license.

Backing up Deepgram Products

When backing up a Deepgram installation, you should back up all Infrastructure-as-Code artifacts.

  • Docker/Podman
    • Compose files (docker-compose.yml or podman-compose.yml)
    • api.toml
    • engine.toml
    • license-proxy.toml, if necessary
    • models directory
  • Kubernetes
    • Version of deepgram-self-hosted Helm chart that is in use
    • values.yaml configuration file

We highly recommend backing up your entire environment state, if possible. If you are using Docker/Podman, this can be done with a VM snapshot. If you are using Kubernetes, you may consider a cluster backup tool, such as Velero.
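The Docker/Podman artifacts listed above can be bundled into a single archive. A minimal sketch (the stand-in files below only make the snippet self-contained; in practice, point the tar command at your real Compose file, TOML configs, and models directory):

```shell
# Stand-in deployment directory for illustration; substitute your real paths.
mkdir -p /tmp/dg-demo/models
cd /tmp/dg-demo
touch docker-compose.yml api.toml engine.toml models/general.dg

# Bundle all Infrastructure-as-Code artifacts into one timestamped archive.
tar -czf "deepgram-backup-$(date +%Y%m%d).tar.gz" \
    docker-compose.yml api.toml engine.toml models/

# Verify the archive contents.
tar -tzf deepgram-backup-*.tar.gz
```

Store the resulting archive somewhere outside the host being backed up, alongside any license-proxy.toml you use.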


What’s Next

Now that you understand how to maintain your deployed Deepgram self-hosted environment, it's time to take a look at best practices related to autoscaling your system based on demand.