Configure Hardware & Software

The following are sample recommended hardware layouts. These different configurations can overlap on a physical machine: a system prepared for training can also be used for inference and labeling.

Inference
(performing speech analysis on trained speech models)

  • 1 K80 or better NVIDIA GPU with at least 8 GB GPU RAM
  • 4 CPU cores
  • 32 GB RAM
  • 32 GB storage

A system of this sort will typically provide a 50-100x real-time speedup. For higher throughput, please consult a Deepgram sales representative for a customized hardware recommendation.

An Amazon EC2 p2.xlarge instance works well for a baseline deployment. A p3.2xlarge is a very cost-effective way to achieve substantially higher throughput.

Training
(creating new custom models)

  • 1 K80 or better NVIDIA GPU per training process, with at least 8 GB GPU RAM (16 GB GPU RAM recommended)
  • 32 CPU cores
  • 32 GB RAM minimum (64+ GB recommended; actual memory required depends on total training set size)
  • 32 GB storage minimum (128+ GB preferred; more may be required as multiple models are trained)

The recommended GPU is at least one V100 (the p3.2xlarge instance on Amazon EC2).

Sizes do not include training data storage requirements. To train a model, approximately 3x the training data size is required in free storage.

Labeling
(creating new training datasets)

  • 2 CPU cores
  • 4 GB RAM
  • 32 GB storage

Typically, an Amazon EC2 instance like the a1.large or, for small transcription workloads, an a1.medium will suffice.

Sizes do not include training data storage requirements.
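As a rough sanity check, a host intended for inference can be compared against the minimums above. The snippet below is a sketch with illustrative helper names; it checks CPU, RAM, and free storage only, since GPU RAM is best read from nvidia-smi.

```bash
# Compare reported values against the inference minimums above.
# meets_min REPORTED MINIMUM -> succeeds when REPORTED >= MINIMUM.
meets_min() {
  [ "$1" -ge "$2" ]
}

check_host() {
  cores=$(nproc)
  ram_gb=$(awk '/MemTotal/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
  disk_gb=$(df -BG --output=avail / | tail -n 1 | tr -d ' G')

  meets_min "$cores" 4    || echo "warning: fewer than 4 CPU cores ($cores)"
  meets_min "$ram_gb" 32  || echo "warning: less than 32 GB RAM (${ram_gb} GB)"
  meets_min "$disk_gb" 32 || echo "warning: less than 32 GB free storage (${disk_gb} GB)"
}
```

Run it with `check_host`; no output means the three checks passed.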

Deepgram's software is intended to be run using Docker, a containerization technology used to isolate, secure, and deploy applications across many different host environments and operating systems. Containers are more lightweight than virtual machines, with many of the same isolation benefits.

Although Docker can be run from many different host operating systems, we recommend using Ubuntu 18.04 LTS or a similar Linux distribution, as we have tested our products most extensively in these OSes.

For the best scaling experience, or for high availability deployments, we also recommend a multi-host orchestration solution, such as Docker Swarm (recommended) or Kubernetes.

Docker

Ensure that Docker Engine, version 18.06 or later, is installed on all hosts. Make sure your user is in the docker group so that it has sufficient permissions to communicate with the Docker Daemon (system service).

To test that Docker is installed properly, run:

bash
$ docker --version
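Both requirements can also be checked from a script. The helpers below are our own sketch: the version parsing assumes the usual `Docker version X.Y.Z, build …` output format.

```bash
# docker_version_ok "Docker version 19.03.8, build afacb8b" -> succeeds for >= 18.06.
docker_version_ok() {
  ver=$(echo "$1" | sed -n 's/^Docker version \([0-9]*\.[0-9]*\).*/\1/p')
  [ -n "$ver" ] || return 1
  major=${ver%%.*}
  minor=${ver#*.}
  [ "$major" -gt 18 ] || { [ "$major" -eq 18 ] && [ "$minor" -ge 6 ]; }
}

# in_group "adm sudo docker" docker -> succeeds when the group list contains the group.
in_group() {
  echo "$1" | tr ' ' '\n' | grep -qx "$2"
}
```

Usage: `docker_version_ok "$(docker --version)" && in_group "$(id -nG)" docker && echo "docker ready"`.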

If you will be using Docker Swarm, follow Docker's official guide to ensure that swarm mode is enabled on all hosts, that an appropriate number of master nodes have been configured, and that there are no networking issues.

To test your Swarm configuration, run the following from a master node:

bash
$ docker node ls
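The same output can be inspected programmatically. The helper below is our own sketch: it counts lines whose MANAGER STATUS column reads Leader or Reachable, which is how `docker node ls` reports healthy managers.

```bash
# Count manager nodes in `docker node ls` output.
count_managers() {
  grep -cE '(^|[[:space:]])(Leader|Reachable)([[:space:]]|$)'
}
```

Usage: `docker node ls | count_managers`.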

Cache Docker Credentials

Once you are satisfied that Docker is installed and configured correctly, cache your credentials locally by logging in to Docker Hub using the Docker credentials you created earlier:

bash
$ docker login

Once cached, these credentials should not need to be entered again unless you log out (docker logout).

Please be aware of any security concerns surrounding caching credentials.
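By default (when no credential helper is configured), docker login stores a base64-encoded auth entry in ~/.docker/config.json; that file is what to protect, or remove, if cached credentials are a concern. As a sketch, a small helper like this can check whether an auth entry is present:

```bash
# Succeeds when the given Docker config file (default: ~/.docker/config.json)
# contains a cached "auth" entry.
creds_cached() {
  grep -q '"auth":' "${1:-$HOME/.docker/config.json}" 2>/dev/null
}
```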

Firewall

Docker typically takes care of networking concerns, so as long as your firewall is configured to allow Docker (and Docker Swarm, where appropriate), you should have no special concerns.

Deepgram server containers typically listen on port 8080 inside the container.

If you use online licensing (the most common form of licensing for on-premise products), you'll need to permit outbound HTTPS network traffic to license.deepgram.com on port 443.

If you use Docker Hub (recommended), you'll need to allow outbound traffic to Docker's servers on port 443.
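On hosts that use ufw, the outbound rules above could be expressed roughly as follows. This is a sketch to adapt to your own firewall tooling; the swarm ports listed (2377/tcp for cluster management, 7946 for node communication, 4789/udp for overlay networking) are Docker's standard swarm ports and only need to be open between your own hosts.

```bash
# Outbound HTTPS for license.deepgram.com and Docker Hub (both on 443).
sudo ufw allow out to any port 443 proto tcp

# Between swarm hosts only: cluster management, node gossip, overlay networking.
sudo ufw allow 2377/tcp
sudo ufw allow 7946
sudo ufw allow 4789/udp
```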

CUDA

CUDA is NVIDIA's library for interacting with its GPU. Host machines will need to have the latest NVIDIA drivers installed. Drivers are available on NVIDIA's Driver Download site.

To test that the drivers are properly installed, run:

bash
$ nvidia-smi

CUDA support is made available to Docker containers using nvidia-docker, NVIDIA's custom runtime for Docker. To properly install nvidia-docker, please follow the nvidia-docker installation guide.

If you're using Docker Swarm, you must complete some additional configuration steps:

  1. Because selecting the nvidia runtime is not possible at the moment (this is a limitation of Docker's current Compose format/Swarm engine), you must change the default runtime for all containers, which is the currently accepted workaround.

    To change the default runtime, add --default-runtime=nvidia to the dockerd invocation in your system's service file that manages dockerd (e.g., on Ubuntu 18.04, /etc/systemd/system/docker.service).

  2. You must tell Docker that GPU resources exist. Again, this involves modifying the Docker Daemon system service definition. First, on each host, enumerate the GPU UUIDs:

    bash
    $ nvidia-smi -a | grep UUID | awk '{print substr($4,0,12)}'

    Then edit each host's service definition for dockerd to add --node-generic-resource gpu=UUID for each UUID returned by the previous command to the dockerd invocation. If you have multiple GPUs, include this switch multiple times.

  3. You must tell the nvidia-docker runtime to advertise GPU resources to the swarm. To do this, edit the /etc/nvidia-container-runtime/config.toml file and ensure that this line exists near the top of the file:

    toml
    swarm-resource = "DOCKER_RESOURCE_GPU"
  4. After performing any of the above steps, restart the Docker daemon.
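Putting steps 1, 2, and 4 together, the edited service file and restart might look like the following sketch. The unit path matches the Ubuntu 18.04 example above; the gpu= values are placeholders to be replaced with the UUID prefixes printed by the nvidia-smi command in step 2.

```ini
# /etc/systemd/system/docker.service (excerpt; sketch with placeholder UUIDs)
[Service]
ExecStart=/usr/bin/dockerd -H fd:// \
    --default-runtime=nvidia \
    --node-generic-resource gpu=GPU-7a1b2c3 \
    --node-generic-resource gpu=GPU-9d4e5f6
```

After editing, apply the change with `sudo systemctl daemon-reload && sudo systemctl restart docker` (step 4).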

Distributed Filesystem

If you will be operating Deepgram's products in a distributed environment (e.g., Docker Swarm), then you must ensure that required data artifacts and configuration files are available to each Docker container. In non-Swarm environments, this is often solved by Docker volumes, but in a distributed environment, you will have no control over where each container runs, so the artifacts and configuration files must be available on all hosts.

To do this, you can copy the files to all hosts (e.g., rsync), but a more reliable and convenient solution is to use a distributed file system, such as NFS or GlusterFS.
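As a sketch, with NFS each host could mount a shared export that holds the model artifacts and configuration files; the server name and paths below are placeholders.

```bash
# On every host: mount the shared export (server name and paths are placeholders).
sudo mkdir -p /srv/deepgram
sudo mount -t nfs nfs-server.internal:/export/deepgram /srv/deepgram

# To persist across reboots, add an /etc/fstab entry like:
# nfs-server.internal:/export/deepgram  /srv/deepgram  nfs  defaults  0  0
```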

DGTools

Before installing DGTools, you should have the following:

  • Unix operating system
  • NVIDIA GPU(s). To learn more about recommended hardware for training, see Recommended Hardware.
  • CUDA support (nvidia-docker)
  • Access to deepgram/onprem Docker image