The following are sample recommended hardware layouts. These different configurations can overlap on a physical machine: a system prepared for training can also be used for inference and labeling.
Purpose | Sample Recommended Hardware Layout |
---|---|
Inference (performing speech analysis on trained speech models) | A system of this sort will typically provide a 50-100x real-time speedup. For higher throughput, please consult a Deepgram sales representative for a customized hardware recommendation. An Amazon EC2 p2.xlarge instance works well for a baseline deployment; a p3.2xlarge is a very cost-effective way to achieve substantially higher throughput. |
Training (creating new custom models) | The recommended GPU is at least one V100 (the p3.2xlarge instance on Amazon EC2). Sizes do not include training data storage requirements; to train a model, approximately 3x the training data size is required. |
Labeling (creating new training datasets) | Typically, an Amazon EC2 instance like the a1.large or, for small transcription workloads, an a1.medium will suffice. Sizes do not include training data storage requirements. |
Deepgram's software is intended to be run using Docker, a containerization technology used to isolate, secure, and deploy applications across many different host environments and operating systems. Containers are more light-weight than virtual machines, with many of the same isolation benefits.
Although Docker can be run from many different host operating systems, we recommend using Ubuntu 18.04 LTS or a similar Linux distribution, as we have tested our products most extensively in these OSes.
For the best scaling experience, or for high availability deployments, we also recommend a multi-host orchestration solution, such as Docker Swarm (recommended) or Kubernetes.
Ensure that Docker Engine version 18.06 or later is installed on all hosts. Make sure your user is in the `docker` group, so that it has sufficient permissions to communicate with the Docker daemon (system service).
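To check that membership on a typical Linux host, the following sketch lists the user's groups and, if `docker` is missing, prints the usual command to grant it (`usermod` requires root privileges, and the change only takes effect after logging out and back in):

```shell
# Determine the current user portably.
user="${USER:-$(id -un)}"

# Check whether the user is already in the docker group.
if id -nG "$user" | grep -qw docker; then
  echo "user is in the docker group"
else
  echo "run: sudo usermod -aG docker $user  (then log out and back in)"
fi
```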
To test that Docker is installed properly, run:
```bash
docker --version
```
If you will be using Docker Swarm, follow Docker's official guide to ensure that swarm mode is enabled on all hosts, that an appropriate number of master nodes have been configured, and that there are no networking issues.
To test your Swarm configuration, run the following from a master node:
```bash
docker node ls
```
Once you are satisfied that Docker is installed and configured correctly, cache your credentials locally by logging in to Docker Hub using the Docker credentials you created earlier:
```bash
docker login
```
This caches your credentials, so these commands will not need to be executed again unless you log out (`docker logout`).
Please be aware of any security concerns surrounding caching credentials.
Docker typically takes care of networking concerns, so as long as your firewall is configured to allow Docker (and Docker Swarm, where appropriate), you should have no special concerns.
Deepgram server containers typically listen on port `8080` inside the container.
If you use online licensing (the most common form of licensing for on-premise products), you'll need to permit outbound HTTPS network traffic to `license.deepgram.com` on port `443`.
If you use Docker Hub (recommended), you'll need to allow outbound traffic to Docker's servers on port `443`.
The following steps are only applicable if you will be using GPU acceleration.
Support for CUDA 11 is not yet available, but coming soon.
CUDA is NVIDIA's library for interacting with its GPUs. Host machines will need to have the latest NVIDIA drivers installed; drivers are available on NVIDIA's Driver Download site.
To test that the drivers are properly installed, run:
```bash
nvidia-smi
```
CUDA support is made available to Docker containers using `nvidia-docker`, NVIDIA's custom runtime for Docker. To properly install it, please follow the `nvidia-docker` installation guide.
If you're using Docker Swarm, you must complete some additional configuration steps:
Because selecting the `nvidia` runtime is not possible at the moment (a limitation of Docker's current Compose format/Swarm engine), you must change the default runtime for all containers; this is the currently accepted workaround.
To change the default runtime, add `--default-runtime=nvidia` to the `dockerd` invocation in the system service file that manages `dockerd` (e.g., for Ubuntu 18.04, `/etc/systemd/system/docker.service`). Typical best practice for `systemd` is to create an override file, such as `/etc/systemd/system/docker.service.d/override.conf`, rather than editing the primary `docker.service` file directly.
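As a sketch, such an override file might look like the following; the `ExecStart` command shown is the Ubuntu default and may differ on your system (the empty `ExecStart=` line clears the original definition before replacing it, which `systemd` requires for this directive):

```ini
# /etc/systemd/system/docker.service.d/override.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --default-runtime=nvidia -H fd://
```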
You must tell Docker that GPU resources exist. Again, this involves modifying the Docker Daemon system service definition. First, on each host, enumerate the GPU UUIDs:
```bash
nvidia-smi -a | grep UUID | awk '{print substr($4,0,12)}'
```
Then edit each host's service definition for `dockerd`, adding `--node-generic-resource gpu=UUID` to the `dockerd` invocation for each `UUID` returned by the previous command. If you have multiple GPUs, include this switch multiple times.
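For example, a `dockerd` invocation advertising two GPUs (the UUID prefixes below are hypothetical, and the base `ExecStart` command is the Ubuntu default) might look like:

```ini
# Fragment of the dockerd service definition, e.g. in a systemd override file.
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --default-runtime=nvidia -H fd:// \
  --node-generic-resource gpu=GPU-8e5b0a2 \
  --node-generic-resource gpu=GPU-1f3c9d4
```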
You must tell the `nvidia-docker` runtime to advertise GPU resources to the swarm. To do this, edit the `/etc/nvidia-container-runtime/config.toml` file and ensure that this line exists near the top of the file:
```toml
swarm-resource = "DOCKER_RESOURCE_GPU"
```
After performing any of the above steps, restart the Docker daemon (e.g., `sudo systemctl restart docker`; if you edited `systemd` unit files, run `sudo systemctl daemon-reload` first).
If you will be operating Deepgram's products in a distributed environment (e.g., Docker Swarm), then you must ensure that required data artifacts and configuration files are available to each Docker container. In non-Swarm environments, this is often solved by Docker volumes, but in a distributed environment, you will have no control over where each container runs, so the artifacts and configuration files must be available on all hosts.
To do this, you can copy the files to all hosts (e.g., with `rsync`), but a more reliable and convenient solution is to use a distributed file system, such as NFS or GlusterFS.
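As a sketch, an NFS share mounted on every host could be declared in `/etc/fstab` like this (the server name and paths are hypothetical):

```
# Mount a shared Deepgram artifacts directory from an NFS server.
fileserver:/export/deepgram  /srv/deepgram  nfs  defaults  0 0
```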
Before installing DGTools, you should have the following:
- `nvidia-docker`
- the `deepgram/onprem` Docker image