Deployment of on-premise Deepgram products is typically done using a Docker service orchestration technology (Docker Compose or Docker Swarm).
To deploy the Speech API and Speech Engine services using Docker Swarm, you can use the following sample Compose file.
version: "3.7"
services:
  # The Speech API service.
  api:
    image: deepgram/onprem-api:latest
    deploy:
      mode: replicated
      replicas: 1
    # Here we expose the API port to the host machine (or, in the case of
    # Docker Swarm, this will expose it to all nodes in the swarm via the
    # ingress overlay network). The container port (right-hand side) must
    # match the port that the API service is listening on (from its
    # configuration file).
    ports:
      - "8080:8080"
    volumes:
      # The API configuration file needs to be accessible. For Docker Swarm,
      # you will need this path to be accessible to all nodes in the swarm;
      # we recommend using a distributed filesystem in this case.
      - "/path/on/host/api.toml:/api.toml:ro"
    # Invoke the API server, passing it the path (inside the container) to
    # its configuration file.
    command: -v serve /api.toml

  # The Speech Engine service.
  engine:
    image: deepgram/onprem-engine:latest
    deploy:
      mode: replicated
      # Feel free to increase the number of replicas to match your GPU
      # resources.
      replicas: 1
      # Assuming you have correctly configured Docker and nvidia-docker to
      # advertise GPU resources, you can request that each Engine instance
      # use a specific number of GPUs (here, 2).
      resources:
        reservations:
          generic_resources:
            - discrete_resource_spec:
                kind: gpu
                value: 2
    # These configuration files and models need to be accessible. For Docker
    # Swarm, you will need these paths to be accessible to all nodes in the
    # swarm; we recommend using a distributed filesystem in this case.
    #
    # There is no reason the configuration file and models can't be combined
    # into a single volume mount.
    volumes:
      # The Engine configuration file needs to be accessible.
      - "/path/on/host/engine.toml:/engine.toml:ro"
      # The models will also need to be available. The in-container paths
      # must match the paths specified in the Engine configuration file.
      - "/path/on/host/models:/engine:ro"
    # Invoke the Engine service, passing it the path (inside the container)
    # to its configuration file.
    command: -v serve /engine.toml

  # The metrics service.
  metrics-server:
    image: deepgram/metrics-server:latest
    deploy:
      mode: replicated
      replicas: 1
    volumes:
      # The configuration file needs to be accessible.
      - "/path/on/host/metrics-server.toml:/metrics-server.toml:ro"
    # Invoke the metrics server, passing it the path (inside the container)
    # to its configuration file.
    command: /metrics-server.toml

  # Hotpepper, the on-premise human transcription/labeling tool for creating
  # training sets. (Note: naming conventions may mention Dashscript, a
  # previous version of Hotpepper; they are the same tool.)
  dashscript:
    image: deepgram/onprem-dashscript:latest
    deploy:
      mode: replicated
      replicas: 1
    # Here we expose the service port to the host machine. The container
    # port (right-hand side) must match the port that the service is
    # listening on (from its configuration file).
    ports:
      - "8081:80"
    # An admin "super-user" must be present in the database to execute user
    # and dataset management actions through the app. Setting these
    # environment variables ensures that the app starts with at least one.
    environment:
      DASHSCRIPT_ADMIN_USER: "admin"
      DASHSCRIPT_ADMIN_PASSWORD: "admin_pass"
    volumes:
      # The container paths (right side) should align with those used in the
      # Hotpepper config file.
      #
      # Path to the Hotpepper database directory. The name of the database
      # will be pulled from the Hotpepper config file.
      - "/path/to/database:/db"
      # Path to the directory containing input datasets. New datasets are
      # created by adding subdirectories to this folder and placing audio
      # data there. This directory should be structured like so:
      #   /path/to/input/data/
      #   |_ dataset1/
      #      |_ audio1.mp3
      #      |_ audio2.mp3
      #   ...
      - "/path/to/input/data:/datasets"
      # Path to where Hotpepper will save its finalized, packaged datasets.
      - "/path/to/output/data:/packaged"
      # Path to the Hotpepper config file.
      - "/path/to/config/file.toml:/config.toml:ro"
    # Invoke Hotpepper, giving it the path to the config file (in the
    # container).
    command: /config.toml serve
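Note that under Docker Swarm, the generic_resources reservation above is only satisfiable if each GPU node advertises its GPUs to the Swarm scheduler; Swarm also ignores per-service runtime selection, so setting nvidia as the default runtime is a common companion step. Below is a minimal sketch of the relevant /etc/docker/daemon.json entries, illustrating Docker's generic-resources mechanism rather than Deepgram-specific guidance. The GPU UUIDs are placeholders (list yours with nvidia-smi -L), and the resource kind (gpu) must match the discrete_resource_spec kind in the Compose file.

# The GPU UUIDs below are placeholders; substitute the output of `nvidia-smi -L`.
$ cat /etc/docker/daemon.json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "node-generic-resources": [
    "gpu=GPU-aaaaaaaa-1111-2222-3333-444444444444",
    "gpu=GPU-bbbbbbbb-5555-6666-7777-888888888888"
  ]
}
$ sudo systemctl restart docker

Depending on your nvidia-container-runtime version, you may also need to uncomment the swarm-resource line in /etc/nvidia-container-runtime/config.toml so that reserved GPUs are actually exposed inside the container.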
To deploy the Speech API and Speech Engine services using Docker Compose, you can use the following sample Compose file.
version: "2.4"
services:
  # The Speech API service.
  api:
    image: deepgram/onprem-api:latest
    # Here we expose the API port to the host machine. The container port
    # (right-hand side) must match the port that the API service is
    # listening on (from its configuration file).
    ports:
      - "8080:8080"
    volumes:
      # The API configuration file needs to be accessible; this should point
      # to the file on the host machine.
      - "/path/on/host/api.toml:/api.toml:ro"
    # Invoke the API server, passing it the path (inside the container) to
    # its configuration file.
    command: -v serve /api.toml

  # The Speech Engine service.
  engine:
    image: deepgram/onprem-engine:latest
    # Change the default runtime so that the Engine container can access the
    # GPUs.
    runtime: nvidia
    # These configuration files and models need to be accessible; these
    # paths should point to files/directories on the host machine.
    #
    # There is no reason the configuration file and models can't be combined
    # into a single volume mount.
    volumes:
      # The Engine configuration file needs to be accessible.
      - "/path/on/host/engine.toml:/engine.toml:ro"
      # The models will also need to be available. The in-container paths
      # must match the paths specified in the Engine configuration file.
      - "/path/on/host/models:/engine:ro"
    # Invoke the Engine service, passing it the path (inside the container)
    # to its configuration file.
    command: -v serve /engine.toml

  # The metrics service.
  metrics-server:
    image: deepgram/metrics-server:latest
    volumes:
      # The configuration file needs to be accessible.
      - "/path/on/host/metrics-server.toml:/metrics-server.toml:ro"
    # When run with no command-line arguments, the metrics server may be
    # configured using environment variables.
    environment:
      SERVER_ADDRESS: "0.0.0.0:8000"

  # Hotpepper, the on-premise human transcription/labeling tool for creating
  # training sets. (Note: naming conventions may mention Dashscript, a
  # previous version of Hotpepper; they are the same tool.)
  dashscript:
    image: deepgram/onprem-dashscript:latest
    # Here we expose the service port to the host machine. The container
    # port (right-hand side) must match the port that the service is
    # listening on (from its configuration file).
    ports:
      - "8081:80"
    # An admin "super-user" must be present in the database to execute user
    # and dataset management actions through the app. Setting these
    # environment variables ensures that the app starts with at least one.
    environment:
      DASHSCRIPT_ADMIN_USER: "admin"
      DASHSCRIPT_ADMIN_PASSWORD: "admin_pass"
    volumes:
      # The container paths (right side) should align with those used in the
      # Hotpepper config file.
      #
      # Path to the Hotpepper database directory. The name of the database
      # will be pulled from the Hotpepper config file.
      - "/path/to/database:/db"
      # Path to the directory containing input datasets. New datasets are
      # created by adding subdirectories to this folder and placing audio
      # data there. This directory should be structured like so:
      #   /path/to/input/data/
      #   |_ dataset1/
      #      |_ audio1.mp3
      #      |_ audio2.mp3
      #   ...
      - "/path/to/input/data:/datasets"
      # Path to where Hotpepper will save its finalized, packaged datasets.
      - "/path/to/output/data:/packaged"
      # Path to the Hotpepper config file.
      - "/path/to/config/file.toml:/config.toml:ro"
    # Invoke Hotpepper, giving it the path to the config file (in the
    # container).
    command: /config.toml serve
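Before bringing either deployment up, it can be worth confirming that the NVIDIA container runtime works outside of the Deepgram images. The CUDA image tag below is only an example; choose one compatible with your driver version:

$ docker run --rm --runtime=nvidia nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi

If this prints your GPUs, the runtime: nvidia setting in the Compose file should work as expected.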
Before starting the deployment, make sure that the configuration files, models, and data directories referenced in your Compose file exist on the host at the paths you specified, and that Docker is authenticated to a registry that can pull the Deepgram images.
To deploy with Docker Compose, assuming the corresponding Compose file (provided above) is saved to /path/to/compose.yml, run:
$ docker-compose -f /path/to/compose.yml pull
$ docker-compose -f /path/to/compose.yml -p deepgram up -d
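Should you later need to stop the deployment, the matching teardown command uses the same project name:

$ docker-compose -f /path/to/compose.yml -p deepgram down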
To deploy with Docker Swarm, assuming the corresponding Compose file (provided above) is saved to /path/to/compose.yml, run:
$ docker stack deploy -c /path/to/compose.yml --with-registry-auth deepgram
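You can watch the services converge with docker stack services deepgram; to remove the stack later, run:

$ docker stack rm deepgram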
Although generally you should refer to Docker's documentation to learn how to query the state of the Docker ecosystem, some useful commands are as follows:
$ docker-compose ps
$ docker ps
$ docker stats
$ docker stack ps deepgram
$ docker service ps deepgram_api
$ docker service ps deepgram_engine
To ensure that everything is working, you can send audio data end to end through the pipeline. In this example, we assume that your API endpoint is exposed at localhost on port 8080; adjust these values according to your network topology.
If you have local audio data available, run:
$ curl -X POST -T /path/to/audio.wav -H "Expect:" http://localhost:8080/v2/listen
If you permit outgoing connections to the public internet, you can also test on remotely-hosted media:
$ curl -H 'Content-type: application/json' -X POST -d '{"url": "https://deepgram.com/examples/interview_speech-analytics.wav"}' -H "Expect:" http://localhost:8080/v2/listen
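To pull just the transcript text out of the response, you can pipe either request through jq. This sketch assumes the response follows Deepgram's usual JSON schema (results.channels[].alternatives[].transcript); adjust the path if your response differs:

$ curl -s -X POST -T /path/to/audio.wav -H "Expect:" http://localhost:8080/v2/listen | jq '.results.channels[0].alternatives[0].transcript'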
If you are receiving transcripts, congratulations! If not, add a verbose flag (-v) to each curl command and see whether the status code and response body help you identify the error.
If you are still having problems, here are some common tests:
Test | Method |
---|---|
Are the Docker containers running? | `docker ps`, `docker-compose ps`, `docker service ps ...` |
Are there obvious problems in the logs? | `docker logs`, `docker-compose logs`, `docker service logs ...` |
Can you reach the API port? | `nc HOST PORT`, `telnet HOST PORT`, `netstat -tunap` |
Is the Docker network functioning correctly, including domain name resolution? | Most easily tested by attaching a test container to the appropriate Docker network and querying DNS. First, determine which Docker network is appropriate (`docker network ls`) and ensure it is attachable; if it is not, consult the Docker documentation for details on how to configure your Docker networks. Spawn an interactive container (`docker run --rm -it --network=NETWORK alpine`), install a DNS client (`apk add drill`), and use it to query DNS (`drill HOST`). Ensure that the domain name server you are querying (see the drill documentation for details) matches the resolver (if any) in your API configuration. See the example session below. |
If you publish the Speech Engine's port, can you connect to it? | `nc HOST PORT`, `telnet HOST PORT`, as for the API port, using the Engine's published port |
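As a concrete example of the DNS test described in the table above, the following session attaches a throwaway Alpine container to the project's network and resolves the Engine service by name. The network name deepgram_default is an assumption based on the project name used earlier; check yours with docker network ls:

$ docker network ls
$ docker run --rm -it --network=deepgram_default alpine
/ # apk add drill
/ # drill engine
/ # exit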