Troubleshooting

If you encounter any challenges while deploying or maintaining your Deepgram self-hosted services on Kubernetes, please consult this guide.

Kubernetes Troubleshooting

Use this checklist to validate the health of your Kubernetes cluster running Deepgram self-hosted services.

Deepgram Helm Chart

  • Check the version of the Helm chart for Deepgram that you’re currently running on your Kubernetes cluster with helm list.
  • If the Deepgram engine Pod is not starting up successfully, you may need to increase the startupProbe values (see the example after this list)
    • Modify the periodSeconds and failureThreshold values based on your tolerances
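
For reference, here is a rough sketch of both checks. The release name (deepgram), chart name (deepgram/deepgram-self-hosted), and the engine.startupProbe.* value keys are illustrative assumptions; confirm them against your own installation and the chart’s values.yaml before running anything.

    # Show the installed Deepgram release and its chart version
    helm list --namespace dg-self-hosted

    # Loosen the engine startup probe so the Pod has more time to load models
    # (key paths below are assumed -- verify them in the chart's values.yaml)
    helm upgrade deepgram deepgram/deepgram-self-hosted \
      --namespace dg-self-hosted \
      --reuse-values \
      --set engine.startupProbe.periodSeconds=30 \
      --set engine.startupProbe.failureThreshold=60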

Storage

  • Ensure you don’t have too many files in your cloud filesystem for Deepgram model file storage (.dg files)
    • ⚠️ Having too many model files can slow down the engine Pod startup significantly! We recommend testing your self-hosted setup with a handful of model files and progressively adding more as needed, to ensure that the engine is able to start up in a timely manner (see the sketch after this list).
  • Check the metrics on your cloud storage volume (e.g., Amazon EFS) to see if there is heavy utilization
    • This is not necessarily an issue in and of itself, but rather a symptom indicating that something else needs attention
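
If you want a quick count of how many model files the engine has to load at startup, you can list the .dg files on the models volume from inside the engine Pod. This is a sketch only: the Pod name and the /models mount path are placeholders, and it assumes the engine image includes a shell, so substitute the values from your own deployment.

    # Count .dg model files on the volume mounted into the engine Pod
    # (replace <engine-pod-name> and /models with your actual Pod name and mount path)
    kubectl exec --namespace dg-self-hosted <engine-pod-name> -- \
      sh -c 'ls /models/*.dg | wc -l'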

Infrastructure

  • If the engine Pod is not starting up successfully, check the Pod logs for any errors.
    • Command: kubectl logs <pod-name>
  • Look at the Pod events and see if Kubernetes (kubelet) is killing the Pod before it has a chance to startup fully
    • Command: kubectl describe pod <pod-name>
  • Ensure that the engine Pod is scheduled on a compatible GPU node (see the sketch after this list).
    • Command: kubectl describe pod <pod-name>
  • Run a test API call against the Deepgram API server
    • Command: kubectl run --namespace=dg-self-hosted --rm --stdin --tty --image=mcr.microsoft.com/dotnet/sdk --command test-client -- pwsh -Command Invoke-RestMethod -Uri 'http://deepgram-api-external.dg-self-hosted.svc.cluster.local:8080/models'
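
To double-check the GPU scheduling item above, you can see which node the engine Pod landed on and whether that node advertises NVIDIA GPU capacity. This assumes the standard nvidia.com/gpu resource name exposed by the NVIDIA device plugin.

    # Show which node the engine Pod was scheduled on
    kubectl get pod <pod-name> --namespace dg-self-hosted -o wide

    # Confirm that node actually advertises NVIDIA GPU capacity
    kubectl describe node <node-name> | grep -i 'nvidia.com/gpu'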

AI Troubleshooting

To simplify the process of identifying root causes and resolutions for Deepgram services running on Kubernetes, you can use an AI client with Model Context Protocol (MCP) support.

Prerequisites

General Prompt Guidelines

  • Keep in mind that each AI client application has its own built-in prompt templates, and may behave differently.
  • We generally advise against enabling auto-approval for MCP tools that apply changes.
  • Enabling auto-approval for MCP tools that perform read-only operations is generally safe.
  • Use safeguard statements in your prompts, such as “do not make any changes”, in case your AI application is prone to attempting to “fix” things.

Example Troubleshooting Prompts

Once you’ve finished setting up your AI MCP client, you can use the following prompts to help identify issues in your Kubernetes cluster. Feel free to adapt any of these prompts to your specific environment, such as updating the Kubernetes namespace you’ve deployed to or providing additional, relevant context.

Check if you’re running the latest version of the Deepgram Helm chart:

Don’t make any changes to the kubernetes cluster. Check if I am running the latest version of the Deepgram self-hosted Helm chart. Use the currently selected context.

Make sure all the essential Deepgram pods exist:

Does my dg-self-hosted namespace have at least one API server, one engine, and one license proxy pod running?

Make sure essential Deepgram pods are running:

Are there any pods in the dg-self-hosted k8s namespace that are not running properly?

Check to see if the “engine” pod is being killed by the kubelet startup probe:

Get the pod details for the engine pod in the dg-self-hosted k8s namespace. Check to see if its startup probe is failing repeatedly.

Read logs from the Deepgram engine pod:

Check the logs for the Deepgram engine pod in the dg-self-hosted k8s namespace and see if there are any notable warnings or errors.

Ensure the Deepgram engine pod is scheduled on a system with at least a single NVIDIA GPU:

Make sure that the Deepgram engine pod on the kubernetes cluster is not scheduled on a node that has a fractional GPU. Do not make any changes to the cluster. Use the currently selected k8s context.