Troubleshooting
If you encounter any challenges while deploying or maintaining your Deepgram self-hosted services on Kubernetes, please consult this guide.
Kubernetes Troubleshooting
Use this checklist to validate the health of your Kubernetes cluster running Deepgram self-hosted services.
Deepgram Helm Chart
- Check the version of the Deepgram Helm chart that you're currently running on your Kubernetes cluster with `helm list`.
- Ensure that you are running the latest version of the Deepgram Helm chart.
- If the Deepgram `engine` Pod is not starting up successfully, you may need to increase the `startupProbe` values (see the example after this list).
  - Modify the `periodSeconds` and `failureThreshold` values based on your tolerances.
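As a rough sketch of both checks, the commands below list the installed release, compare it against the newest chart version in your configured Helm repo, and raise the startup probe tolerances with a values override. The release name, repo alias, chart name, and the `engine.startupProbe.*` value paths are assumptions; confirm the exact names against your own installation and the chart's `values.yaml`.

```bash
# Show the chart version currently installed (release name and namespace assumed).
helm list --namespace dg-self-hosted

# Refresh the repo index and list available chart versions
# ("deepgram" repo alias and chart name are assumptions).
helm repo update
helm search repo deepgram/deepgram-self-hosted --versions | head

# Hypothetical values override loosening the engine startup probe;
# the exact value paths depend on the chart's values.yaml.
cat > probe-overrides.yaml <<'EOF'
engine:
  startupProbe:
    periodSeconds: 10
    failureThreshold: 60
EOF

helm upgrade deepgram-self-hosted deepgram/deepgram-self-hosted \
  --namespace dg-self-hosted \
  --reuse-values \
  -f probe-overrides.yaml
```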
Storage
- Ensure you don't have too many files in your cloud filesystem for Deepgram model file storage (`.dg` files).
  - ⚠️ Having too many model files can slow down the `engine` pod startup significantly! We recommend testing your self-hosted setup with a handful of model files, and progressively adding more if needed, to ensure that the `engine` is able to start up in a timely manner. See the example after this list for a quick way to count the model files in use.
- Check the metrics on your cloud storage volume (e.g. Amazon EFS) to see if there is heavy utilization.
  - This is not necessarily an issue in and of itself, but rather a symptom indicating that something else needs attention.
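If you want a quick count of the model files the `engine` container can see, something like the following works; the workload name (`deploy/deepgram-engine`), the `/models` mount path, and the presence of a shell in the image are all assumptions to adjust for your deployment.

```bash
# Count the .dg model files on the mounted model volume.
kubectl exec --namespace dg-self-hosted deploy/deepgram-engine -- \
  sh -c 'ls -1 /models/*.dg | wc -l'

# List the files with sizes to spot unexpectedly large or stale models.
kubectl exec --namespace dg-self-hosted deploy/deepgram-engine -- \
  sh -c 'ls -lh /models/*.dg'
```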
Infrastructure
- If the `engine` Pod is not starting up successfully, check the Pod logs for any errors.
  - Command: `kubectl logs pod/<pod-name>`
- Look at the Pod events and see if Kubernetes (kubelet) is killing the Pod before it has a chance to start up fully.
  - Command: `kubectl describe pod <pod-name>`
- Ensure that the `engine` Pod is scheduled on a compatible GPU node (see the example after this list).
  - Command: `kubectl describe pod <pod-name>`
- Run a test API call against the Deepgram API server.
  - Command: `kubectl run --namespace=dg-self-hosted --rm --stdin --tty --image=mcr.microsoft.com/dotnet/sdk --command test-client -- pwsh -Command Invoke-RestMethod -Uri 'http://deepgram-api-external.dg-self-hosted.svc.cluster.local:8080/models'`
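The sketch below shows one way to confirm where the `engine` Pod was scheduled and whether that node advertises NVIDIA GPUs, along with a lighter-weight connectivity check that uses a small `curl` image instead of the .NET SDK image. Pod, namespace, and service names mirror the commands above; `<engine-pod-name>` is a placeholder to replace with your actual Pod name.

```bash
# Find the node the engine Pod was scheduled on.
NODE=$(kubectl get pod <engine-pod-name> --namespace dg-self-hosted \
  -o jsonpath='{.spec.nodeName}')

# Check whether that node exposes NVIDIA GPUs to the scheduler.
kubectl describe node "$NODE" | grep -i 'nvidia.com/gpu'

# Lighter-weight test call against the Deepgram API service from a throwaway pod.
kubectl run curl-test --namespace dg-self-hosted --rm --stdin --tty \
  --image=curlimages/curl --restart=Never -- \
  curl -s http://deepgram-api-external.dg-self-hosted.svc.cluster.local:8080/models
```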
AI Troubleshooting
To simplify the process of identifying root causes and resolutions for Deepgram services running on Kubernetes, you can use an AI client with Model Context Protocol (MCP) support.
Prerequisites
- Install your AI client of choice. Here are a few suggestions:
- Log in to the app, or set up a connection to a Large Language Model (LLM) service that supports tool calling. Deepgram has tested Claude 3.5 Haiku, a low-cost model.
- Configure the Kubernetes MCP server in your AI client.
- Ensure you have the `kubectl` CLI installed locally.
- Ensure you're authenticated and have selected the correct Kubernetes cluster context from your local `kubeconfig.yml` file (see the example after this list).
  - Command: `kubectl config get-contexts; kubectl config use-context <NAME>`
  - Alternative: Use `kubectx` to list and switch contexts more easily.
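Before pointing an MCP server at the cluster, it can help to confirm locally that `kubectl` is using the intended context and can reach the Deepgram namespace (assumed here to be `dg-self-hosted`, as in the rest of this guide).

```bash
# Confirm the active context, then verify the Deepgram namespace is reachable.
kubectl config current-context
kubectl get pods --namespace dg-self-hosted
```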
General Prompt Guidelines
- Keep in mind that each AI client application has its own built-in prompt templates, and may behave differently.
- We generally advise against enabling auto-approval for MCP tools that apply changes.
- Enabling auto-approval for MCP tools that perform read-only operations is generally safe.
- Use safeguard statements in your prompts, such as "do not make any changes", in case your AI application is prone to attempting to "fix" things.
Example Troubleshooting Prompts
Once you've finished setting up your AI MCP client, you can use the following prompts to help identify issues in your Kubernetes cluster. Feel free to adapt any of these prompts to your specific environment, such as updating the Kubernetes namespace you've deployed to, or providing additional, relevant context.
Check if you're running the latest version of the Deepgram Helm chart:
Don't make any changes to the kubernetes cluster. Check if I am running the latest version of the Deepgram self-hosted Helm chart. Use the currently selected context.
Make sure all the essential Deepgram pods exist:
Does my dg-self-hosted namespace have at least one API server, one engine, and one license proxy pod running?
Make sure essential Deepgram pods are running:
Are there any pods in the dg-self-hosted k8s namespace that are not running properly?
Check to see if the `engine` pod is being killed by the kubelet startup probe:
Get the pod details for the engine pod in the dg-self-hosted k8s namespace. Check to see if its startup probe is failing repeatedly.
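If you'd rather confirm this manually without an AI client, the Pod's event stream shows startup probe failures directly; `<engine-pod-name>` is a placeholder for your actual Pod name.

```bash
# Repeated "Unhealthy ... Startup probe failed" events suggest the probe
# thresholds are too tight for your hardware.
kubectl get events --namespace dg-self-hosted \
  --field-selector involvedObject.kind=Pod,involvedObject.name=<engine-pod-name> \
  --sort-by=.lastTimestamp
```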
Read logs from the Deepgram engine pod:
Check the logs for the Deepgram engine pod in the dg-self-hosted k8s namespace and see if there are any notable warnings or errors.
Ensure the Deepgram engine pod is scheduled on a system with at least a single NVIDIA GPU:
Make sure that the Deepgram engine pod on the kubernetes cluster is not scheduled on a node that has a fractional GPU. Do not make any changes to the cluster. Use the currently selected k8s context.