Kubernetes

Troubleshooting

If you encounter any challenges while deploying or maintaining your Deepgram self-hosted services on Kubernetes, please consult this guide.

Kubernetes Troubleshooting

Use this checklist to validate the health of your Kubernetes cluster running Deepgram self-hosted services.

Deepgram Helm Chart

  • Check which version of the Deepgram Helm chart is running on your Kubernetes cluster with helm list.
    • Ensure that you are running the latest version of the Deepgram Helm chart.
  • If the Deepgram engine Pod is not starting up successfully, you may need to increase the startupProbe values.
    • Modify the periodSeconds and failureThreshold values based on your tolerances.
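
When choosing new startupProbe values, it can help to work out the total time they allow. In Kubernetes, a container has roughly periodSeconds × failureThreshold seconds (after any initialDelaySeconds) to pass its startup probe before the kubelet restarts it. The sketch below computes that budget; the function name is hypothetical and the numbers are illustrative, not Deepgram defaults.

```shell
# Hypothetical helper: how long a container may take to start before the
# kubelet gives up on its startup probe.
# Usage: startup_budget_seconds <periodSeconds> <failureThreshold> [initialDelaySeconds]
startup_budget_seconds() {
  local period="$1" threshold="$2" delay="${3:-0}"
  echo $(( delay + period * threshold ))
}

# Illustrative values only; these are not Deepgram defaults.
startup_budget_seconds 10 30   # prints 300
```

If the engine regularly needs longer than this budget to load its models, raising either value extends the window.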

Storage

  • Ensure you don’t have too many files in your cloud filesystem for Deepgram model file storage (.dg files).
    • ⚠️ Having too many model files can slow down engine Pod startup significantly! We recommend testing your self-hosted setup with a handful of model files and progressively adding more if needed, to ensure that the engine can start up in a timely manner.
  • Check the metrics on your cloud storage volume (e.g., Amazon EFS) to see if there is heavy utilization.
    • This is not necessarily an issue in and of itself, but rather a symptom indicating that something else needs attention.
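
To see how many model files the engine has to deal with, you can count the .dg files wherever the model volume is mounted. This is a minimal sketch; the function name is hypothetical, and the directory you pass is an assumption that depends on how your storage is mounted.

```shell
# Hypothetical helper: count Deepgram model files (*.dg) under a directory.
# Usage: count_model_files <directory>
count_model_files() {
  # wc pads its output on some platforms, so strip the spaces.
  find "$1" -type f -name '*.dg' | wc -l | tr -d ' '
}
```

Run it as count_model_files followed by the path to your models directory, from any machine or Pod that has the volume mounted.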

Infrastructure

  • If the engine Pod is not starting up successfully, check the Pod logs for any errors.
    • Command: kubectl logs <pod-name>
  • Look at the Pod events and see if Kubernetes (kubelet) is killing the Pod before it has a chance to start up fully.
    • Command: kubectl describe pod <pod-name>
  • Ensure that the engine Pod is scheduled on a compatible GPU node.
    • Command: kubectl describe pod <pod-name>
  • Run a test API call against the Deepgram API server
    • Command: kubectl run --namespace=dg-self-hosted --rm --stdin --tty --image=mcr.microsoft.com/dotnet/sdk --command test-client -- pwsh -Command Invoke-RestMethod -Uri 'http://deepgram-api-external.dg-self-hosted.svc.cluster.local:8080/models'
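
The GPU scheduling check can also be done without reading the full describe output. The sketch below looks up the node a Pod landed on and prints how many GPUs that node reports as allocatable. The function name is hypothetical, and it assumes GPUs are advertised under the nvidia.com/gpu resource name, which is what the NVIDIA device plugin typically uses.

```shell
# Hypothetical helper: print the node a Pod is scheduled on and the number
# of GPUs that node advertises as allocatable.
# Assumption: GPUs are exposed under the "nvidia.com/gpu" resource name.
# Usage: pod_gpu_report <pod-name> <namespace>
pod_gpu_report() {
  local pod="$1" ns="$2" node gpus
  node="$(kubectl get pod "$pod" --namespace "$ns" -o jsonpath='{.spec.nodeName}')"
  if [ -z "$node" ]; then
    echo "Pod $pod is not scheduled on any node yet" >&2
    return 1
  fi
  # Dots inside the resource name must be escaped in a jsonpath expression.
  gpus="$(kubectl get node "$node" -o jsonpath='{.status.allocatable.nvidia\.com/gpu}')"
  echo "node=$node allocatable_gpus=${gpus:-0}"
}
```

A result of allocatable_gpus=0 suggests the Pod is on a node without a usable GPU, or that the device plugin is not running on that node.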

AI Troubleshooting

To simplify the process of identifying root causes and resolutions for Deepgram services running on Kubernetes, you can use an AI client with Model Context Protocol (MCP) support.

Prerequisites

  • Install your AI client of choice. Here are a few suggestions:
    • Claude Desktop
    • Cline Extension for VSCode
    • Continue.dev Extension for VSCode
    • OpenCode
  • Log in to the app, or set up a connection to a large language model (LLM) service that supports tool calling, such as one of the services below. Deepgram has tested Claude 3.5 Haiku, a low-cost model.
    • Amazon Bedrock
    • OpenAI ChatGPT
    • Anthropic Claude
    • OpenRouter
    • Google Gemini AI Studio
  • Configure the Kubernetes MCP server in your AI client
  • Ensure you have the kubectl CLI installed locally
  • Ensure you’re authenticated and have selected the correct Kubernetes cluster context from your local kubeconfig file.
    • Command: kubectl config get-contexts; kubectl config use-context <NAME>
    • Alternative: Use kubectx to list and switch contexts more easily
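
Because the AI client acts on whichever context is active, a small guard can confirm the context before you start a session. This is a sketch only; require_context is a hypothetical name, and the expected context name is whatever your kubeconfig uses.

```shell
# Hypothetical guard: succeed only if the active kubectl context matches
# the one you expect.
# Usage: require_context <expected-context-name>
require_context() {
  local expected="$1" current
  current="$(kubectl config current-context)" || return 1
  if [ "$current" != "$expected" ]; then
    echo "Active context is '$current', expected '$expected'" >&2
    return 1
  fi
  echo "Using context: $current"
}
```

If the guard fails, switch with kubectl config use-context <NAME> and run it again.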

General Prompt Guidelines

  • Keep in mind that each AI client application has its own built-in prompt templates, and may behave differently.
  • We generally advise against enabling auto-approval for MCP tools that apply changes.
  • Enabling auto-approval for MCP tools that perform read-only operations is generally safe.
  • Use safeguard statements in your prompts, such as “do not make any changes”, in case your AI application is prone to attempting to “fix” things.
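
Prompt safeguards rely on the model following instructions, so it may also be worth knowing what the credentials in your active context are able to change. The sketch below uses kubectl auth can-i, which prints yes or no for a given verb and resource. The function name and the choice of verbs are assumptions for illustration.

```shell
# Hypothetical check: report which mutating verbs the current credentials
# allow on Pods in a namespace.
# Usage: report_write_access <namespace>
report_write_access() {
  local ns="$1" verb answer
  for verb in create update patch delete; do
    # can-i exits non-zero when the answer is "no", so tolerate that here.
    answer="$(kubectl auth can-i "$verb" pods --namespace "$ns" 2>/dev/null || true)"
    echo "$verb pods: ${answer:-unknown}"
  done
}
```

If every line reports no, the AI client cannot modify Pods in that namespace through this context, regardless of what it attempts.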

Example Troubleshooting Prompts

Once you’ve finished setting up your AI MCP client, you can use the following prompts to help identify issues in your Kubernetes cluster. Feel free to adapt any of these prompts to your specific environment, such as by updating the Kubernetes namespace you’ve deployed to or providing additional, relevant context.

Check if you’re running the latest version of the Deepgram Helm chart:

Don’t make any changes to the kubernetes cluster. Check if I am running the latest version of the Deepgram self-hosted Helm chart. Use the currently selected context.

Make sure all the essential Deepgram pods exist:

Does my dg-self-hosted namespace have at least one API server, one engine, and one license proxy pod running?

Make sure essential Deepgram pods are running:

Are there any pods in the dg-self-hosted k8s namespace that are not running properly?

Check to see if the “engine” pod is being killed by the kubelet startup probe:

Get the pod details for the engine pod in the dg-self-hosted k8s namespace. Check to see if its startup probe is failing repeatedly.

Read logs from the Deepgram engine pod:

Check the logs for the Deepgram engine pod in the dg-self-hosted k8s namespace and see if there are any notable warnings or errors.

Ensure the Deepgram engine pod is scheduled on a system with at least a single NVIDIA GPU:

Make sure that the Deepgram engine pod on the kubernetes cluster is not scheduled on a node that has a fractional GPU. Do not make any changes to the cluster. Use the currently selected k8s context.
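
To cross-check what the AI client reports, a couple of the prompts above have direct kubectl equivalents. This is a minimal sketch with hypothetical function names. It assumes probe failures are recorded as events with the reason Unhealthy, which is how the kubelet normally reports them.

```shell
# Pods in a namespace whose phase is not Running.
# Note: a crash-looping Pod can still report the Running phase, and
# completed Pods (phase Succeeded) will also be listed.
# Usage: pods_not_running <namespace>
pods_not_running() {
  kubectl get pods --namespace "$1" "--field-selector=status.phase!=Running"
}

# Probe failures (startup, readiness, liveness) recorded for one Pod.
# Usage: probe_failures <namespace> <pod-name>
probe_failures() {
  kubectl get events --namespace "$1" \
    --field-selector "involvedObject.name=$2,reason=Unhealthy"
}
```

For example, probe_failures dg-self-hosted <pod-name> lists the failed probe events for that Pod, which should agree with what the startup probe prompt returns.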