Microsoft Azure
With Docker/Podman
Deploying Deepgram on Microsoft Azure requires some preparation. In this section, you will learn how to provision a virtual machine (VM) on which you will deploy Deepgram products. You will perform some of these steps in the Azure portal and some in your local terminal.
You may have to request an Azure quota adjustment before you can access Azure VM sizes that are powered by NVIDIA GPUs. As described in our hardware specifications, GPU-powered inference with Deepgram requires an NVIDIA GPU.
You can submit a quota request in the Azure portal. Search for Subscriptions, and select the desired subscription. In the navigation menu that appears, scroll to Settings and select Usage + quotas. In the top menu bar, select New Quota Request and fill out a request for your desired region and VM size. See Azure support documentation for up-to-date details on which VM sizes are powered by NVIDIA GPUs.
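Before filing a quota request, it can help to see how much headroom a region already has. The following is a minimal sketch, not an official tool: it assumes you have saved the output of the Azure CLI command `az vm list-usage --location <region> --output json` and that each row carries `localName`, `currentValue`, and `limit` fields. The helper name `gpu_quota_headroom` and the sample rows are hypothetical and exist only for illustration.

```python
import json
from typing import List, Tuple


def gpu_quota_headroom(usage_json: str, family_keyword: str) -> List[Tuple[str, int]]:
    """Return (quota name, remaining units) for rows whose name contains family_keyword.

    Assumes usage_json is a JSON list of objects with "localName",
    "currentValue", and "limit" keys (an assumption about the CLI output shape).
    """
    rows = json.loads(usage_json)
    result = []
    for row in rows:
        name = row.get("localName", "")
        if family_keyword.lower() in name.lower():
            # Values may arrive as strings or numbers; int() handles both.
            remaining = int(row["limit"]) - int(row["currentValue"])
            result.append((name, remaining))
    return result


# Hypothetical sample data, for illustration only.
SAMPLE_USAGE = json.dumps([
    {"localName": "Standard NCASv3_T4 Family vCPUs", "currentValue": "0", "limit": "0"},
    {"localName": "Standard NCSv3 Family vCPUs", "currentValue": "6", "limit": "24"},
    {"localName": "Total Regional vCPUs", "currentValue": "6", "limit": "100"},
])

if __name__ == "__main__":
    for name, remaining in gpu_quota_headroom(SAMPLE_USAGE, "T4"):
        print(f"{name}: {remaining} remaining")
```

A remaining value of zero for the GPU family you want suggests that a quota request is needed before the VM can be created.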
Make sure you have completed the requirements in the Self-Hosted Introduction.
GPU availability has been extremely limited across cloud providers, including Azure. You may need to contact Microsoft directly for access if you are not able to obtain a spot GPU instance.
To begin your self-hosted installation with Azure as your cloud service provider, you need to create a VM instance.
In the Azure portal, search for and select the Virtual Machines service. Then, in the top menu bar, click Create, then click Azure virtual machine.
The Azure virtual machine creation wizard includes multiple high-level groups/tabs. We will step through each.
For the Project Details wizard step, select your desired subscription to bill for this VM instance, and select or create a resource group. See Azure’s official documentation for more details on resource groups.
For the Instance details wizard step:
Type deepgram-self-hosted, or another appropriate description, for the VM name.
Choose your region, availability options, and security type according to your use case.
Select an image from our list of recommended Linux distributions. For VM architecture, make sure to choose the x64 version, as described in Architecture.
“Run with Azure Spot discount” is not recommended for services that need to be highly available.
For the size, click “See all sizes” and open the “N-Series” dropdown for Azure’s GPU-powered VMs. Select a size that meets Deepgram’s minimum required hardware specifications.
Make sure to select a VM size that is powered by an NVIDIA GPU, such as the NCv3 series, NCasT4_v3 series, or NC A100 v4 series. Some Azure VM types, such as the NVv4 series, are powered by AMD GPUs. These will not work with Deepgram services at this time.
See Azure support documentation for up-to-date details on which VM sizes are powered by NVIDIA GPUs.
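The distinction between NVIDIA-powered and AMD-powered sizes can be sketched as a small lookup. This is an illustrative helper only: `gpu_vendor` is a hypothetical name, and the patterns assume Azure's usual size naming for the four series mentioned above (for example `Standard_NC4as_T4_v3` or `Standard_NV4as_v4`). It does not cover every GPU series Azure offers, so treat "unknown" as a prompt to check the Azure documentation.

```python
import re

# Patterns for the series named in this guide (assumed naming, not an Azure API).
NVIDIA_PATTERNS = [
    re.compile(r"^Standard_NC\d+r?s_v3$", re.IGNORECASE),       # NCv3 series
    re.compile(r"^Standard_NC\d+as_T4_v3$", re.IGNORECASE),     # NCasT4_v3 series
    re.compile(r"^Standard_NC\d+ads_A100_v4$", re.IGNORECASE),  # NC A100 v4 series
]
AMD_PATTERNS = [
    re.compile(r"^Standard_NV\d+as_v4$", re.IGNORECASE),        # NVv4 series
]


def gpu_vendor(size_name: str) -> str:
    """Classify a VM size name as "nvidia", "amd", or "unknown"."""
    if any(p.match(size_name) for p in NVIDIA_PATTERNS):
        return "nvidia"
    if any(p.match(size_name) for p in AMD_PATTERNS):
        return "amd"
    return "unknown"


if __name__ == "__main__":
    for size in ("Standard_NC4as_T4_v3", "Standard_NV4as_v4", "Standard_D4s_v3"):
        print(size, "->", gpu_vendor(size))
```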
For the Administrator account wizard step, we recommend using an SSH public key for remote administration of the virtual machine. Fill out the username, SSH public key source, and other fields as needed.
As of Q4 2023, Azure only accepts RSA SSH keys. If you have an existing SSH key that is not RSA, you will need to create a new one.
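To check whether an existing key is RSA, you can look at the key type at the start of its public key file. The sketch below assumes the standard OpenSSH `.pub` layout (key type, base64 data, optional comment). The helper names, the key path, and the comment string are hypothetical; the `ssh-keygen` flags used (`-t`, `-b`, `-f`, `-C`) are standard OpenSSH options.

```python
from typing import List


def ssh_key_type(public_key_line: str) -> str:
    """Return the key type field (for example "ssh-rsa") of an OpenSSH public key line."""
    fields = public_key_line.strip().split()
    if len(fields) < 2:
        raise ValueError("not an OpenSSH public key line")
    return fields[0]


def is_rsa_public_key(public_key_line: str) -> bool:
    """True if the public key line declares an RSA key."""
    return ssh_key_type(public_key_line) == "ssh-rsa"


def rsa_keygen_command(key_path: str, bits: int = 4096,
                       comment: str = "deepgram-self-hosted") -> List[str]:
    """Build an ssh-keygen invocation that creates a new RSA key pair at key_path."""
    return ["ssh-keygen", "-t", "rsa", "-b", str(bits), "-f", key_path, "-C", comment]


if __name__ == "__main__":
    # Truncated, made-up key material used only to show the check.
    print(is_rsa_public_key("ssh-ed25519 AAAAC3Nza user@laptop"))
    print(" ".join(rsa_keygen_command("azure_deepgram_rsa")))
```

If the check returns False for your current key, running the printed `ssh-keygen` command produces a new RSA key pair whose `.pub` file can be supplied to the wizard.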
For the Inbound port rules wizard step, you must allow inbound SSH (port 22) so that you can administer the VM remotely with the account configured in the previous step. If you want the VM to receive requests from the public internet, also check the box to allow HTTPS traffic.
Complete all other fields according to your use case.
Once you have reviewed all relevant details, click “Create”. If you asked Azure to generate a new SSH key pair for you, it will prompt you to download the private key.
Azure will display the provisioning status of your VM. Once available, click “Go to resource” to view your VM details.
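For repeatable deployments, the main wizard choices can also be expressed as a single Azure CLI call. The sketch below only builds and prints the command; it does not run it. It assumes the Azure CLI is installed and logged in, and every value shown (resource group, image alias, size, username, key path) is a placeholder to replace with your own. Inbound port rules and the other wizard fields are not covered here.

```python
import shlex
from typing import List, Optional


def az_vm_create_command(resource_group: str, name: str, image: str, size: str,
                         admin_username: str, ssh_public_key_path: str,
                         location: Optional[str] = None) -> List[str]:
    """Build an `az vm create` argument list mirroring the wizard fields above."""
    cmd = [
        "az", "vm", "create",
        "--resource-group", resource_group,
        "--name", name,
        "--image", image,
        "--size", size,
        "--admin-username", admin_username,
        "--ssh-key-values", ssh_public_key_path,
    ]
    if location:
        cmd += ["--location", location]
    return cmd


if __name__ == "__main__":
    # All values below are placeholders, not recommendations.
    cmd = az_vm_create_command(
        resource_group="my-resource-group",
        name="deepgram-self-hosted",
        image="Ubuntu2204",
        size="Standard_NC4as_T4_v3",
        admin_username="azureuser",
        ssh_public_key_path="azure_deepgram_rsa.pub",
        location="eastus",
    )
    print(shlex.join(cmd))
```

Printing the command first makes it easy to review each value against the wizard steps before running it in a terminal.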
To complete the rest of the installation, including configuring your environment and transferring files between your local computer and your Azure VM instance, you must connect to the VM instance that you launched.
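A quick way to confirm that the inbound SSH rule works, and to prepare the connection and file-transfer commands, is sketched below. The helper names are hypothetical, and the address, username, and file paths are placeholders (203.0.113.10 is a reserved documentation address). Substitute the public IP shown on your VM's details page and the key you configured in the wizard.

```python
import socket
from typing import List


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def ssh_command(user: str, host: str, key_path: str) -> List[str]:
    """Build an ssh invocation that logs in with the given private key."""
    return ["ssh", "-i", key_path, f"{user}@{host}"]


def scp_upload_command(user: str, host: str, key_path: str,
                       local_path: str, remote_path: str) -> List[str]:
    """Build an scp invocation that copies a local file to the VM."""
    return ["scp", "-i", key_path, local_path, f"{user}@{host}:{remote_path}"]


if __name__ == "__main__":
    # Placeholder values; this block only prints commands and opens no connections.
    host, user, key = "203.0.113.10", "azureuser", "azure_deepgram_rsa"
    print(" ".join(ssh_command(user, host, key)))
    print(" ".join(scp_upload_command(user, host, key, "config.toml", "config.toml")))
```

Calling `port_is_open(host, 22)` with your VM's address before attempting to log in helps separate network-rule problems from key or username problems.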
What’s Next
Now that we have provisioned a deployment environment, we need to start configuring it for inference.