Deploying Deepgram on Google Cloud Platform (GCP) requires some preparation. In this section, you will learn how to expose and access your application in a secure and stable manner. You will need to perform some of these steps in the GCP Cloud Console and some in your local terminal.
Make sure you have completed the requirements in the On-prem Introduction.
GPU availability has been extremely limited across cloud providers, including GCP. You may need to contact Google directly for access if you are not able to obtain a spot GPU instance.
To begin your on-prem installation with GCP as your cloud service provider, you need to create a Compute Engine VM instance.
Create a new GCP project with an attached billing account. See Google's documentation for more details.
In the Google Cloud Console, open the Navigation Menu and select Compute Engine -> VM Instances.
- If you have not used the Compute Engine API on this project, you may need to click Enable to enable Compute Engine on the project.
Click Create Instance to pull up the VM creation wizard.
For the Name wizard step, type deepgram-on-premises, or another appropriate description of this deployment.
For the Region wizard step, choose your desired region and zone.
For the Machine configuration wizard step, filter by "GPU". For each of the following sub-steps, follow the requirements listed in our minimum required hardware specifications.
- For "GPU type", you may select any of the available NVIDIA GPUs.
- For "Number of GPUs", customize as desired. Deepgram services can be configured to make use of up to 8 GPUs with a single Engine container.
- Do not enable Virtual Workstation (NVIDIA GRID).
- For "Machine type", customize the number of cores and memory available to meet Deepgram's hardware specifications, linked above.
- Customize the "Advanced Configurations" if desired.
- For "Availability policies", choose a policy that makes sense for your use case.
- For "Display device", it is not necessary to enable a display device for Deepgram services to run.
For the Confidential VM service wizard step, make a choice that makes sense for your use case.
For the Container wizard step, do not select "Deploy container", because this guide deploys multiple containers to a single VM as a proof of concept (POC). In a production environment, you may want to deploy a single container to the VM, in which case you may choose to use this option.
For the Boot Disk wizard step, select "Change".
- Select an "Operating system" and "Version" from our list of recommended Linux distributions. Make sure to choose the x86 version, as described in Platforms.
- Select a "Boot disk type" that makes sense for your use case.
- For "Size (GB)", select a value that meets Deepgram's minimum required hardware specifications.
For the Identity and API access wizard step, select a Service Account with the proper permissions and scopes according to your use case. If you aren't sure which to use, you may use the default service account for a POC.
For the Firewall wizard step, if you want to be able to receive requests from the public internet, make sure to check the box Allow HTTPS traffic. If you are colocating your Deepgram on-prem deployment with other services and do not need to expose your server to the public internet, you may not need to check any of the available options.
For the Observability wizard step, you may install the Ops Agent if you want your own monitoring and logging. The Ops Agent provides a native way to ingest Deepgram container logs, as well as OpenTelemetry metrics from Deepgram Engine.
Click Create to launch your VM.
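If you prefer to script the wizard steps above, the same choices map roughly onto a single gcloud compute instances create command. The sketch below only assembles and prints that command; it does not run it. The zone, machine type, GPU type, image family, image project, and disk size are placeholder assumptions, not Deepgram recommendations, so substitute values that satisfy the hardware specifications and supported Linux distributions referenced above. The helper name build_create_command is hypothetical.

```python
import shlex


def build_create_command(name, zone, machine_type, gpu_type, gpu_count,
                         image_family, image_project, boot_disk_gb,
                         allow_https=False):
    """Assemble a `gcloud compute instances create` argument list.

    All values are supplied by the caller; nothing here is checked against
    Deepgram's hardware requirements.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")

    args = [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        f"--machine-type={machine_type}",
        f"--accelerator=type={gpu_type},count={gpu_count}",
        # GPU instances cannot live-migrate, so host maintenance must terminate them.
        "--maintenance-policy=TERMINATE",
        f"--image-family={image_family}",
        f"--image-project={image_project}",
        f"--boot-disk-size={boot_disk_gb}GB",
    ]
    if allow_https:
        # The console's "Allow HTTPS traffic" box applies this network tag.
        # A matching firewall rule must also exist in the project.
        args.append("--tags=https-server")
    return args


# Placeholder values for illustration only (assumptions, not recommendations).
command = build_create_command(
    name="deepgram-on-premises",
    zone="us-central1-a",
    machine_type="n1-standard-8",
    gpu_type="nvidia-tesla-t4",
    gpu_count=1,
    image_family="ubuntu-2204-lts",
    image_project="ubuntu-os-cloud",
    boot_disk_gb=100,
    allow_https=True,
)
print(shlex.join(command))
```

Printing the command rather than executing it lets you review every flag before anything is created or billed.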
To complete the rest of the installation, including configuring your environment and transferring files between your local computer and your VM instance, you must connect to the instance you just launched.
On the VM instances page in GCP, click on the name of your recently deployed VM to open its details page.
In the top left, click on the SSH button. You may need to authorize the connection the first time you do this.
- This will open an SSH window in-browser. If you wish to connect via your own SSH client, see the GCP documentation on adding your own SSH keys to the VM.
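As an alternative to the in-browser window, the gcloud CLI can open an SSH session and copy files to the VM from your local terminal. The sketch below builds the gcloud compute ssh and gcloud compute scp commands as argument lists and prints them. The instance name, zone, and file paths are placeholder assumptions, and the helper names ssh_command and scp_command are hypothetical.

```python
import shlex


def ssh_command(instance, zone):
    """Argument list for opening an SSH session to a Compute Engine VM."""
    return ["gcloud", "compute", "ssh", instance, f"--zone={zone}"]


def scp_command(local_path, instance, remote_path, zone):
    """Argument list for copying a local file to a Compute Engine VM."""
    return [
        "gcloud", "compute", "scp",
        local_path,
        f"{instance}:{remote_path}",
        f"--zone={zone}",
    ]


# Placeholder values for illustration only.
ssh_args = ssh_command("deepgram-on-premises", "us-central1-a")
scp_args = scp_command("./config.toml", "deepgram-on-premises",
                       "~/config.toml", "us-central1-a")

print(shlex.join(ssh_args))
print(shlex.join(scp_args))
```

To run either command from Python instead of printing it, you could pass the list to subprocess.run, assuming the gcloud CLI is installed and authenticated on your machine.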
Now that we have provisioned a deployment environment, we need to start configuring it for inference.