Amazon Connect and Deepgram

Amazon Connect is a popular platform for hosting cloud contact centers. Our integration enables you to transcribe your Connect calls in real-time with Deepgram.

In this guide, we'll explain how to spin up the integration in your AWS environment and build a contact flow with real-time transcription

Before you Begin

Create a Deepgram Account

Create a Deepgram Account

Before you can use Deepgram products, you'll need to create a Deepgram account. Signup is free and includes:

Create a Deepgram API Key

To access Deepgram’s API, you'll need to create a Deepgram API Key. Make note of your API Key; you will need it later.

Set Up an Amazon Connect Instance

You'll also need an Amazon Connect instance that is configured to receive incoming calls. This guide walks you through the process.

Architecture Overview

Architecture Overview

This diagram shows how the integration works at a high level.

  1. A customer calls into your Amazon Connect call center.
  2. The customer enters a contact flow.
  3. Within the contact flow, you use a "Start Media Streaming" block to begin sharing call audio with Kinesis Video Streams (KVS).
  4. You set contact attributes to configure Deepgram. These give you access to the full Deepgram streaming API, including features like smart formatting and interim results. This is also where you set the callback URL where you want to receive the transcripts.
  5. The contact flow then invokes a Lambda function, which we call the "trigger Lambda" or kvs_dg_trigger.
  6. The Lambda function makes a POST request to a Fargate task, telling it to kick off a session.
    The Fargate task is called kvs_dg_integrator. It is really a cluster of Fargate tasks running on ECS behind a load balancer.
  7. For the duration of the call, the Fargate task pulls the call audio from KVS and passes it along to Deepgram.
  8. Deepgram transcribes the audio and POSTs the transcripts to your callback URL in real-time.

The core of this integration is a CloudFormation template, which spins up the Lambda function and Fargate cluster in your environment.

Deploy the Integration


  1. Install the Docker CLI. If you're on Windows or Mac and run into issues with licensing requirements, Podman can be used as a nearly drop-in replacement.
  2. Install the AWS CLI. Make sure it's configured to use the same region as your Connect instance.
  3. Clone the integration repo from GitHub.

Prepare the Docker Images

  1. Create these two ECR repos in your AWS environment:

    aws ecr create-repository --repository-name kvs-dg-trigger
    aws ecr create-repository --repository-name kvs-dg-integrator
  2. Log in to ECR with the Docker CLI. This will enable you to push images to the repos.

    aws ecr get-login-password --region <YOUR-REGION> | docker login --username AWS --password-stdin <YOUR-ACCOUNT-NUMBER>.dkr.ecr.<YOUR-REGION> 
  3. In the kvs_dg_trigger folder in the integration repo, build the Docker image for the trigger Lambda:

    docker build --platform linux/amd64 -t <YOUR-ACCOUNT-NUMBER>.dkr.ecr.<YOUR-REGION> .
  4. In the kvs_dg_integrator folder, build the Docker image for the integrator task:

    docker build --platform linux/amd64 -t <YOUR-ACCOUNT-NUMBER>.dkr.ecr.<YOUR-REGION> .
  5. Push the new Docker images to ECR:

    docker push <YOUR-ACCOUNT-NUMBER>.dkr.ecr.<YOUR-REGION>
    docker push <YOUR-ACCOUNT-NUMBER>.dkr.ecr.<YOUR-REGION>

Spin Up the CloudFormation Stack

  1. Go to CloudFormation in the AWS Console. Make sure you're in the same region as your Connect instance, then click Create stack > With new resources.
  2. Under Specify template, select Upload a template file.
  3. Click Choose file and pick cloudformation.yaml from the Git repo you cloned.
  4. Give the stack a name such as deepgram-connect-integration.
  5. If you're running Deepgram on-prem, change the Deepgram API field to the URL where your on-prem instance is deployed. Otherwise leave it as the default.
  6. Paste your API key under Deepgram API Key.
  7. Select the VPC ID and Subnets where you want to deploy the integration.
    1. The subnets need outbound internet access in order to pull the task image from ECR. This means you'll need to use either public subnets, or private subnets with access to a NAT gateway. Users who want a more isolated setup can edit the CloudFormation template to use PrivateLink to eliminate the need for internet access, but as of today the integration doesn't support this out-of-the-box.
  8. Under Trigger Lambda > Image URI, paste in the image you just pushed to ECR: <YOUR-ACCOUNT-NUMBER>.dkr.ecr.<YOUR-REGION>
  9. Under Integrator ECS Service > Task Image URI, paste in the other image you pushed: <YOUR-ACCOUNT-NUMBER>.dkr.ecr.<YOUR-REGION>
  10. Adjust Desired Task Count, Task CPU, and Task Memory.
    1. For testing out the integration, or if you expect minimal load, Desired Task Count=1, Task CPU=256, and Task Memory=1024 will work fine. Using Fargate prices at the time of writing, a task at these settings will cost ~$0.015 per hour, or ~$10.84 per month.
    2. For any significant load, you will likely want at least a full vCPU, something like Task CPU=1024, and Task Memory=4096, which comes out to ~$0.058 per hour, or ~$43.35 per month. We've found these settings will support up to 200 concurrent calls.
    3. Whatever values you choose, a good rule of thumb is to keep memory at 4 times CPU in order to utilize both resources to the fullest while avoiding OOMs.


This integration favors a simple and easy-to-deploy setup, but it is not aggressively optimized for cost. There are a couple of measures you can take to reduce Fargate cost at the expense of some higher complexity:

  • Run the integrator on EC2 instead of Fargate. This will be cheaper but require increased effort to manage the EC2 instances.
  • Attach an auto-scaling policy to the ECS service, so that you're not paying to support peak load during off-hours. This repository is a good starting point for doing auto-scaling with Fargate in CloudFormation.
  1. Click through the Next buttons and then Submit the stack for creation
  2. Once everything completes successfully, the integration is deployed and available for use by your Connect instance.

Run the Sample Contact Flow

The GitHub repo also includes an Amazon Connect contact flow that demonstrates how to use the deployed integration. To run the contact flow, follow these steps:

  1. Add the newly deployed trigger Lambda to your Amazon Connect instance (guide).
  2. Enable live media streaming in your Amazon Connect instance (guide). Choosing No data retention is fine, unless you plan to load test the integration with the built-in load testing functionality, in which case you should choose a retention period at least as long as your load test duration.
  3. Create a new inbound contact flow and import sample_contact_flow.json (guide).
  4. In the Deepgram Configuration block, go to Edit Settings and update dg_callback to a URL where you want to receive the transcripts. The transcripts will be sent as POST requests to the URL you provide. For testing, you can use a site like Beeceptor to create a URL that will display the contents of the POST requests.
  5. In the Invoke Trigger Lambda block, go to Edit Settings and select the trigger Lambda function from the Function ARN dropdown.
  6. Save and publish the contact flow.
  7. Assign the contact flow to a phone number (guide).
  8. Call into the phone number. After it plays the initial message, say something and watch your callback server to make sure your words are being transcribed.

Configure Deepgram

When transcribing Connect calls, you can use any of the features of Deepgram's streaming API. You select features in the streaming API by using a Set contact attributes block in the contact flow. These attributes must be set before invoking the trigger Lambda, since they will be passed to the Lambda, which will pass them to the integrator, which will ultimately pass them to Deepgram.

The sample contact flow described above includes some basic Deepgram configuration. Let's now look at an example similar to what you will find in the sample flow.

 Deepgram Contact Attributes

Setting Deepgram Contact Attributes

This image shows 3 Deepgram features being set:

  • dg_model is set to nova. This adds model=nova to the query parameters of the Deepgram request.
  • dg_tag is set to someTag someOtherTag. This adds tag=someTag&tag=someOtherTag to the query parameters of the Deepgram request.
  • dg_callback is set to{contact-id}. As you would expect, this adds a callback to the query parameters of the Deepgram request--but it also injects the contact ID of the current call. In other words, if a Connect call has contact ID 002f61e1-423e-415d-b086-697186514860, then the transcripts for that call will be POSTed to Injecting contact IDs like this is only possible within the dg_callback contact attribute. This enables you to associate transcripts with calls.


There are a few features in the Deepgram streaming API that you should not attempt to set in the contact flow. These are:

  • encoding
  • sample_rate
  • multichannel
  • channels

The integrator already knows the encoding and sample_rate, and we lock multichannel=true and channels=2 so that you can receive transcripts for both sides of the call. If you try to manually set any of these values, the session will fail.