Amazon Connect and Deepgram
Amazon Connect is a popular platform for hosting cloud contact centers. Our integration enables you to transcribe your Connect calls in real time with Deepgram.
In this guide, we'll explain how to spin up the integration in your AWS environment and build a contact flow with real-time transcription.
Before You Begin
Before you can use Deepgram, you'll need to create a Deepgram account. Signup is free and includes $200 in free credit and access to all of Deepgram's features!
Then follow the steps in the Make Your First API Request guide to obtain a Deepgram API key and, if you plan to use a Deepgram SDK, to configure your environment.
Set Up an Amazon Connect Instance
You'll also need an Amazon Connect instance that is configured to receive incoming calls. This guide walks you through the process.
Architecture Overview
This diagram shows how the integration works at a high level.
- A customer calls into your Amazon Connect call center.
- The customer enters a contact flow.
- Within the contact flow, you use a "Start Media Streaming" block to begin sharing call audio with Kinesis Video Streams (KVS).
- You set contact attributes to configure Deepgram. These give you access to the full Deepgram streaming API, including features like smart formatting and interim results. This is also where you set the callback URL where you want to receive the transcripts.
- The contact flow then invokes a Lambda function, which we call the "trigger Lambda" or kvs_dg_trigger.
- The Lambda function makes a POST request to a Fargate task, telling it to kick off a session. The Fargate task is called kvs_dg_integrator. It is really a cluster of Fargate tasks running on ECS behind a load balancer.
- For the duration of the call, the Fargate task pulls the call audio from KVS and passes it along to Deepgram.
- Deepgram transcribes the audio and POSTs the transcripts to your callback URL in real time.
The core of this integration is a CloudFormation template, which spins up the Lambda function and Fargate cluster in your environment.
Deploy the Integration
Prerequisites
- Install the Docker CLI. If you're on Windows or Mac and run into issues with licensing requirements, Podman can be used as a nearly drop-in replacement.
- Install the AWS CLI. Make sure it's configured to use the same region as your Connect instance.
- Clone the integration repo from GitHub.
Prepare the Docker Images
- Create these two ECR repos in your AWS environment:
aws ecr create-repository --repository-name kvs-dg-trigger
aws ecr create-repository --repository-name kvs-dg-integrator
- Log in to ECR with the Docker CLI. This will enable you to push images to the repos.
aws ecr get-login-password --region <YOUR-REGION> | docker login --username AWS --password-stdin <YOUR-ACCOUNT-NUMBER>.dkr.ecr.<YOUR-REGION>.amazonaws.com
- In the kvs_dg_trigger folder in the integration repo, build the Docker image for the trigger Lambda:
docker build --platform linux/amd64 -t <YOUR-ACCOUNT-NUMBER>.dkr.ecr.<YOUR-REGION>.amazonaws.com/kvs-dg-trigger:latest .
- In the kvs_dg_integrator folder, build the Docker image for the integrator task:
docker build --platform linux/amd64 -t <YOUR-ACCOUNT-NUMBER>.dkr.ecr.<YOUR-REGION>.amazonaws.com/kvs-dg-integrator:latest .
- Push the new Docker images to ECR:
docker push <YOUR-ACCOUNT-NUMBER>.dkr.ecr.<YOUR-REGION>.amazonaws.com/kvs-dg-trigger:latest
docker push <YOUR-ACCOUNT-NUMBER>.dkr.ecr.<YOUR-REGION>.amazonaws.com/kvs-dg-integrator:latest
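The image URIs in the build and push commands all follow one pattern, and the same two strings are needed again as stack parameters. As a small convenience sketch, they can be assembled once and pasted where needed. The function name ecr_image_uri and the account number and region below are placeholders for illustration, not part of the integration.

```python
def ecr_image_uri(account: str, region: str, repo: str, tag: str = "latest") -> str:
    """Build an ECR image URI in the form used by the docker build/push commands."""
    return f"{account}.dkr.ecr.{region}.amazonaws.com/{repo}:{tag}"


# Placeholder values; substitute your own account number and region.
ACCOUNT = "123456789012"
REGION = "us-east-1"

TRIGGER_IMAGE = ecr_image_uri(ACCOUNT, REGION, "kvs-dg-trigger")
INTEGRATOR_IMAGE = ecr_image_uri(ACCOUNT, REGION, "kvs-dg-integrator")

print(TRIGGER_IMAGE)
print(INTEGRATOR_IMAGE)
```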
Spin Up the CloudFormation Stack
- Go to CloudFormation in the AWS Console. Make sure you're in the same region as your Connect instance, then click Create stack > With new resources.
- Under Specify template, select Upload a template file.
- Click Choose file and pick cloudformation.yaml from the Git repo you cloned.
- Give the stack a name such as deepgram-connect-integration.
- If you're self-hosting Deepgram, change the Deepgram API field to the URL where your self-hosted instance is deployed. Otherwise leave it as the default.
- Paste your API key under Deepgram API Key.
- Select the VPC ID and Subnets where you want to deploy the integration.
- The subnets need outbound internet access in order to pull the task image from ECR. This means you'll need to use either public subnets, or private subnets with access to a NAT gateway. Users who want a more isolated setup can edit the CloudFormation template to use PrivateLink to eliminate the need for internet access, but as of today the integration doesn't support this out-of-the-box.
- Under Trigger Lambda > Image URI, paste in the image you just pushed to ECR: <YOUR-ACCOUNT-NUMBER>.dkr.ecr.<YOUR-REGION>.amazonaws.com/kvs-dg-trigger:latest.
- Under Integrator ECS Service > Task Image URI, paste in the other image you pushed: <YOUR-ACCOUNT-NUMBER>.dkr.ecr.<YOUR-REGION>.amazonaws.com/kvs-dg-integrator:latest.
- Adjust Desired Task Count, Task CPU, and Task Memory.
- For testing out the integration, or if you expect minimal load, Desired Task Count=1, Task CPU=256, and Task Memory=1024 will work fine. Using Fargate prices at the time of writing, a task at these settings will cost ~$0.015 per hour, or ~$10.84 per month.
- For any significant load, you will likely want at least a full vCPU, something like Task CPU=1024 and Task Memory=4096, which comes out to ~$0.058 per hour, or ~$43.35 per month. We've found these settings will support up to 200 concurrent calls.
- Whatever values you choose, a good rule of thumb is to keep Task Memory at 4 times Task CPU (as in 256 and 1024, or 1024 and 4096) in order to utilize both resources to the fullest while avoiding OOMs.
This integration favors a simple, easy-to-deploy setup and is not aggressively optimized for cost. There are a couple of measures you can take to reduce Fargate cost in exchange for added complexity:
- Run the integrator on EC2 instead of Fargate. This will be cheaper but require increased effort to manage the EC2 instances.
- Attach an auto-scaling policy to the ECS service, so that you're not paying to support peak load during off-hours. This repository is a good starting point for doing auto-scaling with Fargate in CloudFormation.
- Click through the Next buttons and then Submit the stack for creation.
- Once everything completes successfully, the integration is deployed and available for use by your Connect instance.
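The cost figures quoted in the task sizing step can be reproduced with a short calculation, which is also handy for estimating other task sizes or task counts. This is a sketch under assumptions: the per-vCPU and per-GB rates below are assumed on-demand Fargate rates (Linux/x86, us-east-1) and a 31-day month is assumed; check current AWS pricing for your region before relying on the numbers.

```python
# Assumed on-demand Fargate rates (Linux/x86, us-east-1); verify against current pricing.
VCPU_PER_HOUR = 0.04048    # USD per vCPU-hour (assumption)
GB_PER_HOUR = 0.004445     # USD per GB-hour (assumption)
HOURS_PER_MONTH = 24 * 31  # a 31-day month (assumption)


def fargate_cost(cpu_units: int, memory_mib: int, task_count: int = 1) -> tuple[float, float]:
    """Return (hourly, monthly) cost in USD for the given task size and count."""
    vcpus = cpu_units / 1024       # 1024 CPU units = 1 vCPU
    memory_gb = memory_mib / 1024  # Task Memory is given in MiB
    hourly = task_count * (vcpus * VCPU_PER_HOUR + memory_gb * GB_PER_HOUR)
    return hourly, hourly * HOURS_PER_MONTH


for cpu, mem in [(256, 1024), (1024, 4096)]:
    hourly, monthly = fargate_cost(cpu, mem)
    print(f"Task CPU={cpu}, Task Memory={mem}: ~${hourly:.3f}/hour, ~${monthly:.2f}/month")
```

Under these assumed rates the two settings come out to roughly $0.015 and $0.058 per hour, matching the estimates above. Pass task_count to see how the bill scales with Desired Task Count.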
Run the Sample Contact Flow
The GitHub repo also includes an Amazon Connect contact flow that demonstrates how to use the deployed integration. To run the contact flow, follow these steps:
- Add the newly deployed trigger Lambda to your Amazon Connect instance (guide).
- Enable live media streaming in your Amazon Connect instance (guide). Choosing No data retention is fine, unless you plan to load test the integration with the built-in load testing functionality, in which case you should choose a retention period at least as long as your load test duration.
- Create a new inbound contact flow and import sample_contact_flow.json (guide).
- In the Deepgram Configuration block, go to Edit Settings and update dg_callback to a URL where you want to receive the transcripts. The transcripts will be sent as POST requests to the URL you provide. For testing, you can use a site like Beeceptor to create a URL that will display the contents of the POST requests.
- In the Invoke Trigger Lambda block, go to Edit Settings and select the trigger Lambda function from the Function ARN dropdown.
- Save and publish the contact flow.
- Assign the contact flow to a phone number (guide).
- Call into the phone number. After it plays the initial message, say something and watch your callback server to make sure your words are being transcribed.
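If you would rather run your own callback server than use a hosted request inspector, a minimal receiver might look like the sketch below. It uses only the Python standard library. The payload shape it reads (a channel object containing a list of alternatives, each with a transcript string) is an assumption based on Deepgram's streaming results; the names extract_transcripts, CallbackHandler, and serve are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def extract_transcripts(payload) -> list[str]:
    """Pull non-empty transcript strings out of one result payload (assumed shape)."""
    if not isinstance(payload, dict):
        return []
    channel = payload.get("channel")
    if not isinstance(channel, dict):
        return []
    alternatives = channel.get("alternatives", [])
    return [alt["transcript"] for alt in alternatives
            if isinstance(alt, dict) and alt.get("transcript")]


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            payload = json.loads(body)
        except json.JSONDecodeError:
            payload = None  # ignore bodies that are not JSON
        # The request path identifies the call if dg_callback ends in {contact-id}.
        for text in extract_transcripts(payload):
            print(f"{self.path}: {text}")
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("", port), CallbackHandler).serve_forever()
```

Call serve() to start listening. The server has to be reachable at the URL you put in dg_callback from wherever Deepgram is running, so a machine behind a firewall will need a public endpoint or tunnel in front of it.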
Configure Deepgram
When transcribing Connect calls, you can use any of the features of Deepgram's streaming API. You select features in the streaming API by using a Set contact attributes block in the contact flow. These attributes must be set before invoking the trigger Lambda, since they will be passed to the Lambda, which will pass them to the integrator, which will ultimately pass them to Deepgram.
The sample contact flow described above includes some basic Deepgram configuration. Let's now look at an example similar to what you will find in the sample flow.
This image shows 3 Deepgram features being set:
- dg_model is set to nova. This adds model=nova to the query parameters of the Deepgram request.
- dg_tag is set to someTag someOtherTag. This adds tag=someTag&tag=someOtherTag to the query parameters of the Deepgram request.
- dg_callback is set to https://example.com/{contact-id}. As you would expect, this adds a callback to the query parameters of the Deepgram request, but it also injects the contact ID of the current call. In other words, if a Connect call has contact ID 002f61e1-423e-415d-b086-697186514860, then the transcripts for that call will be POSTed to https://example.com/002f61e1-423e-415d-b086-697186514860. Injecting contact IDs like this is only possible within the dg_callback contact attribute. This enables you to associate transcripts with calls.
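The mapping from contact attributes to query parameters described above can be sketched in a few lines. This is an illustration of the behavior, not the integrator's actual code: the function name build_query is hypothetical, and it assumes that attributes without the dg_ prefix are ignored and that any space-separated value becomes a repeated parameter, as shown for dg_tag.

```python
from urllib.parse import urlencode

PREFIX = "dg_"


def build_query(attributes: dict[str, str], contact_id: str) -> str:
    """Turn dg_-prefixed contact attributes into a query string (illustrative)."""
    params = []
    for name, value in attributes.items():
        if not name.startswith(PREFIX):
            continue  # assumption: attributes without the prefix are ignored
        key = name[len(PREFIX):]
        if key == "callback":
            # Contact-ID injection applies only to the callback attribute.
            params.append((key, value.replace("{contact-id}", contact_id)))
        else:
            # Assumption: space-separated values become repeated parameters.
            params.extend((key, part) for part in value.split())
    return urlencode(params)


query = build_query(
    {
        "dg_model": "nova",
        "dg_tag": "someTag someOtherTag",
        "dg_callback": "https://example.com/{contact-id}",
    },
    contact_id="002f61e1-423e-415d-b086-697186514860",
)
print(query)
```

Note that urlencode percent-encodes the callback URL, so it appears in the query string in escaped form.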
There are a few features in the Deepgram streaming API that you should not attempt to set in the contact flow. These are:
encoding
sample_rate
multichannel
channels
The integrator already knows the encoding and sample_rate, and we lock multichannel=true and channels=2 so that you can receive transcripts for both sides of the call. If you try to manually set any of these values, the session will fail.
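If you keep your Deepgram contact attributes in a script or config file before entering them in the contact flow, a quick pre-flight check can catch the four locked parameters before a call fails. This is a hypothetical helper, not part of the integration, and it assumes the locked parameters would be set with the same dg_ prefix as the other attributes.

```python
LOCKED = {"encoding", "sample_rate", "multichannel", "channels"}
PREFIX = "dg_"


def locked_attributes(attributes: dict[str, str]) -> list[str]:
    """Return the contact attribute names that try to set a locked parameter."""
    return sorted(
        name for name in attributes
        if name.startswith(PREFIX) and name[len(PREFIX):] in LOCKED
    )


offending = locked_attributes(
    {"dg_model": "nova", "dg_channels": "1", "dg_encoding": "mulaw"}
)
if offending:
    print("Remove these attributes from the contact flow:", ", ".join(offending))
```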