On-premise Twilio Integration
Twilio is a developer platform that provides APIs for building digital experiences with capabilities such as SMS, WhatsApp, Video, and Twilio Programmable Voice. Programmable Voice lets you make, manage, and route calls to a browser, an app, your phone, or anywhere else you can take a call. By integrating with Deepgram, you can get real-time automatic speech recognition for all your Twilio Programmable Voice calls.
Demo: Twilio + Deepgram
To get an idea of how Twilio and Deepgram can work together, ask your Account Executive to show you our demo.
Solutions
To help you integrate Twilio with Deepgram, we provide the following solutions:
- A Python starter server that offers an accessible starting point on which you can build. In this guide, a conversation streamed to your Twilio number is forked to our Python script, which sends the audio to Deepgram, then receives transcriptions and prints them to the screen. In a real implementation, you will likely want to provide a callback to which transcriptions can be sent.
- A Docker image (deepgram/twilio-proxy:beta) that fully integrates with our On-premise installations using the same robust Rust architecture that our other services use. For access to the Docker image, ask your Account Executive.
Python Starter Server
For our Python solution, we provide two scripts, each of which can act as a proxy server to facilitate the exchange of data between Twilio and Deepgram.
- twilio-proxy-mono.py: Runs the proxy server for the inbound Twilio track, which represents the audio Twilio receives from the call.
- twilio-proxy-stereo.py: Runs the proxy server for both the inbound and outbound Twilio tracks, which represent the audio Twilio receives from the call and the audio Twilio generates to the call.
To learn more about Twilio tracks, see Twilio’s track documentation.
You can access these scripts in Deepgram’s Public Code Examples GitHub repo.
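To make the idea of tracks concrete, here is a minimal sketch of how a proxy might pull audio out of the messages Twilio streams over the WebSocket. It assumes Twilio's Media Streams message shape (a JSON object with an event field and, for media events, a base64-encoded payload and a track name). The function name extract_audio is hypothetical and is not taken from the provided scripts.

```python
import base64
import json


def extract_audio(message: str):
    """Return (track, raw_audio_bytes) for a Twilio 'media' message, else None."""
    data = json.loads(message)
    if data.get("event") != "media":
        # Other events (for example 'connected', 'start', 'stop') carry no audio.
        return None
    media = data["media"]
    return media["track"], base64.b64decode(media["payload"])


# Hand-built sample message for illustration (not captured from a real call).
sample = json.dumps({
    "event": "media",
    "media": {
        "track": "inbound",
        "payload": base64.b64encode(b"\xff\x7f").decode("ascii"),
    },
})
print(extract_audio(sample))
```

A mono proxy would only ever see the inbound track here, while a stereo proxy would receive messages for both inbound and outbound and need to keep them apart.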
Before You Begin
Create a Deepgram Account
Before you can use Deepgram products, you'll need to create a Deepgram account. Signup is free and includes:
$200 in credit, which gives you access to:
- all base models
- pre-recorded and streaming functionality
- all features
Create a Deepgram API Key
To access Deepgram’s API, you'll need to create a Deepgram API Key. Make note of your API Key; you will need it later.
Create a Twilio Account
Before you can use Twilio products, you'll need to create a Twilio account. In addition, if you don't currently own a Twilio phone number with Voice functionality, you'll need to purchase one.
Configure Environment
We provide sample scripts written in Python and assume you have already configured a Python (3.6 or greater) development environment.
Install Dependencies
Real-time streaming uses WebSockets, a communications protocol that enables full-duplex communication, which means that you can stream new audio to Deepgram at the same time the latest transcription results are streaming back to you. Using WebSockets is further eased by the wide variety of third-party client libraries that have been written to support a range of languages and production environments.
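The full-duplex pattern can be sketched with asyncio by running a sender and a receiver concurrently over one connection. This is only an illustration: LoopbackSocket is a hypothetical in-memory stand-in for a real WebSocket connection, the reply strings are made up, and asyncio.run assumes Python 3.7 or later.

```python
import asyncio


class LoopbackSocket:
    """Hypothetical in-memory stand-in for a WebSocket connection."""

    def __init__(self):
        self._queue = asyncio.Queue()

    async def send(self, chunk: bytes):
        # A real server would reply with a transcript; here we report the size.
        await self._queue.put(f"received {len(chunk)} bytes")

    async def recv(self) -> str:
        return await self._queue.get()


async def sender(sock, chunks):
    for chunk in chunks:
        await sock.send(chunk)
        await asyncio.sleep(0)  # yield so the receiver can run between sends


async def receiver(sock, expected: int):
    results = []
    for _ in range(expected):
        results.append(await sock.recv())
    return results


async def stream(chunks):
    sock = LoopbackSocket()
    # Sending and receiving run concurrently over the same connection.
    _, results = await asyncio.gather(
        sender(sock, chunks), receiver(sock, len(chunks))
    )
    return results


print(asyncio.run(stream([b"\x00" * 160, b"\x00" * 320])))
```

With a real connection, the sender would forward call audio as it arrives and the receiver would handle transcription results without waiting for the audio to finish.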
For Python, we use websockets.
Additional dependencies for Python include pydub.
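Before running the scripts, it can help to confirm that both packages are importable. The following is a small sketch using only the standard library. The helper name missing_modules is hypothetical, and it assumes the import names match the package names listed above.

```python
import importlib.util

REQUIRED = ["websockets", "pydub"]


def missing_modules(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All dependencies are installed.")
```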
Configure the Scripts
Before running the scripts, you must replace the placeholder authentication value with your Deepgram API key.
Mono
On line 12 of twilio-proxy-mono.py, replace YOUR_DEEPGRAM_API_KEY with the Deepgram API key you created earlier in this tutorial:
12 'Authorization': 'Token YOUR_DEEPGRAM_API_KEY'
Stereo
On line 12 of twilio-proxy-stereo.py, replace YOUR_DEEPGRAM_API_KEY with the Deepgram API key you created earlier in this tutorial:
12 'Authorization': 'Token YOUR_DEEPGRAM_API_KEY'
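If you would rather not store the key in the script file, one option is to read it from an environment variable and build the same header at startup. This is a sketch, not how the provided scripts work: the DEEPGRAM_API_KEY variable name and the deepgram_headers helper are assumptions made for illustration.

```python
import os


def deepgram_headers(env=os.environ):
    """Build the Authorization header from the DEEPGRAM_API_KEY variable."""
    api_key = env.get("DEEPGRAM_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("Set DEEPGRAM_API_KEY before starting the proxy.")
    return {"Authorization": f"Token {api_key}"}


# Passing a dict here keeps the example runnable without a real key.
print(deepgram_headers({"DEEPGRAM_API_KEY": "example-key"}))
```

Failing fast when the variable is missing avoids sending requests with the literal placeholder value.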
Run the Scripts
To run the scripts:
- Clone the GitHub repo to your local machine.
- From the command line, navigate to the cloned repository and access the twilio directory by running: cd twilio
- Run the script using one of the following commands:
Mono
python3 twilio-proxy-mono.py
Stereo
python3 twilio-proxy-stereo.py
Forward Data to Proxy
Finally, you will need to forward data to the proxy scripts. You can do this by configuring Twilio to send WebSockets data to the server running the proxy scripts or by initiating a call between two people and directly forwarding the data to the proxy scripts.
Configure Twilio to Use WebSockets
Configure Twilio to send WebSockets data to the server running the scripts. To do this, see the Start Streaming Audio section of Twilio’s tutorial: Consume a real-time Media Stream using WebSockets, Python, and Flask. In this tutorial, you will use TwiML Bins, a serverless solution that helps you provide Twilio-hosted instructions to your Twilio applications, to begin streaming your call's audio.
When you call your Twilio number, the call is forwarded to the number you set in your TwiML Bin. The conversation is then forked to the twilio-proxy-mono or twilio-proxy-stereo app, which sends the audio to Deepgram, receives transcriptions, and prints them to the screen. In a real implementation, you will likely want to provide a callback to which transcriptions can be sent.
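As a rough sketch of the callback idea, the snippet below extracts the top transcript from a result message and hands it to a callback instead of always printing. It assumes the streaming result is a JSON object with the transcript under channel.alternatives[0].transcript; the function names are hypothetical and the sample message is hand-built.

```python
import json


def transcript_from(message: str) -> str:
    """Return the top transcript from a streaming result, or '' if absent."""
    data = json.loads(message)
    alternatives = data.get("channel", {}).get("alternatives", [])
    return alternatives[0].get("transcript", "") if alternatives else ""


def handle_result(message: str, callback=print):
    """Pass non-empty transcripts to the callback (print by default)."""
    text = transcript_from(message)
    if text:
        callback(text)


# Hand-built sample result for illustration.
sample = json.dumps({"channel": {"alternatives": [{"transcript": "hello world"}]}})
handle_result(sample)
```

Swapping print for a function that posts to your own service is one way to move from the demo behavior to a real implementation.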
Sample TwiML Bin files are as follows:
Mono
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Start>
    <Stream url="wss://my-server-address" />
  </Start>
  <Dial>my-phone-number</Dial>
</Response>
Stereo
For stereo, the Stream element takes an additional track parameter.
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Start>
    <Stream url="wss://my-server-address" track="both_tracks" />
  </Start>
  <Dial>my-phone-number</Dial>
</Response>
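If you generate TwiML programmatically rather than pasting it into a TwiML Bin, the two variants above differ only in the track attribute. The following sketch builds either one with the standard library; build_twiml is a hypothetical helper, and the output omits the XML declaration.

```python
import xml.etree.ElementTree as ET


def build_twiml(stream_url: str, dial_number: str, stereo: bool = False) -> str:
    """Return a TwiML string that forks call audio to stream_url and dials a number."""
    response = ET.Element("Response")
    start = ET.SubElement(response, "Start")
    stream = ET.SubElement(start, "Stream", url=stream_url)
    if stereo:
        # Ask Twilio to stream both the inbound and outbound tracks.
        stream.set("track", "both_tracks")
    ET.SubElement(response, "Dial").text = dial_number
    return ET.tostring(response, encoding="unicode")


print(build_twiml("wss://my-server-address", "my-phone-number", stereo=True))
```

Building the document with an XML library also escapes special characters in the URL or number for you.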
Send Call Data to Proxy Scripts
Alternatively, you can initiate a call between two people and forward the call data to the Twilio-Deepgram proxy (as seen in the Twilio + Deepgram demo) using the script in twilio/twilio-api-scripts/stream.py:
# twilio helper library
from twilio.rest import Client
# other imports
import time
import requests
import json
import os
import uuid
# your account sid and auth token from twilio.com/console
account_sid = os.environ['TWILIO_ACCOUNT_SID']
auth_token = os.environ['TWILIO_AUTH_TOKEN']
# the twilio client
client = Client(account_sid, auth_token)
# make the outgoing call
call = client.calls.create(
    twiml='<Response><Start><Stream url="wss://url.to.deepgram.twilio.proxy" track="both_tracks" /></Start><Dial>+11231231234</Dial></Response>',  # replace the URL, and replace the number with person B's
    to='+11231231234',  # person A
    from_='+11231231234'  # your Twilio number
)
- Be sure to set the TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN environment variables to your Twilio account information.
- Replace the Stream url attribute with the URL of the Deepgram-Twilio proxy server, and the Dial number with person B's phone number.
- Replace the to and from_ arguments with person A's phone number and your Twilio voice number, respectively.
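Since the script takes three phone numbers, a quick format check before placing the call can catch copy-and-paste mistakes. The sketch below assumes the numbers should be in E.164 form (a plus sign followed by up to 15 digits, as in the sample values). The is_e164 helper is hypothetical and only checks the shape of the number, not whether it exists.

```python
import re

# A plus sign, a non-zero leading digit, then up to 14 more digits.
E164 = re.compile(r"\+[1-9][0-9]{1,14}")


def is_e164(number: str) -> bool:
    """Return True if the string looks like an E.164 phone number."""
    return E164.fullmatch(number) is not None


for label, number in {"to": "+11231231234", "from_": "1231231234"}.items():
    print(label, number, "ok" if is_e164(number) else "not E.164")
```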
Docker Image
We provide the Docker image deepgram/twilio-proxy:beta, which you can request from your Account Executive.
Before You Begin
Create a Deepgram Account
Before you can use Deepgram products, you'll need to create a Deepgram account. Signup is free and includes:
$200 in credit, which gives you access to:
- all base models
- pre-recorded and streaming functionality
- all features
Create a Twilio Account
Before you can use Twilio products, you'll need to create a Twilio account. In addition, if you don't currently own a Twilio phone number with Voice functionality, you'll need to purchase one.
Run the Docker Image
We recommend running the Twilio-proxy server in a Docker container configured using a Docker Compose file. We provide the following sample Compose files to show you how to do this for a Deepgram API using either our On-premise or Deepgram Hosted deployment models.
On-Premise
To deploy the Twilio proxy server to an On-Premise installation using Docker Compose, you can use the following sample Compose file.
- This sample references api.toml and engine.toml, which are used to configure an On-Premise Deepgram API and Engine. To learn more, talk to your Account Executive.
- This sample references the deepgram/actix-ws-echo:beta Docker image, which is a WebSockets echo server. You can use it to see streaming Deepgram ASR responses, or you can configure a callback URL to send Deepgram ASR responses elsewhere.
version: '2.4'
services:
  api:
    image: deepgram/onprem-api:1.32.9
    volumes:
      - '/path/to/api.toml:/api.toml:ro'
    command: -vvv serve /api.toml
  engine:
    image: deepgram/onprem-engine:3.9.1
    runtime: nvidia
    volumes:
      - '/path/to/engine.toml:/engine.toml:ro'
      - '/path/to/models:/models:ro'
    command: -v serve /engine.toml
  proxy:
    image: deepgram/twilio-proxy:beta
    ports:
      - '8080:8080'
    environment:
      - RUST_LOG=TRACE
      - PROXY_URL=0.0.0.0:8080
      - STEM_URL=ws://api:8080/v2/listen
      - CALLBACK_URL=ws://echo:8080/
    command: ''
  echo:
    image: deepgram/actix-ws-echo:beta
    environment:
      - ECHO_URL=0.0.0.0:8080
    command: ''
The Docker Compose file references the following environment variables:
Environment Variable | Description |
---|---|
RUST_LOG (optional) | Sets the logging verbosity. Can be TRACE, DEBUG, INFO, WARN, or ERROR. |
PROXY_URL | Sets the URL of the twilio-proxy server. Should follow the format 0.0.0.0:8080. |
STEM_URL | Sets the URL of the Deepgram endpoint. |
CALLBACK_URL (optional) | URL to which Deepgram ASR results should be sent. If not specified, Deepgram ASR results are logged by the twilio-proxy server. |
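One way to catch configuration mistakes early is to check these settings before starting the container. The sketch below mirrors the table: it treats the variables not marked optional as required and limits RUST_LOG to the listed values. The check_proxy_env helper and the ws:// or wss:// prefix check on STEM_URL are assumptions for illustration, not behavior of the proxy itself.

```python
LOG_LEVELS = {"TRACE", "DEBUG", "INFO", "WARN", "ERROR"}
REQUIRED = ("PROXY_URL", "STEM_URL")


def check_proxy_env(env: dict) -> list:
    """Return a list of problems; an empty list means the settings look sane."""
    problems = [f"{name} is required" for name in REQUIRED if not env.get(name)]
    level = env.get("RUST_LOG")
    if level is not None and level not in LOG_LEVELS:
        problems.append(f"RUST_LOG must be one of {sorted(LOG_LEVELS)}")
    stem = env.get("STEM_URL", "")
    # Assumption: the Deepgram endpoint is always reached over WebSockets.
    if stem and not stem.startswith(("ws://", "wss://")):
        problems.append("STEM_URL should be a ws:// or wss:// URL")
    return problems


print(check_proxy_env({
    "RUST_LOG": "TRACE",
    "PROXY_URL": "0.0.0.0:8080",
    "STEM_URL": "ws://api:8080/v2/listen",
}))
```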
Hosted
To deploy the Twilio proxy server to a Deepgram-Hosted installation using Docker Compose, you can use the following sample Compose file.
Note
This sample references the deepgram/actix-ws-echo:beta Docker image, which is a WebSockets echo server. You can use it to see streaming Deepgram ASR responses, or you can configure a callback URL to send Deepgram ASR responses elsewhere.
version: '2.4'
services:
  proxy:
    image: deepgram/twilio-proxy:beta
    ports:
      - '8080:8080'
    environment:
      - RUST_LOG=TRACE
      - PROXY_URL=0.0.0.0:8080
      - STEM_URL=wss://api.deepgram.com/v1/listen
      - STEM_BAUTH=YOUR_DEEPGRAM_API_KEY
      - CALLBACK_URL=ws://echo:8080/
      - CALLBACK_BAUTH=base64-encoded-callback-username:password
    command: ''
  echo:
    image: deepgram/actix-ws-echo:beta
    environment:
      - ECHO_URL=0.0.0.0:8080
    command: ''
Environment Variable | Description |
---|---|
RUST_LOG (optional) | Sets the logging verbosity. Can be TRACE, DEBUG, INFO, WARN, or ERROR. |
PROXY_URL | Sets the URL of the twilio-proxy server. Should follow the format 0.0.0.0:8080. |
STEM_URL | Sets the URL of the Deepgram endpoint. |
STEM_BAUTH (optional) | Your Deepgram project's API Key. This is the value stored in key. |
CALLBACK_URL (optional) | URL to which Deepgram ASR results should be sent. If not specified, Deepgram ASR results are logged by the twilio-proxy server. |
CALLBACK_BAUTH (optional) | If using a callback server to receive Deepgram ASR results, the base64-encoded value of username:password for that server. |
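To produce the CALLBACK_BAUTH value, you can base64-encode the username:password pair for your callback server. A minimal sketch follows; basic_auth_value is a hypothetical helper name, and the credentials shown are placeholders.

```python
import base64


def basic_auth_value(username: str, password: str) -> str:
    """Return the base64 encoding of 'username:password'."""
    raw = f"{username}:{password}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


print(basic_auth_value("username", "password"))  # dXNlcm5hbWU6cGFzc3dvcmQ=
```

Note that base64 is an encoding, not encryption, so treat the resulting value as a secret.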
Forward Data to Proxy
Finally, you will need to forward data to the Twilio-Deepgram proxy. You can do this by configuring Twilio to send WebSockets data to the server running the Twilio-Deepgram proxy or by initiating a call between two people and directly forwarding the data to the Twilio-Deepgram proxy.
Configure Twilio to Use WebSockets
To use the Docker image, you must configure Twilio to forward data to the server running the Rust-based proxy. To do this, see the Start Streaming Audio section of Twilio’s tutorial: Consume a real-time Media Stream using WebSockets, Python, and Flask. In this tutorial, you will use TwiML Bins, a serverless solution that helps you provide Twilio-hosted instructions to your Twilio applications, to begin streaming your call's audio.
When you call your Twilio number, the call is forwarded to the number you set in your TwiML Bin. The conversation is then forked to the Twilio-Deepgram proxy app, which sends the audio to Deepgram, receives transcriptions, and prints them to the screen. In a real implementation, you will likely want to provide a callback to which transcriptions can be sent.
Sample TwiML Bin files are as follows:
Mono
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Start>
    <Stream url="wss://my-server-address" />
  </Start>
  <Dial>my-phone-number</Dial>
</Response>
Stereo
For stereo, the Stream element takes an additional track parameter.
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Start>
    <Stream url="wss://my-server-address" track="both_tracks" />
  </Start>
  <Dial>my-phone-number</Dial>
</Response>
Send Call Data to Twilio-Deepgram Proxy
Alternatively, you can initiate a call between two people and forward the call data to the Twilio-Deepgram proxy (as seen in the Twilio + Deepgram demo) using the script in twilio/twilio-api-scripts/stream.py:
# twilio helper library
from twilio.rest import Client
# other imports
import time
import requests
import json
import os
import uuid
# your account sid and auth token from twilio.com/console
account_sid = os.environ['TWILIO_ACCOUNT_SID']
auth_token = os.environ['TWILIO_AUTH_TOKEN']
# the twilio client
client = Client(account_sid, auth_token)
# make the outgoing call
call = client.calls.create(
    twiml='<Response><Start><Stream url="wss://url.to.deepgram.twilio.proxy" track="both_tracks" /></Start><Dial>+11231231234</Dial></Response>',  # replace the URL, and replace the number with person B's
    to='+11231231234',  # person A
    from_='+11231231234'  # your Twilio number
)
- Be sure to set the TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN environment variables to your Twilio account information.
- Replace the Stream url attribute with the URL of the Deepgram-Twilio proxy server, and the Dial number with person B's phone number.
- Replace the to and from_ arguments with person A's phone number and your Twilio voice number, respectively.