Streaming audio from Deepgram Aura Text-to-Speech (TTS) into an ongoing Twilio phone call requires the use of the Twilio streaming API.
Before you can use Deepgram, you’ll need to create a Deepgram account. Signup is free and includes $200 in free credit and access to all of Deepgram’s features!
Before you start, you’ll need to follow the steps in the Make Your First API Request guide to obtain a Deepgram API key, and configure your environment if you are choosing to use a Deepgram SDK.
For the complete code used in this guide, please check out this repository.
You will need:
First, you will need to set up a TwiML Bin. You can refer to the docs on how to do that in the Twilio Console.
Deepgram Aura TTS is not available via the Twilio <Say> verb. Instead, you will use a URL.
Replace the URL with the address where you deploy the server we are about to create, and make sure /twilio is at the end of the URL.
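As a minimal sketch of what such a TwiML document can look like, the Python snippet below builds one and prints it so that the output can be pasted into a TwiML Bin. The helper name `build_twiml` and the ngrok hostname are hypothetical placeholders, and the document is deliberately reduced to the `<Connect>` and `<Stream>` elements.

```python
import xml.etree.ElementTree as ET


def build_twiml(stream_url: str) -> str:
    """Return TwiML that connects the call to a bidirectional media stream."""
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    # <Stream> points Twilio at the websocket server that will send Aura audio.
    ET.SubElement(connect, "Stream", url=stream_url)
    ET.indent(response)  # pretty-print; requires Python 3.9+
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical ngrok hostname; replace it with your own server's address.
twiml = build_twiml("wss://your-subdomain.ngrok-free.app/twilio")
print(twiml)
```

The only part you would normally change is the `url` attribute of `<Stream>`.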
In the TwiML Bin example above, ngrok is used to expose the server running locally.
ngrok is recommended for quick development and testing but shouldn’t be used for production instances.
To use ngrok see their documentation.
When you start the ngrok server, be sure the port matches the one used in the provided server code by running this command.
If you restart your ngrok server, your URL will change, and you will need to update your TwiML Bin.
Your TwiML Bin must then be connected to one of your Twilio phone numbers so that it gets executed whenever someone calls that number. If you need to set up a phone number and connect it to your TwiML Bin, refer to the Twilio Docs.
In your TwiML Bin, the <Connect> verb is required for bi-directional communication; that is, you must use this verb in order to send audio from Aura TTS to Twilio.
Copy the twilio.py code from the repository and save it locally with the file name twilio.py, as we will use it in the steps below.
At this point you’ll want to start up a virtual environment for Python. Please refer to documentation for how to do that based on your personal Python preferences.
Depending on your situation you may also need to install specific packages used in this code.
If your TwiML Bin is set up correctly, you can now navigate to this file’s location in your terminal and run the server with the following command:
OR
You can then start making calls to the phone number your TwiML Bin is using. Without any further modifications, you should hear Deepgram Aura say simply: “Hello, how are you today?”
Let’s dive into the code used in the twilio.py file.
First, we have some import statements:
asyncio and websockets to build an asynchronous websocket server.
base64 to handle encoding audio from Aura to pass data to Twilio.
json to deal with parsing text messages from Twilio.
requests to make HTTP requests to Deepgram’s Aura/TTS endpoint.
Next we have:
streamsid_queue is used to pass the stream sid from the twilio_receiver task to the twilio_sender task.
The stream sid is needed to ensure that audio from Deepgram Aura is routed correctly to the corresponding phone call.
The twilio_receiver task is defined next:
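The receiver can be sketched roughly as follows. This is not the repository's exact code: here the queue is passed in as an argument, and `FakeTwilioSocket` is a hypothetical stand-in for a real websocket connection so that the example runs without a phone call. The message shapes follow Twilio's media stream events (`connected`, `start`, `media`, `stop`).

```python
import asyncio
import json


async def twilio_receiver(twilio_ws, streamsid_queue: asyncio.Queue) -> None:
    """Read Twilio messages and hand the stream sid to the sender task."""
    async for message in twilio_ws:
        data = json.loads(message)
        if data["event"] == "start":
            # The "start" message carries the sid that identifies this call's stream.
            streamsid_queue.put_nowait(data["start"]["streamSid"])
        elif data["event"] == "stop":
            break


class FakeTwilioSocket:
    """Hypothetical stand-in for a websocket connection, for local testing."""

    def __init__(self, messages):
        self._messages = list(messages)

    async def __aiter__(self):
        for message in self._messages:
            yield message


async def demo() -> str:
    queue = asyncio.Queue()
    socket = FakeTwilioSocket([
        json.dumps({"event": "connected", "protocol": "Call", "version": "1.0.0"}),
        json.dumps({"event": "start", "streamSid": "MZ123", "start": {"streamSid": "MZ123"}}),
        json.dumps({"event": "stop", "streamSid": "MZ123"}),
    ])
    await twilio_receiver(socket, queue)
    return await queue.get()


print(asyncio.run(demo()))  # prints MZ123
```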
Next we have the twilio_sender task:
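A sketch of what the sender might look like is shown below, under a few assumptions: the `aura-asteria-en` model is just one possible voice, `fetch_aura_audio` and the `synthesize` parameter are hypothetical names introduced here so the task can be exercised without network access, and the queue is passed in explicitly. The request asks Deepgram for raw 8 kHz mu-law audio, which is the format Twilio media streams expect.

```python
import asyncio
import base64
import json
from collections.abc import Callable

DEEPGRAM_API_KEY = "YOUR_DEEPGRAM_API_KEY"

# Raw mu-law at 8 kHz with no container, matching Twilio's media stream format.
DEEPGRAM_TTS_URL = (
    "https://api.deepgram.com/v1/speak"
    "?model=aura-asteria-en&encoding=mulaw&sample_rate=8000&container=none"
)


def fetch_aura_audio(text: str) -> bytes:
    """Request raw mu-law speech for `text` from Deepgram Aura."""
    import requests  # third-party; imported here so the rest loads without it

    headers = {
        "Authorization": f"Token {DEEPGRAM_API_KEY}",
        "Content-Type": "application/json",
    }
    response = requests.post(DEEPGRAM_TTS_URL, headers=headers, json={"text": text})
    response.raise_for_status()
    return response.content


async def twilio_sender(
    twilio_ws,
    streamsid_queue: asyncio.Queue,
    synthesize: Callable[[str], bytes] = fetch_aura_audio,
) -> None:
    """Wait for the stream sid, synthesize a greeting, and send it to the call."""
    streamsid = await streamsid_queue.get()
    # requests is blocking, so run it in a worker thread to keep the event loop free.
    raw_mulaw = await asyncio.to_thread(synthesize, "Hello, how are you today?")
    media_message = {
        "event": "media",
        "streamSid": streamsid,
        "media": {"payload": base64.b64encode(raw_mulaw).decode("ascii")},
    }
    await twilio_ws.send(json.dumps(media_message))
```

Passing a different `synthesize` callable is only a convenience for testing; in normal use the default makes the HTTP request to Deepgram.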
Replace YOUR_DEEPGRAM_API_KEY with your Deepgram API Key.
Next we have:
Additionally, if your application requires the bot to stop speaking at any point, you can do that by sending a “clear” message to Twilio.
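As a small sketch of the “clear” message, assuming you already hold the stream sid for the call: the helper names `build_clear_message` and `stop_speaking` are hypothetical, while the message shape follows Twilio's media stream format.

```python
import json


def build_clear_message(streamsid: str) -> str:
    """Build the JSON text that asks Twilio to drop any audio it has buffered."""
    return json.dumps({"event": "clear", "streamSid": streamsid})


async def stop_speaking(twilio_ws, streamsid: str) -> None:
    """Interrupt playback on the call identified by `streamsid`."""
    await twilio_ws.send(build_clear_message(streamsid))
```

You would typically call `stop_speaking` from whichever task detects that the caller has started talking.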
To close out our websocket handler, we run these two asynchronous tasks with asyncio:
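To illustrate the pattern, here is a reduced sketch that runs two coroutines concurrently and hands a value from one to the other through a queue. `stub_receiver` and `stub_sender` are hypothetical stand-ins for the real tasks, and `asyncio.gather` is one way to run them together; the repository's code may use a different asyncio call.

```python
import asyncio


async def stub_receiver(queue: asyncio.Queue) -> str:
    """Stand-in for twilio_receiver: publishes a stream sid once it is known."""
    await asyncio.sleep(0)  # yield control, as a real websocket read would
    queue.put_nowait("MZ123")
    return "receiver done"


async def stub_sender(queue: asyncio.Queue) -> str:
    """Stand-in for twilio_sender: blocks until the stream sid arrives."""
    streamsid = await queue.get()
    return f"sender got {streamsid}"


async def handler() -> list:
    queue = asyncio.Queue()
    # Both coroutines run concurrently; results come back in argument order.
    return await asyncio.gather(stub_receiver(queue), stub_sender(queue))


results = asyncio.run(handler())
print(results)  # ['receiver done', 'sender got MZ123']
```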
Finally, as scaffolding to spin up the server and route incoming requests to the function above, we have:
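A rough sketch of such scaffolding follows. It assumes a recent release of the websockets library in which handlers take a single connection argument, and port 5000 is an assumption that must match the port ngrok forwards to. `connection_path` is a hypothetical helper that reads the request path from either the newer or the legacy connection object, and `twilio_handler` is only a placeholder here.

```python
import asyncio


def connection_path(websocket) -> str:
    """Return the request path from a newer or legacy websockets connection."""
    request = getattr(websocket, "request", None)
    if request is not None:
        return request.path
    return getattr(websocket, "path", "")


async def twilio_handler(websocket) -> None:
    """Placeholder for the handler that runs the receiver and sender tasks."""


async def router(websocket) -> None:
    """Send /twilio connections to the handler and close anything else."""
    if connection_path(websocket) == "/twilio":
        await twilio_handler(websocket)
    else:
        await websocket.close()


async def main() -> None:
    import websockets  # third-party dependency

    # Assumed host and port; keep them in sync with your ngrok command.
    async with websockets.serve(router, "localhost", 5000):
        await asyncio.Future()  # run until the process is interrupted
```

Calling `asyncio.run(main())` starts the server; it is left out of the block so the helpers can be imported without opening a socket.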
To learn more about sending Twilio phone call audio to Deepgram for Speech-to-Text (STT) see the following guide.
What’s Next