Build a Voice Agent with Twilio & OpenAI & Deepgram
Build a low-latency phone voice agent that pipes Twilio audio through Deepgram speech-to-text, OpenAI for the LLM, and Deepgram text-to-speech.
Twilio handles the phone call. Deepgram handles speech-to-text and text-to-speech. OpenAI handles the LLM. Together they form a streaming voice agent that answers an inbound call, listens, thinks, and talks back. This guide walks through the working sample server end to end.
Before you begin
This guide assumes basic JavaScript and Node.js knowledge and familiarity with OpenAI, Twilio, and ngrok.
The full sample lives in the deepgram-twilio-streaming-voice-agent repository.
Get a Deepgram API key
Create a Deepgram account first. Signup is free and includes $200 in free credit.
Create a Deepgram API key and keep it handy. You will export it as an environment variable later.
Get Twilio credentials
This demo uses Twilio Voice to handle the phone call whose audio the server streams and transcribes. Sign up for a Twilio account, then grab the Account SID and Auth Token from your Twilio Admin Dashboard.
Get OpenAI credentials
The agent uses OpenAI for the LLM. Sign up for an OpenAI account and create an API key.
What you will build
A Node.js server that wires together six streaming components:
- A callable Twilio phone number
- Twilio inbound media stream (caller audio)
- Deepgram streaming speech-to-text
- Streaming OpenAI LLM
- Deepgram streaming text-to-speech
- Twilio outbound media stream (agent audio)
The implementation is a working reference, not a production deployment. Use it as a starting point for your own application logic.
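As a rough illustration of the two Twilio media-stream legs listed above, the sketch below decodes one inbound Media Streams message and builds one outbound message. The `event`, `streamSid`, and `media.payload` fields follow Twilio's documented Media Streams message format, in which audio travels as base64-encoded 8 kHz mu-law. The helper names `decodeInboundAudio` and `buildOutboundMedia` are hypothetical and are not taken from the sample's server.js.

```javascript
// Hypothetical helpers for illustration; not part of the sample repository.

// Returns the caller audio as a Buffer, or null for non-media events
// such as "connected", "start", "mark", or "stop".
function decodeInboundAudio(rawMessage) {
  const message = JSON.parse(rawMessage);
  if (message.event !== 'media' || !message.media) {
    return null;
  }
  return Buffer.from(message.media.payload, 'base64');
}

// Wraps agent audio in the JSON envelope Twilio expects on the same
// WebSocket. The audio is assumed to already be 8 kHz mu-law.
function buildOutboundMedia(streamSid, audioBuffer) {
  return JSON.stringify({
    event: 'media',
    streamSid: streamSid,
    media: { payload: audioBuffer.toString('base64') },
  });
}
```

In a server built this way, the decoded buffer would be forwarded to the speech-to-text connection, and synthesized audio would be sent back through `buildOutboundMedia` using the `streamSid` received in the stream's `start` event.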
Clone the repository
Set up the server
Read the server.js file in the repository to see the full server-side implementation.
Set environment variables
Export your API keys so the server can authenticate with OpenAI and Deepgram:
Verify they are set:
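If you prefer to check from Node.js, a small script along these lines can report which keys are missing. The variable names `OPENAI_API_KEY` and `DEEPGRAM_API_KEY` are assumptions here; substitute whatever names server.js actually reads. `findMissingKeys` is a hypothetical helper.

```javascript
// Assumed variable names; adjust to match what server.js expects.
const REQUIRED_KEYS = ['OPENAI_API_KEY', 'DEEPGRAM_API_KEY'];

// Returns the names that are unset or contain only whitespace.
function findMissingKeys(env, names) {
  return names.filter((name) => !env[name] || env[name].trim() === '');
}

const missingKeys = findMissingKeys(process.env, REQUIRED_KEYS);
if (missingKeys.length > 0) {
  console.error('Missing environment variables: ' + missingKeys.join(', '));
} else {
  console.log('All required API keys are set.');
}
```

The script deliberately prints only the variable names, never the key values, so its output is safe to paste into a bug report.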
Install and run
Requires Node v12.1.0 or later.
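One way to confirm the version requirement before starting the server is a short check like the sketch below. `meetsMinimumVersion` is a hypothetical helper, and it compares only the numeric major, minor, and patch parts.

```javascript
// Hypothetical helper: compares dotted version strings numerically.
function meetsMinimumVersion(current, minimum) {
  const currentParts = current.split('.').map(Number);
  const minimumParts = minimum.split('.').map(Number);
  for (let i = 0; i < 3; i += 1) {
    const left = currentParts[i] || 0;
    const right = minimumParts[i] || 0;
    if (left !== right) {
      return left > right;
    }
  }
  return true; // versions are equal
}

// process.versions.node has no leading "v", for example "18.17.0".
if (meetsMinimumVersion(process.versions.node, '12.1.0')) {
  console.log('Node ' + process.versions.node + ' is new enough.');
} else {
  console.error('Node ' + process.versions.node + ' is older than 12.1.0.');
}
```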
Set up the demo
Install ngrok
ngrok exposes your local server so Twilio can reach it.
- macOS: brew install ngrok/ngrok/ngrok
- Windows or Linux: follow the ngrok install instructions
Sign up for an ngrok account, copy your authtoken from the ngrok dashboard, and connect the agent:
Buy a Twilio phone number
Use the Twilio CLI or the Twilio Admin Dashboard. The CLI version:
Then purchase a number (replace +123456789 with one from the list above):
Point Twilio at your ngrok URL
Start ngrok in a separate terminal from the one running the server:
ngrok prints a forwarding URL on the Forwarding row. Copy it.
Edit templates/streams.xml and replace <ngrok url> with your ngrok host. Use wss:// and include /streams. For example: wss://yourdomain.ngrok-free.app/streams.
In your Twilio dashboard, open the active phone number. Under the Configure tab, set “A call comes in” to your TwiML URL: https://yourdomain.ngrok-free.app/twiml.
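Both values come from the same ngrok host and differ only in scheme and path. A minimal sketch of that mapping, using a hypothetical helper named `deriveTwilioUrls` and the `/streams` and `/twiml` paths given above:

```javascript
// Hypothetical helper: turns the ngrok forwarding URL into the two
// URLs this guide asks you to configure.
function deriveTwilioUrls(forwardingUrl) {
  const host = new URL(forwardingUrl).host;
  return {
    // Goes into templates/streams.xml in place of <ngrok url>.
    streamUrl: 'wss://' + host + '/streams',
    // Goes into the "A call comes in" field in the Twilio dashboard.
    twimlUrl: 'https://' + host + '/twiml',
  };
}

console.log(deriveTwilioUrls('https://yourdomain.ngrok-free.app'));
```

Parsing with `URL` rather than string slicing means a trailing slash or path copied along with the forwarding URL does not end up in the result.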
Restarting ngrok generates a new URL. Update both templates/streams.xml and the Twilio webhook every time.
Make the call
Dial the Twilio number from any phone, or trigger an outbound call from the CLI (replace +123456789 with your Twilio number, +19876543210 with the phone you want to call, and abcdef.ngrok.io with your ngrok host):
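As an alternative to the CLI, the same outbound call could be placed from Node.js. The sketch below assumes a client created with Twilio's Node helper library (`require('twilio')(accountSid, authToken)`), which the sample is not stated to depend on, so treat that as an extra install. `placeCall` is a hypothetical wrapper around the library's `calls.create` method, and the client is passed in so the function stays independent of how you load credentials.

```javascript
// Hypothetical wrapper. `client` is assumed to be a Twilio helper-library
// client, e.g. require('twilio')(accountSid, authToken).
function placeCall(client, options) {
  return client.calls.create({
    // Twilio fetches call instructions from the server's /twiml route.
    url: 'https://' + options.ngrokHost + '/twiml',
    to: options.to,     // the phone you want to call
    from: options.from, // your Twilio number
  });
}
```

A call such as `placeCall(client, { ngrokHost: 'abcdef.ngrok.io', to: '+19876543210', from: '+123456789' })` returns a promise; the placeholder values are the same ones used in the sentence above and must be replaced with your own.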