Build a Voice Agent with Twilio & OpenAI & Deepgram
Build a low-latency phone voice agent that pipes Twilio audio through Deepgram speech-to-text, OpenAI for the LLM, and Deepgram text-to-speech.
Twilio handles the phone call. Deepgram handles speech-to-text and text-to-speech. OpenAI handles the LLM. Together they form a streaming voice agent that answers an inbound call, listens, thinks, and talks back. This guide walks through the working sample server end to end.
This guide assumes basic JavaScript and Node.js knowledge and familiarity with OpenAI, Twilio, and ngrok.
The full sample lives in the deepgram-twilio-streaming-voice-agent repository.
Create a Deepgram account first. Signup is free and includes $200 in free credit.
Create a Deepgram API key and keep it handy. You will export it as an environment variable later.
This demo uses Twilio Voice to start a phone call that the server records and transcribes. Sign up for a Twilio account, then grab the Account SID and Auth Token from your Twilio Admin Dashboard.
The agent uses OpenAI for the LLM. Sign up for an OpenAI account and create an API key.
You will build a Node.js server that wires together six streaming components:
The implementation is a working reference, not a production deployment. Use it as a starting point for your own application logic.
Read the server.js file in the repository to see the full server-side implementation.
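Before reading server.js, it can help to see the shape of the messages Twilio sends over the WebSocket. The sketch below is not taken from the repository. It assumes the standard Twilio Media Streams JSON frames (`connected`, `start`, `media`, `stop`), and the function names `handleTwilioMessage` and `buildOutboundMedia` are hypothetical.

```javascript
// Hypothetical helpers for illustration; the repository's server.js may be
// structured differently.

// Parses one Twilio Media Streams text frame and describes what to do with it.
function handleTwilioMessage(raw, state) {
  const msg = JSON.parse(raw);
  switch (msg.event) {
    case "connected":
      return { type: "connected" };
    case "start":
      // Twilio sends the stream SID once; audio sent back must include it.
      state.streamSid = msg.start.streamSid;
      return { type: "start", streamSid: state.streamSid };
    case "media":
      // The payload is base64-encoded audio (8 kHz mu-law by default).
      return { type: "audio", audio: Buffer.from(msg.media.payload, "base64") };
    case "stop":
      return { type: "stop" };
    default:
      return { type: "ignored" };
  }
}

// Wraps synthesized audio in the frame format Twilio accepts on the way back.
function buildOutboundMedia(streamSid, audioBuffer) {
  return JSON.stringify({
    event: "media",
    streamSid,
    media: { payload: audioBuffer.toString("base64") },
  });
}
```

In a pipeline like the one this guide describes, the decoded `audio` buffers would be forwarded to speech-to-text, and text-to-speech output would go back to the caller through `buildOutboundMedia`.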
Export your API keys so the server can authenticate with OpenAI and Deepgram:
Verify they are set:
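The same check can also be done from Node before starting the server. This is a minimal sketch. The variable names `OPENAI_API_KEY` and `DEEPGRAM_API_KEY` are assumptions, so confirm them against the repository, and `missingKeys` is a hypothetical helper.

```javascript
// Assumed variable names; check the repository for the ones the sample reads.
const REQUIRED_KEYS = ["OPENAI_API_KEY", "DEEPGRAM_API_KEY"];

// Returns the names of required variables that are unset or blank.
function missingKeys(env, required = REQUIRED_KEYS) {
  return required.filter((name) => !env[name] || env[name].trim() === "");
}

// Report the result without exiting, so the check is safe to run anywhere.
const missing = missingKeys(process.env);
console.log(
  missing.length === 0
    ? "All API keys are set."
    : `Missing environment variables: ${missing.join(", ")}`
);
```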
Requires Node v12.1.0 or later.
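To confirm the running Node version meets that minimum, a small check like the following can help. It is a sketch that compares plain `major.minor.patch` numbers only, and `isAtLeast` is a hypothetical helper name.

```javascript
// Compares two "major.minor.patch" strings numerically, part by part.
function isAtLeast(version, minimum) {
  const a = version.split(".").map(Number);
  const b = minimum.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    if ((a[i] || 0) > (b[i] || 0)) return true;
    if ((a[i] || 0) < (b[i] || 0)) return false;
  }
  return true; // all three parts are equal
}

// process.versions.node has no leading "v", for example "18.17.0".
console.log(
  isAtLeast(process.versions.node, "12.1.0")
    ? `Node ${process.versions.node} is new enough.`
    : `Node ${process.versions.node} is older than 12.1.0.`
);
```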
ngrok exposes your local server so Twilio can reach it.
brew install ngrok/ngrok/ngrok
Sign up for an ngrok account, copy your authtoken from the ngrok dashboard, and connect the agent:
Use the Twilio CLI or the Twilio Admin Dashboard. The CLI version:
Then purchase a number (replace +123456789 with one from the list above):
Start ngrok in a separate terminal from the one running the server:
ngrok prints a forwarding URL on the Forwarding row. Copy it.
Edit templates/streams.xml and replace <ngrok url> with your ngrok host. Use wss:// and include /streams. For example: wss://yourdomain.ngrok-free.app/streams.
In your Twilio dashboard, open the active phone number. Under the Configure tab, set “A call comes in” to your TwiML URL: https://yourdomain.ngrok-free.app/twiml.
Restarting ngrok generates a new URL. Update the Twilio webhook every time.
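Because both URLs share the same ngrok host, deriving them from the forwarding URL reduces copy-and-paste mistakes after a restart. The sketch below uses only the `/streams` and `/twiml` paths named above. The helper names `toStreamUrl` and `toTwimlUrl` are hypothetical.

```javascript
// Hypothetical helpers: derive both URLs from one ngrok forwarding URL.

// WebSocket URL to place in templates/streams.xml.
function toStreamUrl(forwardingUrl) {
  const { host } = new URL(forwardingUrl);
  return `wss://${host}/streams`;
}

// Webhook URL to set under "A call comes in" in the Twilio dashboard.
function toTwimlUrl(forwardingUrl) {
  const { host } = new URL(forwardingUrl);
  return `https://${host}/twiml`;
}

const forwarding = "https://yourdomain.ngrok-free.app";
console.log(toStreamUrl(forwarding)); // wss://yourdomain.ngrok-free.app/streams
console.log(toTwimlUrl(forwarding)); // https://yourdomain.ngrok-free.app/twiml
```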
Dial the Twilio number from any phone, or trigger an outbound call from the CLI (replace +123456789 with your Twilio number, +19876543210 with the phone you want to call, and abcdef.ngrok.io with your ngrok host):
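As an alternative to the CLI, an outbound call can be created through Twilio's REST API (a POST to the account's `Calls.json` resource with `To`, `From`, and `Url` form fields). The sketch below only builds the request and does not send it. `buildCallRequest` is a hypothetical helper and not part of the sample repository.

```javascript
// Hypothetical helper: builds, but does not send, a Twilio "create call" request.
function buildCallRequest({ accountSid, authToken, from, to, ngrokHost }) {
  // Twilio fetches TwiML from this URL once the call is answered.
  const body = new URLSearchParams({
    From: from,
    To: to,
    Url: `https://${ngrokHost}/twiml`,
  });
  const credentials = Buffer.from(`${accountSid}:${authToken}`).toString("base64");
  return {
    url: `https://api.twilio.com/2010-04-01/Accounts/${accountSid}/Calls.json`,
    options: {
      method: "POST",
      headers: {
        Authorization: `Basic ${credentials}`,
        "Content-Type": "application/x-www-form-urlencoded",
      },
      body: body.toString(),
    },
  };
}
```

To place the call, pass the result to an HTTP client, for example `fetch(url, options)` on Node 18 or later, using the Account SID and Auth Token from your Twilio account.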