Getting Started with Speech to Text

With Deepgram's STT API, you can transcribe both pre-recorded files and real-time streams, choose models optimized for different domains, and integrate transcription directly into your apps for use cases like conversational AI, live captions, and agent assist.

Before you start, consider your use case. Deepgram STT offers three main paths:

Streaming audio

Real-time, turn-based transcription for voice agents

Benefits: Model-integrated end-of-turn detection, configurable turn-taking dynamics


Examples: Contact center agents, customer support bots, real-time assistants.




Currently available for English only. For other languages, use our general-purpose Streaming API.

Real-time transcription for meetings and events

Benefits: Transcripts delivered in real time, broader language availability, optional speaker diarization


Examples: Captions, live event transcription, monitoring audio feeds.



Pre-recorded audio

Pre-recorded file transcription

Benefits: Simple implementation, broader language availability, cost-efficient


Examples: Transcribing interviews, podcasts, meetings, support calls.
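
To give a feel for how small a pre-recorded integration can be, here is a minimal sketch in Python using only the standard library. It is not taken from this page: the endpoint `https://api.deepgram.com/v1/listen`, the `Token` authorization scheme, the `nova-3` and `smart_format` option values, and the response path used in `extract_transcript` are assumptions to verify against the API reference, and the helper names (`build_prerecorded_request`, `extract_transcript`, `transcribe_url`) are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed REST endpoint for pre-recorded transcription; confirm in the API reference.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_prerecorded_request(api_key, audio_url, **options):
    """Build (but do not send) a POST request asking for a hosted file to be transcribed.

    `options` become query parameters, e.g. model="nova-3" (illustrative values).
    """
    query = urllib.parse.urlencode(options)
    endpoint = f"{DEEPGRAM_LISTEN_URL}?{query}" if query else DEEPGRAM_LISTEN_URL
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


def extract_transcript(response_json):
    """Pull the first transcript out of a response, assuming the
    results -> channels -> alternatives layout."""
    return response_json["results"]["channels"][0]["alternatives"][0]["transcript"]


def transcribe_url(api_key, audio_url, **options):
    """Send the request and return the transcript text (performs a network call)."""
    request = build_prerecorded_request(api_key, audio_url, **options)
    with urllib.request.urlopen(request, timeout=60) as response:
        return extract_transcript(json.load(response))
```

Calling `transcribe_url(api_key, "https://example.com/interview.wav", model="nova-3", smart_format="true")` would submit one file and return its text. Keeping request construction separate from the network call makes the sketch easy to inspect and test without an API key.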

