Getting Started with Speech to Text

With Deepgram’s Speech to Text (STT) API, you can transcribe both pre-recorded files and real-time streams, choose models optimized for different domains, and integrate transcription directly into your apps for use cases like conversational AI, live captions, and agent assist.

Before you start, consider your use case. Deepgram STT offers three main paths:

Streaming audio

Real-time, turn-based transcription for voice agents

Benefits: Model-integrated end-of-turn detection, configurable turn-taking dynamics

Examples: Contact center agents, customer support bots, real-time assistants.

Get started

Real-time transcription for meetings and events

Benefits: Real-time transcripts, broader language availability, optional speaker diarization

Examples: Captions, live event transcription, monitoring audio feeds.

Get started

Pre-recorded audio

Pre-recorded file transcription

Benefits: Simple implementation, broader language availability, cost-efficient

Examples: Transcribing interviews, podcasts, meetings, support calls.

Get started
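
To make the pre-recorded path concrete, here is a minimal sketch of a transcription request for a hosted audio file, using only the Python standard library. Nothing in it comes from this page: the `/v1/listen` endpoint, the `Token` authorization scheme, the `nova-3` model name, the `smart_format` parameter, and the response shape are assumptions based on Deepgram's public REST API, and the helper names are hypothetical. Check the API reference before relying on any of them.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for pre-recorded transcription (not stated on this page).
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_prerecorded_request(api_key, audio_url, model="nova-3"):
    """Build (but do not send) a POST request asking Deepgram to transcribe a hosted file."""
    # Options are passed as query parameters; "nova-3" is an assumed model name.
    query = urllib.parse.urlencode({"model": model, "smart_format": "true"})
    # For hosted audio, the JSON body carries the URL of the file.
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{DEEPGRAM_LISTEN_URL}?{query}",
        data=body,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_transcript(response_json):
    """Pull the top transcript for the first channel out of the assumed response shape."""
    return response_json["results"]["channels"][0]["alternatives"][0]["transcript"]


def transcribe_url(api_key, audio_url):
    """Send the request and return the transcript text (performs a network call)."""
    request = build_prerecorded_request(api_key, audio_url)
    with urllib.request.urlopen(request) as response:
        return extract_transcript(json.load(response))
```

Calling `transcribe_url(api_key, "https://example.com/interview.wav")` with a valid key would return the transcript as a string. The streaming paths use a persistent connection instead of a single request, so they need a WebSocket client and are not covered by this sketch.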