Encoding

Encoding allows you to specify the expected encoding of your submitted audio.

encoding string

Availability: Pre-recorded, Streaming (Nova), Streaming (Flux). All available languages.

Encoding is required when raw, headerless audio packets are sent to the streaming service. If containerized audio packets are sent to the streaming service, this feature should not be used.

If you are using the Encoding feature, the Sample Rate feature is also required.
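The pairing rule above can be sketched as a small shell helper. This is a minimal illustration, not part of any Deepgram tooling: the function name `build_raw_audio_query` is hypothetical, and the sketch only assumes that both values travel as query-string parameters, as in the request shown on this page.

```shell
# Hypothetical helper: build the query string for a raw-audio request.
# It refuses to emit encoding without sample_rate (and vice versa),
# mirroring the rule that the two parameters are used together.
build_raw_audio_query() {
  if [ -z "${1:-}" ] || [ -z "${2:-}" ]; then
    echo "encoding and sample_rate must be set together" >&2
    return 1
  fi
  printf 'encoding=%s&sample_rate=%s\n' "$1" "$2"
}

# Usage: only build the URL if the helper succeeded.
if query="$(build_raw_audio_query linear16 8000)"; then
  echo "https://api.deepgram.com/v1/listen?${query}"
fi
```

For containerized audio, the helper would simply not be called, since neither parameter should be sent in that case.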

Enable Feature

To enable Encoding, add an encoding parameter to the query string when you call Deepgram’s API, and set it to the audio coding algorithm of your submitted audio:

encoding=OPTION

cURL
curl \
  --request POST \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'Content-Type: audio/mp3' \
  --data-binary @youraudio.mp3 \
  --url 'https://api.deepgram.com/v1/listen?sample_rate=8000&encoding=linear16'

Flux supports linear16, linear32, mulaw, alaw, opus, and ogg-opus for non-containerized/raw audio. Flux also supports containerized formats: linear16 in WAV containers, opus in Ogg containers, and opus in WebM containers (omit the encoding parameter for containerized audio).

Deepgram supports the following audio coding algorithms:

  • linear16: 16-bit, little-endian, signed PCM WAV data
  • linear32: 32-bit, little-endian, floating-point PCM WAV data
  • flac: Free Lossless Audio Codec (FLAC) encoded data
  • alaw: A-law encoded WAV data
  • mulaw: Mu-law encoded WAV data
  • amr-nb: Adaptive Multi-Rate (AMR) narrowband codec
  • amr-wb: Adaptive Multi-Rate (AMR) wideband codec
  • opus: The Opus audio codec
  • ogg-opus: The Opus audio codec encapsulated in the Ogg container format
  • speex: An open-source, speech-specific audio codec
  • g729: G.729 low-bandwidth codec (the encoding parameter is required for both raw and containerized audio)
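
Because the encoding value must match one of the names above exactly, it can help to check it locally before sending a request. The following is a rough sketch only: the function names `is_listed_encoding` and `is_flux_raw_encoding` are hypothetical, and the accepted values are copied from the lists on this page, so they would need updating if the supported set changes.

```shell
# Hypothetical check: succeeds if the name is one of the encoding
# values listed on this page.
is_listed_encoding() {
  case "${1:-}" in
    linear16|linear32|flac|alaw|mulaw|amr-nb|amr-wb|opus|ogg-opus|speex|g729)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Hypothetical check: succeeds if the name is in the subset this page
# lists for raw audio sent to Flux.
is_flux_raw_encoding() {
  case "${1:-}" in
    linear16|linear32|mulaw|alaw|opus|ogg-opus)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Usage
if is_listed_encoding "mulaw"; then
  echo "mulaw: listed encoding value"
fi
if ! is_flux_raw_encoding "flac"; then
  echo "flac: not listed for raw Flux audio"
fi
```

A `case` statement keeps the check portable across POSIX shells and makes the accepted values easy to audit against the list.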