Before you start streaming audio to Deepgram, it’s important that you understand whether your audio is containerized or raw, so you can correctly form your API request.
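The difference between containerized and raw audio is how much information about the audio is included within the data itself. Containerized audio carries a header that describes the stream, such as its encoding and sample rate, while raw audio carries no such header.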
If you’re streaming raw audio to Deepgram, you must provide the encoding and sample rate of your audio stream in your request. Otherwise, Deepgram will be unable to decode the audio and will fail to return a transcript.
An example of a Deepgram API request to stream raw audio:
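As a minimal sketch in Python, the request URL for raw audio could be assembled as follows. It assumes Deepgram's live streaming endpoint is `wss://api.deepgram.com/v1/listen` and that the format is declared through the `encoding` and `sample_rate` query parameters. The helper name `raw_audio_url` and the example values are illustrative only, so substitute the values that match your own stream.

```python
from urllib.parse import urlencode

# Assumed live streaming endpoint; confirm it against the API reference.
DEEPGRAM_LISTEN_URL = "wss://api.deepgram.com/v1/listen"


def raw_audio_url(encoding: str, sample_rate: int) -> str:
    """Build a streaming URL that declares the raw audio's format."""
    query = urlencode({"encoding": encoding, "sample_rate": sample_rate})
    return f"{DEEPGRAM_LISTEN_URL}?{query}"


# Example values only: 16-bit linear PCM sampled at 16 kHz.
url = raw_audio_url("linear16", 16000)
print(url)
```

The resulting URL is what a WebSocket client would connect to before sending audio bytes. Authentication is left out of this sketch.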
To see a list of raw audio encodings that Deepgram supports, check out our Encoding documentation.
If you’re streaming containerized audio to Deepgram, you should not set the encoding and sample rate of your audio stream. Instead, Deepgram will read the container’s header and get the correct information for your stream automatically.
An example of a Deepgram API request to stream containerized audio:
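For containerized audio, a comparable Python sketch simply leaves the format parameters out. It assumes the same `wss://api.deepgram.com/v1/listen` endpoint. The helper `containerized_audio_url` is hypothetical, and the guard against `encoding` and `sample_rate` is a design choice of this sketch rather than anything the API enforces.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

# Assumed live streaming endpoint; confirm it against the API reference.
DEEPGRAM_LISTEN_URL = "wss://api.deepgram.com/v1/listen"


def containerized_audio_url(params: Optional[Dict[str, str]] = None) -> str:
    """Build a streaming URL that leaves format detection to the container header."""
    params = dict(params or {})
    # The container header already describes the stream, so reject manual overrides.
    conflicting = {"encoding", "sample_rate"} & params.keys()
    if conflicting:
        raise ValueError(f"do not set {sorted(conflicting)} for containerized audio")
    query = urlencode(params)
    return f"{DEEPGRAM_LISTEN_URL}?{query}" if query else DEEPGRAM_LISTEN_URL


print(containerized_audio_url())
```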
Deepgram supports over 100 different audio formats and encodings. You can see some of the most popular ones at Supported Audio Format.
If you’re not sure whether your audio is raw or containerized, you can identify the audio format in a few different ways.
Start by checking any available documentation for your audio source. Often, it will provide details related to audio format. Specifically, check for any mentions of encodings like Opus, Vorbis, PCM, mu-law, A-law, s16, or linear16.
If your audio source is a web API stream, in many cases it will already be containerized. For example, the audio may be raw Opus audio wrapped in an Ogg container or raw PCM audio wrapped in a WAV container.
If you’re still not sure whether or not your audio is containerized, you can write an audio stream to disk and try listening to it with a program like VLC. If your audio is containerized, VLC will be able to play it back without any additional configuration.
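Once the stream is on disk, a quick programmatic check can complement the playback test. The Python sketch below looks at the first bytes of the file for the signatures of a few common containers (WAV, Ogg, FLAC, and Matroska/WebM). The function names are hypothetical and the list of signatures is deliberately short, so a result of `None` suggests raw audio but does not prove it.

```python
from typing import Optional


def detect_container(header: bytes) -> Optional[str]:
    """Guess the container from the leading bytes of a file, or return None."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"OggS":
        return "ogg"
    if header[:4] == b"fLaC":
        return "flac"
    if header[:4] == b"\x1a\x45\xdf\xa3":  # EBML magic used by Matroska and WebM
        return "matroska/webm"
    return None


def detect_container_in_file(path: str) -> Optional[str]:
    """Read the first 12 bytes of the file at `path` and classify them."""
    with open(path, "rb") as audio_file:
        return detect_container(audio_file.read(12))
```

Calling `detect_container_in_file` on the file you saved returns a short label for a recognized container, or `None` when no known signature is found.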
Alternatively, you can use ffprobe to gather information from the audio stream and detect the audio format of a file. ffprobe is part of the ffmpeg package, a cross-platform solution that records, converts, and streams audio and video.
To use ffprobe, run it from a terminal against the audio file you want to inspect.
The last line of the output from this command will include any data ffprobe is able to determine about the file’s audio format.
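If you would rather script this check, the following Python sketch wraps ffprobe with `subprocess` and returns the last line of its report. It assumes ffprobe is installed and on your `PATH`. The helper names are hypothetical, and `-hide_banner` is added only to suppress the version banner.

```python
import shutil
import subprocess
from typing import List


def ffprobe_command(path: str) -> List[str]:
    """Return the ffprobe invocation for one file, without the version banner."""
    return ["ffprobe", "-hide_banner", path]


def probe_audio(path: str) -> str:
    """Run ffprobe on `path` and return the last non-empty line it reports."""
    if shutil.which("ffprobe") is None:
        raise RuntimeError("ffprobe was not found on PATH; install ffmpeg first")
    result = subprocess.run(ffprobe_command(path), capture_output=True, text=True)
    # ffprobe writes its human-readable report to stderr, not stdout.
    lines = [line.strip() for line in result.stderr.splitlines() if line.strip()]
    return lines[-1] if lines else ""
```

For a containerized file, the returned line typically describes the audio stream, including its codec and sample rate. For raw audio, ffprobe usually reports that it could not interpret the input.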
When using raw audio, make sure to set the encoding and the sample rate. Both parameters are required for Deepgram to be able to decode your stream.