Using Deepgram’s fully hosted Whisper Cloud instead of running your own version provides many benefits. Some of these benefits include:
Deepgram hosts and maintains these Whisper models; they aren’t hosted or run by OpenAI. Therefore, data sent through API requests for our Whisper models will not be sent to OpenAI.
Live streaming is not available with Deepgram Whisper Cloud. If you would like to transcribe live streamed audio, we recommend using our Nova-3 model. This guide can help you get started.
In this guide, you’ll learn how to transcribe pre-recorded audio using Deepgram’s hosted Whisper API.
Before you can use Deepgram, you’ll need to create a Deepgram account. Signup is free and includes $200 in free credit and access to all of Deepgram’s features!
Before you start, follow the steps in the Make Your First API Request guide to obtain a Deepgram API key and, if you plan to use a Deepgram SDK, to configure your environment.
Transcribe a remote file using Deepgram’s Whisper API with the following request.
If you would like to use a Deepgram SDK to make the request, follow the steps in the Pre-Recorded speech-to-text guide, but change the model to whisper.
Replace YOUR_DEEPGRAM_API_KEY with your Deepgram API Key.
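A request of this kind might look like the following Python sketch, which uses only the standard library. It assumes the pre-recorded endpoint is https://api.deepgram.com/v1/listen, that the key is sent in a Token authorization header, and that a remote file is passed as a JSON body with a url field. The helper name build_whisper_request and the example audio URL are placeholders for illustration, not part of any Deepgram SDK. The network call is left commented out so the request can be inspected first.

```python
import json
import urllib.request

# Assumed endpoint for pre-recorded transcription; confirm in the API reference.
DEEPGRAM_URL = "https://api.deepgram.com/v1/listen"


def build_whisper_request(api_key, audio_url, model="whisper"):
    """Build (but do not send) a POST request for a remote audio file."""
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{DEEPGRAM_URL}?model={model}",
        data=body,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_whisper_request(
    "YOUR_DEEPGRAM_API_KEY",
    "https://example.com/audio.wav",  # placeholder audio URL
)

# To send it, uncomment the lines below (needs a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```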
To enable Deepgram’s Whisper API, add a model parameter to the query string and set it to model=whisper.
To enable a specific size of the Whisper model, set the model parameter to model=whisper-size, where size is the name of the model size.
If model=whisper is supplied and no model size is specified, the model size defaults to model=whisper-medium.
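The naming rule above can be sketched as a small Python helper. The function names whisper_model_param and effective_model are hypothetical and exist only for this illustration. The size names are the ones this guide lists, and the fallback to whisper-medium mirrors the default described above rather than querying the API.

```python
# Sizes named in this guide; plain "whisper" falls back to the medium size.
WHISPER_SIZES = ("tiny", "base", "small", "medium", "large")
DEFAULT_SIZE = "medium"


def whisper_model_param(size=None):
    """Return the value for the model query parameter."""
    if size is None:
        return "whisper"
    if size not in WHISPER_SIZES:
        raise ValueError(f"unknown Whisper size: {size!r}")
    return f"whisper-{size}"


def effective_model(model):
    """Mirror the documented default: plain "whisper" means whisper-medium."""
    return f"whisper-{DEFAULT_SIZE}" if model == "whisper" else model
```

Validating the size locally is a design choice of this sketch: it catches a typo such as whisper-larg before a request is sent.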
These are the Deepgram Whisper Cloud models available:
model=whisper (defaults to whisper-medium)
model=whisper-tiny
model=whisper-base
model=whisper-small
model=whisper-medium
model=whisper-large (defaults to large-v2)
Deepgram Whisper Cloud supports language detection, which means just by setting detect_language=true, your audio will be transcribed in the detected language.
Officially supported languages include Afrikaans, Arabic, Armenian, Azerbaijani, Belarusian, Bosnian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Kazakh, Korean, Latvian, Lithuanian, Macedonian, Malay, Marathi, Maori, Nepali, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese, and Welsh. (Source: “Whisper API FAQ”)
Languages supported by Whisper include: en, zh, de, es, ru, ko, fr, ja, pt, tr, pl, ca, nl, ar, sv, it, id, hi, fi, vi, he, uk, el, ms, cs, ro, da, hu, ta, no, th, ur, hr, bg, lt, la, mi, ml, cy, sk, te, fa, lv, bn, sr, az, sl, kn, et, mk, br, eu, is, hy, ne, mn, bs, kk, sq, sw, gl, mr, pa, si, km, sn, yo, so, af, oc, ka, be, tg, sd, gu, am, yi, lo, uz, fo, ht, ps, tk, nn, mt, sa, lb, my, bo, tl, mg, as, tt, haw, ln, ha, ba, jw, su.
To transcribe audio in a specific language, set the language parameter in the query string to any language code supported by Whisper. To learn more about languages, see Language.
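As a rough illustration, the Python sketch below builds a query string that either pins a language or turns on language detection. The helper name whisper_query is hypothetical. Treating the two options as mutually exclusive is a choice made in this sketch, not documented API behavior.

```python
from urllib.parse import urlencode


def whisper_query(model="whisper", language=None, detect_language=False):
    """Build a query string with either a fixed language or detection."""
    if language and detect_language:
        # Assumption of this sketch: pick one option, not both.
        raise ValueError("choose either language or detect_language")
    params = {"model": model}
    if language:
        params["language"] = language
    if detect_language:
        params["detect_language"] = "true"
    return urlencode(params)


fixed = whisper_query(language="es")
detected = whisper_query(model="whisper-large", detect_language=True)
```

Here fixed requests Spanish with the default model, while detected asks the large model to detect the language.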
This is a list of Deepgram Features and their current status for use with Deepgram Whisper Cloud:
Deepgram Whisper Cloud is a fully managed API that gives you access to Deepgram’s version of OpenAI’s Whisper model.