When using Deepgram's API, you have access to our Keywords and Search features, which are named similarly, but work in very different ways.
Keywords are words that you would like Deepgram to pay more attention to, so it can transcribe those words more accurately. If you send a list of keywords with your request to the Deepgram API, Deepgram can better understand the context of what it is transcribing because it can train itself to better identify those words.
Search also expects you to add a word or list of words to your request to the Deepgram API. But in this case, Deepgram looks for these words in the transcription, returning a response that includes the start time and end time of any matching words it identifies.
|Handles words that you want Deepgram to pay more attention to transcribing successfully. Providing Deepgram with a list of words gives the technology more context, so the ASR can transcribe the specified words more accurately.||Handles words that you want Deepgram to search for and provide a best guess at the start time and end time at which the word was said.|
|Cannot accept multiple-word phrases, such as "no problem", to be boosted as a unit; if you send words with URL-encoding, Deepgram will boost words individually.||Can accept multiple-word phrases, such as "no problem", but you must URL-encode the phrase.|
|Can accept up to 100 words.||Can accept up to 25 words.|
|Accepts multiple instances in the query string (||Accepts multiple instances in the query string (|
|Does not return additional data in the response; just improves the transcription.||Returns an additional data object that contains possible matches for your search terms and identifies an estimated start and end time for each word.|
To really understand when to use Keywords and when to use Search, let's explore some possible scenarios.
Your company employs people in a customer service role, and they are required to say at the start of every phone call, "This call is being recorded." To ensure compliance, you want to check that they are actually saying this phrase.
In this scenario, you can use Search to find phrases that match "This call is being recorded." Deepgram's AI will do its best to find phrases that match and will return a list of possible matches with a confidence rating.
Once you receive your results, you should analyze the data and decide for yourself which confidence rating is more likely to return an accurate match. As you practice and gain experience, you will start to see a pattern forming between confidence numbers and accurate matches.
Your company is participating in litigation, and as part of the eDiscovery phase, you are tasked with producing electronic documents (including audio and video) that are relevant to the case.
In this situation, instead of searching for a single word or phrase, you would search for language that suggests that conversation in an audio or video file is relevant to your court case. So you would use Search to point you to valuable sections of your audio.
Since Deepgram's Search feature returns a confidence level in the response rather than just being a yes/no check, you can analyze its results to identify a combination of words and phrases that might help you find more nuanced relevant language.
You are planning an important demo for the upcoming week, and you want to use Deepgram's speech-to-text API, but you haven't trained a language model yet. The demo is crucial to your business, so you want to give an added boost to Deepgram's transcription technology to improve accuracy for certain words.
In this scenario, you can use Keywords to get a more accurate transcription without training a custom model, which would be a more powerful option. To improve the transcription, you would include a list of keywords that are more difficult for Automatic Speech Recognition (ASR) to recognize (such as jargon, names, and scientific words) in the query parameters of your request to Deepgram.
Using the Keywords feature requires practice and thoughtful implementation. The best approach is to benchmark the transcription results without any keywords, and then add words one by one to notice the effect. Until you get a feel for how the Keywords feature works with your chosen use case models and unique language context, you may find that providing keywords seems to have no effect or even lower the quality of the transcription.
This is because:
- the better Deepgram's off-the-shelf models get at transcribing on their own, the less powerful the Keywords feature becomes because Deepgram gives an increasing confidence level to the transcription itself. If you are using an extremely powerful off-the-shelf model, using keywords might not have an effect.
- the Keywords feature works differently depending on whether the provided words are 'in-vocabulary', meaning they already exist in whatever model you are using, or 'out-of-vocabulary', meaning they do not already exist in the model. (To learn more about in-vocabulary and out-of-vocabulary keywords, see Features: Keywords.) If the keywords you provide are in-vocabulary (regardless of whether you are using an off-the-shelf model or your own trained model), Deepgram will place even more emphasis on them, telling the AI to boost the confidence that those words were spoken. This can affect the quality of your transcript, so adjust your boost values slowly and carefully.
- boost values for keywords are exponential. If you add really high intensifiers to your keywords, you may cause Deepgram's AI to overuse those keywords. You need to find the right balance, which comes with trial and error.
- the Keywords feature cannot boost phrases, so if you are searching for a combination of words (for example, first name + last name), each word will be boosted individually, which may result in one showing up correctly while the other does not.
Some other behaviors that can affect performance of the Keywords feature:
- Sending a very large number of keywords may affect the speed at which Deepgram is able to create a transcription. You may not be concerned about speed if you are batch-processing audio files, but you should pay special attention to speed if you are performing real-time transcription. We recommend limiting your submission to no more than 100 keywords, understanding that 100 is really pushing the limits.
- Providing out-of-vocabulary keywords can cause Deepgram to take longer to return a response.
The Keywords feature is still in beta at this time, so please factor this into your usage. We are actively working to improve this feature.
Some other possible scenarios for the Search feature include:
- Searching captions of real-time streams or pre-recorded videos (that your product provides using Deepgram) to locate a discussed topic or point of interest.
- Searching for competitor or brand names in various types of media recordings, so you can gather information to help strategize.
- Searching for specific phrases that imply dissatisfaction (such as profanity), so you can identify and analyze situations in which customers were unhappy.
Some other possible scenarios for the Keywords feature include:
- Improving a transcription when you haven't yet had the opportunity to train a custom model but know a number of context-unique words that you can identify.
- Improving the real-time transcription of a meeting by making sure names of participants and specialized company terminology that will be used during the meeting are identified.
Updated 6 months ago