Model Improvement Partnership Program

In the rapidly advancing field of AI, model improvement partnerships are crucial for the continuous improvement of the increasingly powerful models that drive intelligent systems. The Deepgram Model Improvement Partnership Program defines how customer data is handled, stored, and used by Deepgram, and specifies the benefits participants receive. Chief among these is the opportunity for our customers to shape the future of voice AI. Participants gain early and regular access to more accurate models that perform well for their specific use cases, because relevant real-world data is included in the model training process.

At Deepgram, we take our customers’ data privacy concerns seriously, which is why we have implemented robust data security policies and flexible data retention options that allow our customers to strike the right balance for their individual needs.

How do we improve our models?

Deepgram uses end-to-end deep learning to develop all of our voice AI models. These models are built through an iterative process that learns the inherent relationships in the conversational audio data used for training. This involves hundreds of thousands of hours of conversational data broadly spanning a given language’s vocabulary, and covering a wide variety of speaker groups and recording conditions across dimensions such as age, sex, accent, background noise, and acoustic environment.

Our in-house DataOps team employs state-of-the-art techniques to curate high-quality training data sets and ensure proper balance across the dimensions listed above. Both overrepresentation and underrepresentation of different sample types can have adverse effects on model accuracy. By incorporating some of the data we collect through the Deepgram Model Improvement Partnership Program during training, we are able to produce high-quality, in-distribution training data sets that lead to robust model performance both generally and for the specific use cases our customers care about.

For speech-to-text, this results in more accurate models that work better for you and everyone speaking your language through improved recognition of the complex and nuanced aspects of real-world speech (e.g. accents, regional dialects, jargon, slang phrases, differences in sentence structure across different languages, etc.). For text-to-speech, this results in more natural models that better portray your brand through improved pronunciation, expressiveness, and emotion in everyday interactions.

After training, a deep learning model for voice AI is essentially a giant mathematical equation that approximates all of the inherent relationships and underlying concepts that comprise human speech (e.g. “‘I’ before ‘E’ except after ‘C’ or when sounded as ‘A’ as in ‘neighbor’ or ‘weigh’”). And the magic of deep learning is that it does this by learning these concepts implicitly from the training data itself instead of being explicitly programmed by humans to do so. Importantly, the model has no rote memory or storage for any of the data used to train it, meaning there is no risk of any data leakage when the model is used in production.

How do we handle your data and ensure security and privacy?

Deepgram stores fractional increments of data for the continued improvement of our voice AI models and to provide enhanced customer support when needed. The only data we store and use in future model training is data that is contractually included through participation in the Deepgram Model Improvement Partnership Program. We will never redistribute data to third parties without our customers’ permission. Your data will never be used to market our services or to create advertising profiles.

Deepgram’s infrastructure, policies, and procedures are designed to meet industry-standard compliance and regulatory frameworks, including SOC 2 Type 2, HIPAA, PCI DSS, GDPR, CCPA, and all applicable local government and legal requirements. Multi-factor authentication (MFA), role-based access control (RBAC), and VPNs are used to regulate and secure all employee access to data systems. All data is encrypted in transit and at rest with industry-standard encryption, including TLS 1.3 and AES-256.

Why Participate?

Participation in this program is voluntary and includes a number of valuable benefits.

Increased accuracy of our voice AI models for your domain and use case with more frequent, higher impact releases of next-gen models that continue to get better and better.
Discounted pricing for program participants that yields significant savings.
Better technical support with faster root cause analysis and time to resolution.
Preferential placement on early access wait lists for future voice AI models, features, and functionality.
Accelerated custom model training timelines for individual customers in need of additional accuracy.
Reduced model drift. Language is fluid and constantly changing, with new jargon and slang popping up in daily conversation over time. Our model improvement partnership program ensures our models will evolve along with your customers’ speech patterns.
Support for Responsible AI by mitigating model bias and ensuring sufficient representation of underrepresented speaker groups based on age, sex, accents, etc. in our training data sets.

Need more help?

Have additional questions? Get in touch.

Want to opt out?

Add mip_opt_out=true as a query parameter to every API request that you want excluded from the Model Improvement Program. Customers on Pay as you Go, Growth, or Enterprise plans who opt out forgo the 50% discount on the rates listed in their signed contract or on deepgram.com/pricing. Data from opted-out requests is retained only for as long as is necessary to process the request.
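
As a rough illustration of the query-parameter approach, the sketch below appends mip_opt_out=true to a request URL using the standard URL class. The helper name withMipOptOut is hypothetical and is not part of any Deepgram SDK; only the mip_opt_out parameter and the api.deepgram.com endpoints come from this page.

```javascript
// Hypothetical helper (not part of the Deepgram SDK): returns a copy of the
// given request URL with mip_opt_out=true set as a query parameter.
function withMipOptOut(requestUrl) {
  const url = new URL(requestUrl);
  // set() overwrites any existing value, so the parameter appears only once.
  url.searchParams.set("mip_opt_out", "true");
  return url.toString();
}

// Works for URLs with or without existing query parameters.
console.log(withMipOptOut("https://api.deepgram.com/v1/listen"));
console.log(withMipOptOut("https://api.deepgram.com/v1/speak?model=aura-2-thalia-en"));
```

Building the URL this way, rather than concatenating strings, avoids having to decide between "?" and "&" by hand when other query parameters are already present.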

Speech-to-Text Examples

Here are some examples of opting out for Speech-to-Text requests.

The SDK examples below use a custom add-on parameter to set mip_opt_out=true. To learn more about using custom add-on parameters with our SDKs, refer to the documentation on using custom add-on parameters.

Pre-recorded Audio

Replace YOUR_DEEPGRAM_API_KEY with your Deepgram API Key.

curl \
  --request POST \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'Content-Type: audio/wav' \
  --data-binary @youraudio.wav \
  --url 'https://api.deepgram.com/v1/listen?mip_opt_out=true'


Streaming Audio

// Install the SDK: npm install @deepgram/sdk

import { createClient } from "@deepgram/sdk";

const live = async () => {
  // Example live audio source
  const url = "http://stream.live.vc.bbcmedia.co.uk/bbc_world_service";

  // Reads the API key from the DEEPGRAM_API_KEY environment variable
  const deepgram = createClient(process.env.DEEPGRAM_API_KEY);

  const connection = deepgram.listen.live({
    model: "nova-3",
    // Custom option to opt out of Model Improvement Program
    mip_opt_out: true,
  });

  // The rest of the example (event handlers and sending audio from `url`
  // to `connection`) is omitted here.
};

live();

Text-to-Speech Examples

Here are some examples of opting out for Text-to-Speech requests.

Replace YOUR_DEEPGRAM_API_KEY with your Deepgram API Key.

REST API

curl --request POST \
  --url 'https://api.deepgram.com/v1/speak?model=aura-2-thalia-en&mip_opt_out=true' \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{"text": "Hello, how can I help you today?"}' \
  --output mip_opt_out.wav

Streaming API

// Install the SDK: npm install @deepgram/sdk

const fs = require("fs");
const { createClient, LiveTTSEvents } = require("@deepgram/sdk");

// Add a WAV audio container header to the file if you want to play the audio
// using the AudioContext or a media player like VLC, Media Player, or Apple Music.
// Without this header, the audio will not play in the Chrome browser.
// prettier-ignore
const wavHeader = [
  0x52, 0x49, 0x46, 0x46, // "RIFF"
  0x00, 0x00, 0x00, 0x00, // Placeholder for file size
  0x57, 0x41, 0x56, 0x45, // "WAVE"
  0x66, 0x6D, 0x74, 0x20, // "fmt "
  0x10, 0x00, 0x00, 0x00, // Chunk size (16)
  0x01, 0x00,             // Audio format (1 for PCM)
  0x01, 0x00,             // Number of channels (1)
  0x80, 0xBB, 0x00, 0x00, // Sample rate (48000)
  0x00, 0x77, 0x01, 0x00, // Byte rate (48000 * 2 = 96000)
  0x02, 0x00,             // Block align (2)
  0x10, 0x00,             // Bits per sample (16)
  0x64, 0x61, 0x74, 0x61, // "data"
  0x00, 0x00, 0x00, 0x00  // Placeholder for data size
];

const live = async () => {
  const text = "Hello, how can I help you today?";

  // Reads the API key from the DEEPGRAM_API_KEY environment variable
  const deepgram = createClient(process.env.DEEPGRAM_API_KEY);

  const dgConnection = deepgram.speak.live({
    model: "aura-2-thalia-en",
    encoding: "linear16",
    sample_rate: 48000,
    // Custom option to opt out of Model Improvement Program
    mip_opt_out: true,
  });

  // Start the output buffer with the WAV header; audio chunks are appended to it
  let audioBuffer = Buffer.from(wavHeader);

  dgConnection.on(LiveTTSEvents.Open, () => {
    console.log("Connection opened");

    // Send text data for TTS synthesis
    dgConnection.sendText(text);

    // Send Flush message to the server after sending the text
    dgConnection.flush();

    dgConnection.on(LiveTTSEvents.Close, () => {
      console.log("Connection closed");
    });

    dgConnection.on(LiveTTSEvents.Metadata, (data) => {
      console.dir(data, { depth: null });
    });

    dgConnection.on(LiveTTSEvents.Audio, (data) => {
      console.log("Deepgram audio data received");
      // Concatenate the audio chunks into a single buffer
      const buffer = Buffer.from(data);
      audioBuffer = Buffer.concat([audioBuffer, buffer]);
    });

    dgConnection.on(LiveTTSEvents.Flushed, () => {
      console.log("Deepgram Flushed");
      // Write the buffered audio data to a file when the flush event is received
      writeFile();
    });

    dgConnection.on(LiveTTSEvents.Error, (err) => {
      console.error(err);
    });
  });

  const writeFile = () => {
    if (audioBuffer.length > 0) {
      fs.writeFile("output.wav", audioBuffer, (err) => {
        if (err) {
          console.error("Error writing audio file:", err);
        } else {
          console.log("Audio file saved as output.wav");
        }
      });
      audioBuffer = Buffer.from(wavHeader); // Reset buffer after writing
    }
  };
};

live();
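
The hard-coded WAV header in the example above can also be computed from the audio parameters. The following is a minimal sketch, assuming raw PCM (linear16) audio and the standard 44-byte RIFF/WAVE header layout; buildWavHeader is a hypothetical helper name, not a function provided by the Deepgram SDK. It derives the block align and byte rate from the sample rate, channel count, and bit depth instead of spelling out the bytes by hand.

```javascript
// Sketch: build a 44-byte PCM WAV header for raw linear16 audio.
// buildWavHeader is a hypothetical helper, not part of the Deepgram SDK.
function buildWavHeader({ sampleRate = 48000, channels = 1, bitsPerSample = 16, dataSize = 0 } = {}) {
  const blockAlign = channels * (bitsPerSample / 8); // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes of audio per second
  const header = Buffer.alloc(44);

  header.write("RIFF", 0, "ascii");
  header.writeUInt32LE(36 + dataSize, 4);  // RIFF chunk size: rest of header + audio data
  header.write("WAVE", 8, "ascii");
  header.write("fmt ", 12, "ascii");
  header.writeUInt32LE(16, 16);            // fmt chunk size for PCM
  header.writeUInt16LE(1, 20);             // audio format: 1 = PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write("data", 36, "ascii");
  header.writeUInt32LE(dataSize, 40);      // size in bytes of the audio data that follows
  return header;
}

// One second of 48 kHz mono 16-bit audio is 96000 bytes.
const exampleHeader = buildWavHeader({ sampleRate: 48000, dataSize: 96000 });
console.log(exampleHeader.length);           // 44
console.log(exampleHeader.readUInt32LE(28)); // 96000 (byte rate)
```

If the total audio length is known before the file is written, passing it as dataSize fills in the two size fields that the hard-coded header leaves as zero placeholders.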

Voice Agent Example

Here is an example of opting out for Voice Agent requests using the Settings message.

JSON

{
  "type": "Settings",
  "mip_opt_out": true,
  "audio": {
    "input": {
      "encoding": "linear16",
      "sample_rate": 24000
    },
    "output": {
      "encoding": "mp3",
      "sample_rate": 24000,
      "bitrate": 48000,
      "container": "none"
    }
  },
  "agent": {
    "language": "en",
    "listen": {
      "provider": {
        "type": "deepgram",
        "model": "nova-3",
        "keyterms": ["hello", "goodbye"]
      }
    },
    "think": {
      "provider": {
        "type": "open_ai",
        "model": "gpt-4o-mini",
        "temperature": 0.7
      }
    },
    "speak": {
      "provider": {
        "type": "deepgram",
        "model": "aura-2-thalia-en"
      }
    },
    "greeting": "Hello! How can I help you today?"
  }
}
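
When the Settings message is assembled in code rather than written by hand, the flag can be added just before serialization. This is a small sketch under that assumption; optOutSettings is a hypothetical helper name, the settings object is abbreviated from the JSON above, and the WebSocket connection code is omitted.

```javascript
// Hypothetical helper: returns a copy of a Voice Agent Settings message
// with the top-level mip_opt_out flag set to true. The input is not mutated.
function optOutSettings(settings) {
  return { ...settings, mip_opt_out: true };
}

// Abbreviated Settings message; a real one would include the full
// audio and agent configuration.
const baseSettings = {
  type: "Settings",
  audio: { input: { encoding: "linear16", sample_rate: 24000 } },
  agent: { language: "en" },
};

// JSON.stringify always emits valid JSON (for example, no trailing commas).
const settingsMessage = JSON.stringify(optOutSettings(baseSettings), null, 2);
console.log(settingsMessage);
```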

Viewing Opt Out Requests in Logs

Within the Deepgram Console, you can view opted-out requests in the Usage > Logs tab. In your logs, the mip_opt_out feature appears as true for these requests.

You can also use the GET a Project Request endpoint to view opted-out requests.