Text-to-Speech REST

An overview of the Deepgram Go SDK and Deepgram text-to-speech.

Installing the SDK

To use Deepgram’s Text-to-Speech functionality, install the Deepgram Go SDK in your existing project with the following command:

Bash
# Install the Deepgram Go SDK
# https://github.com/deepgram/deepgram-go-sdk

go get github.com/deepgram/deepgram-go-sdk

Make a Deepgram Text-to-Speech Request

Go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"os"

	prettyjson "github.com/hokaccha/go-prettyjson"

	api "github.com/deepgram/deepgram-go-sdk/pkg/api/speak/v1/rest"
	interfaces "github.com/deepgram/deepgram-go-sdk/pkg/client/interfaces"
	client "github.com/deepgram/deepgram-go-sdk/pkg/client/speak"
)

const (
	textToSpeech string = "Hello, World!"
	filePath     string = "./test.mp3"
)

func main() {
	// init library
	client.InitWithDefault()

	// Go context
	ctx := context.Background()

	// set the text-to-speech options
	options := &interfaces.SpeakOptions{
		Model: "aura-asteria-en",
	}

	// create a Deepgram client
	c := client.NewRESTWithDefaults()
	dg := api.New(c)

	// send the text to Deepgram; the resulting audio is saved to filePath
	res, err := dg.ToSave(ctx, filePath, textToSpeech, options)
	if err != nil {
		fmt.Printf("ToSave failed. Err: %v\n", err)
		os.Exit(1)
	}

	// serialize the result struct to JSON
	data, err := json.Marshal(res)
	if err != nil {
		fmt.Printf("json.Marshal failed. Err: %v\n", err)
		os.Exit(1)
	}

	// make the JSON pretty
	prettyJSON, err := prettyjson.Format(data)
	if err != nil {
		fmt.Printf("prettyjson.Format failed. Err: %v\n", err)
		os.Exit(1)
	}
	fmt.Printf("\n\nResult:\n%s\n\n", prettyJSON)
}

Audio Output Streaming

Deepgram’s TTS API allows you to start playing the audio as soon as the first byte is received. This section provides examples to help you stream the audio output efficiently.

Single Text Source Payload

The following example demonstrates how to stream the audio as soon as the first byte arrives for a single text source:

Go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"os"

	prettyjson "github.com/hokaccha/go-prettyjson"

	speak "github.com/deepgram/deepgram-go-sdk/pkg/api/speak/v1"
	interfaces "github.com/deepgram/deepgram-go-sdk/pkg/client/interfaces"
	client "github.com/deepgram/deepgram-go-sdk/pkg/client/speak"
)

const (
	textToSpeech string = "Hello, World!"
	filePath     string = "./test.mp3"
)

func main() {
	// STEP 1: Initialize the library
	client.InitWithDefault()

	// Go context
	ctx := context.Background()

	// STEP 2: Create a Deepgram client.
	// By default, the DEEPGRAM_API_KEY environment variable will be used for the API Key
	c := client.NewWithDefaults()
	dg := speak.New(c)

	// STEP 3: Configure the options (such as model choice, audio configuration, etc.)
	options := &interfaces.SpeakOptions{
		Model: "aura-asteria-en",
	}

	// STEP 4: Send the text to Deepgram to convert to speech; the audio is saved to filePath
	res, err := dg.ToSave(ctx, filePath, textToSpeech, options)
	if err != nil {
		fmt.Printf("ToSave failed. Err: %v\n", err)
		os.Exit(1)
	}

	// STEP 5: Your result struct/JSON
	data, err := json.Marshal(res)
	if err != nil {
		fmt.Printf("json.Marshal failed. Err: %v\n", err)
		os.Exit(1)
	}

	// make the JSON pretty
	prettyJSON, err := prettyjson.Format(data)
	if err != nil {
		fmt.Printf("prettyjson.Format failed. Err: %v\n", err)
		os.Exit(1)
	}
	fmt.Printf("\n\nResult:\n%s\n\n", prettyJSON)
}

Chunk Text Source Payload

This example shows how to chunk the text source by sentence boundaries and stream the audio for each chunk consecutively:

Go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"os"

	prettyjson "github.com/hokaccha/go-prettyjson"

	speak "github.com/deepgram/deepgram-go-sdk/pkg/api/speak/v1"
	interfaces "github.com/deepgram/deepgram-go-sdk/pkg/client/interfaces"
	client "github.com/deepgram/deepgram-go-sdk/pkg/client/speak"
)

const (
	textToSpeech string = "Hello, World!"
	filePath     string = "./test.mp3"
)

func main() {
	// STEP 1: Initialize the library
	client.InitWithDefault()

	// Go context
	ctx := context.Background()

	// STEP 2: Create a Deepgram client.
	// By default, the DEEPGRAM_API_KEY environment variable will be used for the API Key
	c := client.NewWithDefaults()
	dg := speak.New(c)

	// STEP 3: Configure the options (such as model choice, audio configuration, etc.)
	options := &interfaces.SpeakOptions{
		Model: "aura-asteria-en",
	}

	// STEP 4: Send the text to Deepgram to convert to speech; the audio is saved to filePath
	res, err := dg.ToSave(ctx, filePath, textToSpeech, options)
	if err != nil {
		fmt.Printf("ToSave failed. Err: %v\n", err)
		os.Exit(1)
	}

	// STEP 5: Your result struct/JSON
	data, err := json.Marshal(res)
	if err != nil {
		fmt.Printf("json.Marshal failed. Err: %v\n", err)
		os.Exit(1)
	}

	// make the JSON pretty
	prettyJSON, err := prettyjson.Format(data)
	if err != nil {
		fmt.Printf("prettyjson.Format failed. Err: %v\n", err)
		os.Exit(1)
	}
	fmt.Printf("\n\nResult:\n%s\n\n", prettyJSON)
}
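
The sentence-boundary chunking described in this section can be sketched on its own, independent of the SDK. The following is a minimal, deliberately naive splitter: it breaks after `.`, `!`, or `?` when the next character is whitespace or the text ends, so abbreviations such as "e.g. this" are also split. `splitSentences` is a hypothetical helper name, not an SDK function.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitSentences is a hypothetical helper (not an SDK function).
// It ends a chunk after '.', '!' or '?' when the following rune is
// whitespace or the text ends, and trims whitespace from each chunk.
func splitSentences(text string) []string {
	var chunks []string
	var cur strings.Builder
	runes := []rune(text)
	for i, r := range runes {
		cur.WriteRune(r)
		isEnd := r == '.' || r == '!' || r == '?'
		atBoundary := i == len(runes)-1 || unicode.IsSpace(runes[i+1])
		if isEnd && atBoundary {
			if s := strings.TrimSpace(cur.String()); s != "" {
				chunks = append(chunks, s)
			}
			cur.Reset()
		}
	}
	// Keep any trailing text that has no closing punctuation.
	if s := strings.TrimSpace(cur.String()); s != "" {
		chunks = append(chunks, s)
	}
	return chunks
}

func main() {
	text := "Hello, World! This is a longer passage. Is each sentence sent on its own?"
	for i, chunk := range splitSentences(text) {
		// Each chunk would be sent as the text of its own TTS request, in order.
		fmt.Printf("chunk %d: %q\n", i, chunk)
	}
}
```

Each chunk can then be passed, in order, as the text of a separate request so that audio for the first sentence is available while later sentences are still being synthesized.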

Where to Find Additional Examples

The SDK repository has a good collection of text-to-speech examples. You can find the links to the examples in the README.

Each example demonstrates a different way to generate text-to-speech audio.
