Text to Speech REST

An overview of the Deepgram .NET SDK and Deepgram text-to-speech.

Installing the SDK

To begin using Deepgram's Text-to-Speech functionality, you need to install the Deepgram .NET SDK in your existing project. You can do this using the following command:

# Install the Deepgram .NET SDK
# https://github.com/deepgram/deepgram-dotnet-sdk

dotnet add package Deepgram

# Optionally, install the Deepgram Microphone package
dotnet add package Deepgram.Microphone

Make a Deepgram Text-to-Speech Request

using Deepgram;
using Deepgram.Models.Speak.v1.REST;

namespace SampleApp
{
    class Program
    {
        static async Task Main(string[] args)
        {
            // Initialize Library with default logging
            // Normal logging is "Info" level
            Library.Initialize();

            // Create the client with the factory; the API key is read from the "DEEPGRAM_API_KEY" environment variable
            var deepgramClient = ClientFactory.CreateSpeakRESTClient();

            var response = await deepgramClient.ToFile(
                new TextSource("Hello World!"),
                "test.mp3",
                new SpeakSchema()
                {
                    Model = "aura-asteria-en",
                });

            // Print the response returned for the request
            Console.WriteLine(response);
            Console.ReadKey();

            // Teardown Library
            Library.Terminate();
        }
    }
}
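
Because the client factory reads the API key from the environment, it can help to fail early when the variable is missing. The following is a minimal sketch using only the .NET base library; the `ApiKeyCheck` class and its `IsConfigured` method are hypothetical helpers for illustration and are not part of the Deepgram SDK.

```csharp
using System;

public static class ApiKeyCheck
{
    // Hypothetical helper (not part of the Deepgram SDK).
    // Returns true when the named environment variable holds a non-blank value.
    public static bool IsConfigured(string variableName = "DEEPGRAM_API_KEY")
    {
        var value = Environment.GetEnvironmentVariable(variableName);
        return !string.IsNullOrWhiteSpace(value);
    }
}
```

Calling `ApiKeyCheck.IsConfigured()` before `ClientFactory.CreateSpeakRESTClient()` lets the application print a clear message and exit instead of failing later on the request.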

Audio Output Streaming

Deepgram's TTS API allows you to start playing the audio as soon as the first byte is received. This section provides examples to help you stream the audio output efficiently.

Single Text Source Payload

using Deepgram;
using Deepgram.Models.Speak.v1.REST;

namespace SampleApp
{
    class Program
    {
        static async Task Main(string[] args)
        {
            // Initialize Library with default logging
            // Normal logging is "Info" level
            Library.Initialize();

            // Create the client with the factory; the API key is read from the "DEEPGRAM_API_KEY" environment variable
            var deepgramClient = ClientFactory.CreateSpeakRESTClient();

            // Send the text to Deepgram to convert it to speech
            var response = await deepgramClient.Stream(
                new TextSource("Hello World!"),
                new SpeakSchema()
                {
                    Model = "aura-asteria-en",
                });

            await foreach (var audioSegment in response.AudioStream)
            {
                // Process the audio segment received
            }

            // Teardown Library
            Library.Terminate();
        }
    }
}
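
The "process the audio segment" step in the loop above is left open. One possible way to handle it is to forward each segment to an output stream as soon as it arrives, as in the sketch below. It assumes each audio segment is a `byte[]`, which the example above does not state; `AudioSink` and `WriteSegmentsAsync` are hypothetical names and are not part of the Deepgram SDK.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class AudioSink
{
    // Hypothetical helper (not part of the Deepgram SDK).
    // Assumes each segment is a byte[]; adjust to the segment type the SDK returns.
    // Writes every segment to the output as soon as it arrives and
    // returns the total number of bytes written.
    public static async Task<long> WriteSegmentsAsync(
        IAsyncEnumerable<byte[]> segments, Stream output)
    {
        long total = 0;
        await foreach (var segment in segments)
        {
            await output.WriteAsync(segment, 0, segment.Length);
            total += segment.Length;
        }
        await output.FlushAsync();
        return total;
    }
}
```

Writing to a `Stream` keeps the sketch independent of the destination: the same helper could target a `FileStream`, a network stream, or the input of an audio player.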

Chunk Text Source Payload

using System.Linq;
using System.Text.RegularExpressions;
using Deepgram;
using Deepgram.Models.Speak.v1.REST;

namespace SampleApp
{
    class Program
    {
        static async Task Main(string[] args)
        {
            // Initialize Library with default logging
            // Normal logging is "Info" level
            Library.Initialize();

            // Create the client with the factory; the API key is read from the "DEEPGRAM_API_KEY" environment variable
            var deepgramClient = ClientFactory.CreateSpeakRESTClient();

            var inputText = "Your long text goes here...";

            // Split the text into sentence-sized segments and
            // send each segment to Deepgram to convert it to speech
            var segments = SegmentTextBySentence(inputText);

            foreach (var segment in segments)
            {
                var response = await deepgramClient.Stream(
                    new TextSource(segment),
                    new SpeakSchema()
                    {
                        Model = "aura-asteria-en",
                    });

                await foreach (var audioSegment in response.AudioStream)
                {
                    // Process the audio segment received
                }
            }

            // Teardown Library
            Library.Terminate();
        }

        static string[] SegmentTextBySentence(string text)
        {
            // Match runs of text that end in '.', '!' or '?'
            return Regex.Matches(text, @"[^.!?]+[.!?]")
                        .Cast<Match>()
                        .Select(m => m.Value)
                        .ToArray();
        }
    }
}
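
Note that the pattern `[^.!?]+[.!?]` only matches text that ends in a terminator, so a final fragment without punctuation is dropped, and only the first character of an ellipsis is kept. The sketch below shows one possible variant that keeps trailing fragments, keeps runs of punctuation together, and trims surrounding whitespace. `TextSegmenter` and `SplitSentences` are hypothetical names for illustration, and the approach is still a simple heuristic (it will split on abbreviations such as "e.g.").

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TextSegmenter
{
    // Hypothetical helper (not part of the Deepgram SDK).
    // Splits after each run of '.', '!' or '?' and keeps a trailing
    // fragment that has no terminating punctuation.
    public static List<string> SplitSentences(string text)
    {
        var segments = new List<string>();
        foreach (Match match in Regex.Matches(text, @"[^.!?]+[.!?]*"))
        {
            var sentence = match.Value.Trim();
            if (sentence.Length > 0)
            {
                segments.Add(sentence);
            }
        }
        return segments;
    }
}
```

For example, `TextSegmenter.SplitSentences("Hello world. How are you? Fine")` yields three segments: `Hello world.`, `How are you?`, and `Fine`.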

Where To Find Additional Examples

The SDK repository has a good collection of text-to-speech examples. You can find links to them in the README.