Formatting Text for Aura-2 | Deepgram's Docs

Aura-2 is a context-aware text-to-speech (TTS) model that applies natural pacing, expressiveness, and fillers based on the context of the provided text. The quality of your text input directly impacts the naturalness of the audio output.

This guide provides essential formatting techniques to optimize text for Aura-2 text-to-speech conversion. Following these guidelines will produce more natural-sounding speech output with appropriate pacing, intonation, and emphasis.

Note for LLM-Generated Text

If you are using a Large Language Model (LLM) to generate input text for Aura-2, you can prompt the LLM to provide conversational responses. For example, instruct the LLM to “respond in a natural, conversational tone with appropriate punctuation for text-to-speech” to get output that will sound more natural when processed by Aura-2.

Core Principles

Principle	Do This ✓	Not This ✗
End sentences with periods	Hello. One moment. I’m looking up your records.	Hello One Moment I’m looking up your records
Use question marks for questions	Would you like to add a drink for $1 more?	Would you like to add a drink for $1 more
Add exclamation points for enthusiasm	Thanks for contacting our support team!	Thanks for contacting our support team
Use commas for natural pauses	You can reach us by phone, chat, or email.	You can reach us by phone chat or email
Put command words in quotes	Say “add item” to add more to your order.	Say add item to add more to your order

Natural Speech Patterns

Direct address: Include commas before names
- ✓ “Hello, Maria! We have a special offer today.”
- ✗ “Hello Maria We have a special offer today”
Lists: Insert commas between items
- ✓ “Would you like fries, a drink, or an apple pie?”
- ✗ “Would you like fries a drink or an apple pie”
Conversational flow: Use short, standalone phrases
- ✓ “One moment. I’m searching for that information.”
- ✗ Long combined sentences without pauses

Special Formatting

Technique	Example
Hyphens for additional pauses	Your total is $45.82 - please pull forward.
Clear step boundaries	Please arrive early. Bring your insurance card - and medications.

Context Adaptation

Professional tone: “We’ve processed your refund according to company policy.”
Casual tone: “Good news! We’ve processed your refund - money should be back soon.”

Common Pitfalls

❌ Missing punctuation
❌ Run-on sentences
❌ Inconsistent formatting
❌ Unexplained abbreviations
❌ Overusing emphasis (!!!, ALL CAPS)
❌ No space before ? after URLs/emails
❌ Insufficient pauses for complex information

Testing Your Text

Read your text aloud naturally
Mark where you naturally pause
Add punctuation to match these pauses
Test variations with Aura-2 to find the most natural output

Remember: Natural text input produces natural speech output. The formatting choices you make directly impact how Aura-2 interprets and vocalizes your content.