Redaction
redact: string. Default: False
Language Support
Redaction language support varies by deployment type and processing method:
Enable Feature
To enable redaction, use the following parameter in the query string when you call Deepgram’s /listen endpoint:
redact=OPTION
Redacting Common Entities
Deepgram provides the following options to redact common groups of entities:
- pci: Redacts credit card information, including credit card number, expiration date, and CVV.
- pii: Redacts a broad range of personally identifiable information, including names, locations, and identifying numbers.
- phi: Redacts protected health information, including medical conditions, drugs, injuries, blood types, medical processes, and statistics.
- numbers (or true or aggressive_numbers): Redacts numerical and identifying entities, including dates, account numbers, credit cards, SSNs, and more.

Multiple redaction values can be sent:
redact=pci&redact=numbers
To see exactly which entity types are included in each group, refer to the Redaction Groups column in the Supported Entity Types table.
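The repeated-parameter form (redact=pci&redact=numbers) can be generated programmatically. The sketch below uses only Python's standard library; the helper name build_redact_query is hypothetical and is not part of any Deepgram SDK.

```python
from urllib.parse import urlencode


def build_redact_query(options):
    """Return a query string that repeats `redact` once per option.

    `build_redact_query` is an illustrative helper, not a Deepgram API.
    """
    # A list of (key, value) pairs lets the same key appear several times.
    return urlencode([("redact", option) for option in options])


query = build_redact_query(["pci", "numbers"])
print(query)  # redact=pci&redact=numbers
```

Passing a list of pairs rather than a dictionary is what allows the key to repeat; a plain dictionary would keep only one redact value.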
Redacting Specific Entities
You can choose which entities to redact from more than 50 supported entity types, which gives you fine-grained control over what is removed from your transcript.
Some options include credit_card, credit_card_expiration, cvv, and email_address.
View all options here: Supported Entity Types
Pre-Recorded Examples
You can enable redaction by adding redact=OPTION as a query parameter.
To transcribe audio and remove PCI data from an audio file, run the following cURL command:
Multiple types of entities can be redacted with the syntax redact=option_1&redact=option_2. To transcribe audio and remove PCI and PII data from an audio file, run the following cURL command:
Replace YOUR_DEEPGRAM_API_KEY with your Deepgram API Key.
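For readers working in Python rather than cURL, a pre-recorded request with redaction might be assembled roughly as follows. This is a sketch under assumptions that do not appear on this page: the base URL https://api.deepgram.com/v1/listen, the Authorization: Token header scheme, a JSON body with a url field pointing at hosted audio, and the placeholder audio URL https://example.com/audio.wav. The helper build_redaction_request is hypothetical. The request is only constructed here, not sent.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumed base URL for the /listen endpoint (not stated on this page).
API_URL = "https://api.deepgram.com/v1/listen"


def build_redaction_request(api_key, audio_url, redact_options):
    """Build (but do not send) a POST request with one `redact` per option.

    Hypothetical helper; header scheme and body shape are assumptions.
    """
    query = urlencode([("redact", option) for option in redact_options])
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_URL}?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_redaction_request(
    "YOUR_DEEPGRAM_API_KEY",
    "https://example.com/audio.wav",  # placeholder audio location
    ["pci", "pii"],
)
print(request.full_url)
# To send it: urllib.request.urlopen(request)
```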
Streaming Examples
For the most accurate redaction, set no_delay=false or omit no_delay altogether. If no_delay=true is set, Deepgram prioritizes low latency at the cost of redaction accuracy.
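One way to follow this guidance in client code is to leave no_delay out of the streaming URL unless a caller asks for it explicitly. The sketch below assumes the streaming endpoint is wss://api.deepgram.com/v1/listen (not stated on this page); build_streaming_url is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed WebSocket URL for streaming requests (not stated on this page).
STREAMING_URL = "wss://api.deepgram.com/v1/listen"


def build_streaming_url(redact_options, no_delay=None):
    """Build a streaming URL; `no_delay` is omitted unless explicitly set.

    Hypothetical helper for illustration only.
    """
    params = [("redact", option) for option in redact_options]
    if no_delay is not None:
        # Only included on request, since no_delay=true trades
        # redaction accuracy for lower latency.
        params.append(("no_delay", "true" if no_delay else "false"))
    return f"{STREAMING_URL}?{urlencode(params)}"


print(build_streaming_url(["pci"]))
```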
In streaming redaction, Deepgram follows a two-phase approach. During interim results, the system returns a generic [REDACTED] placeholder for a redacted entity while it continues evaluating the spoken content. Once a segment is considered complete and Deepgram has high confidence in the detected entity, the placeholder is replaced with a specific entity tag (for example, [CREDIT_CARD_1], [SSN_1], or [PHONE_NUMBER_1]). This replacement may occur in a later interim result or in the final result. This approach enables real-time transcription with both low latency and accurate, contextual redaction.
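A client that displays or stores streaming transcripts may want to know which phase a given transcript is in. The sketch below classifies a transcript string by looking for the generic [REDACTED] placeholder or a specific entity tag such as [CREDIT_CARD_1]. It works on plain transcript text only; the function name and the three state labels are illustrative choices, not part of Deepgram's API.

```python
import re

GENERIC_PLACEHOLDER = "[REDACTED]"
# Specific tags look like [SSN_1] or [PHONE_NUMBER_2]: uppercase name, then a number.
SPECIFIC_TAG = re.compile(r"\[[A-Z_]+_\d+\]")


def redaction_state(transcript):
    """Return 'pending', 'resolved', or 'none' for a transcript string.

    'pending'  -> at least one generic placeholder is still present
    'resolved' -> only specific entity tags are present
    'none'     -> no redaction markers found
    """
    if GENERIC_PLACEHOLDER in transcript:
        return "pending"
    if SPECIFIC_TAG.search(transcript):
        return "resolved"
    return "none"


print(redaction_state("my card number is [REDACTED]"))       # pending
print(redaction_state("my card number is [CREDIT_CARD_1]"))  # resolved
```

A transcript that mixes both marker kinds is reported as pending, since at least one entity is still awaiting its specific tag.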
Results
For both live-streaming and pre-recorded audio, redaction replaces each redacted span with a tag that names the entity type followed by a counter for entities of that type detected in the transcript. For example, if you choose to redact social security numbers, the phrase “My social security number is five five five two two one one one one and his is six six six two two one three three three” would appear in your transcript as “My social security number is [SSN_1] and his is [SSN_2]”.
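Because the tags follow a regular [TYPE_N] shape, downstream code can tally what was redacted without seeing the original content. The following sketch assumes every tag matches the pattern shown in the example above (an uppercase name, an underscore, and a number); count_redacted_entities is a hypothetical helper.

```python
import re
from collections import Counter

# Captures the entity type and its counter, e.g. ("SSN", "2") from [SSN_2].
TAG_PATTERN = re.compile(r"\[([A-Z_]+)_(\d+)\]")


def count_redacted_entities(transcript):
    """Count distinct redacted entities per type in a transcript."""
    # A set removes repeats of the same tag (e.g. [SSN_1] appearing twice).
    distinct_tags = {(m.group(1), m.group(2)) for m in TAG_PATTERN.finditer(transcript)}
    return Counter(entity_type for entity_type, _ in distinct_tags)


transcript = "My social security number is [SSN_1] and his is [SSN_2]"
counts = count_redacted_entities(transcript)
print(dict(counts))  # {'SSN': 2}
```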
Example with redact=pci&redact=pii: