
PII Detection and Redaction Agent

Author: Venkata Sudhakar

Redacting Personally Identifiable Information (PII) is a compliance requirement for systems that process customer data. In India, PII includes Aadhaar numbers, PAN numbers, mobile numbers, email addresses, and bank account details. Logging or storing raw PII can breach data protection obligations and creates security risk.

In this tutorial, we build a ShopMax India PII detection agent using Gemini. The agent scans incoming customer support transcripts, identifies PII entities, replaces them with safe placeholders, and returns a clean version safe for logging and analytics.

The example below shows a PII detection tool combined with a Gemini agent that processes customer chat transcripts.
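The listing itself is not reproduced here, so the following is only a minimal sketch of how such a tool and agent could be put together, not the author's code. The regular expressions in PII_PATTERNS, the function names redact_pii and run_agent, the model name, and the choice of the google-genai Python SDK are all assumptions made for illustration.

```python
import re

# Assumed patterns: simplified formats, not validated against official specs.
PII_PATTERNS = {
    "aadhaar": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}\b"),          # 12 digits, optional separators
    "mobile": re.compile(r"(?<!\d)(?:\+91[ -]?)?[6-9]\d{9}(?!\d)"),   # 10 digits starting 6-9
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "pan": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),                     # 5 letters, 4 digits, 1 letter
}


def redact_pii(text: str) -> dict:
    """Detect PII in a transcript and replace each match with a placeholder.

    Returns the per-type match counts, the redacted text, and a flag saying
    whether the original text contained no detected PII.
    """
    counts = {}
    redacted = text
    for label, pattern in PII_PATTERNS.items():
        redacted, n = pattern.subn(f"[REDACTED_{label.upper()}]", redacted)
        if n:
            counts[label] = n
    return {
        "pii_found": counts,
        "redacted_text": redacted,
        "safe_to_log_original": not counts,
    }


def run_agent(transcript: str, model: str = "gemini-2.0-flash") -> str:
    """Ask Gemini to produce a compliance report, using redact_pii as a tool.

    Assumes the google-genai SDK is installed and an API key is set in the
    environment. Imports are local so the regex tool works without the SDK.
    """
    from google import genai
    from google.genai import types

    client = genai.Client()
    response = client.models.generate_content(
        model=model,
        contents=f"Scan this transcript for PII and write a compliance report:\n{transcript}",
        config=types.GenerateContentConfig(
            system_instruction=(
                "You are a compliance agent for ShopMax India. Always call "
                "redact_pii and report only the redacted transcript."
            ),
            tools=[redact_pii],  # Python callables are exposed to the model as tools
        ),
    )
    return response.text
```

In this sketch the detection is deterministic (regex) and the model is used only to call the tool and format the report, so the redaction result does not depend on model behaviour.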


Now run the agent on a sample customer support transcript:
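As a rough, offline illustration of this step, the sketch below scans a sample transcript and prints a short report without calling the model. The identifier values in TRANSCRIPT are invented placeholders, and the pattern table and the scan_transcript helper are hypothetical; the table is repeated in compact form only so the snippet runs on its own.

```python
import re

# Assumed, simplified patterns (hypothetical stand-in for the article's table).
PII_PATTERNS = {
    "aadhaar": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
    "mobile": re.compile(r"(?<!\d)(?:\+91[ -]?)?[6-9]\d{9}(?!\d)"),
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "pan": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),
}

# Invented sample values for illustration only.
TRANSCRIPT = """Customer: Hi, my name is Rajesh Kumar and my order is delayed.
My Aadhaar is 1234 5678 9012 and mobile is 9876543210.
Please check and call me back. My email is rajesh.kumar@example.com.
My PAN is ABCDE1234F for billing verification.
Agent: Thank you Rajesh. I will escalate this immediately."""


def scan_transcript(text: str) -> str:
    """Redact every pattern match and return a plain-text report."""
    counts = {}
    redacted = text
    for label, pattern in PII_PATTERNS.items():
        redacted, n = pattern.subn(f"[REDACTED_{label.upper()}]", redacted)
        if n:
            counts[label] = n
    lines = ["PII found in transcript:"]
    for label, n in counts.items():
        lines.append(f"  - {label}: {n} instance{'' if n == 1 else 's'}")
    lines += ["", "Redacted transcript (safe for logging):", "---", redacted, "---"]
    return "\n".join(lines)


report = scan_transcript(TRANSCRIPT)
print(report)
```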


It gives the following output:

PII Detection Report - ShopMax India Compliance

PII found in transcript:
  - aadhaar: 1 instance
  - mobile: 1 instance
  - email: 1 instance
  - pan: 1 instance

Redacted transcript (safe for logging):
---
Customer: Hi, my name is Rajesh Kumar and my order is delayed.
My Aadhaar is [REDACTED_AADHAAR] and mobile is [REDACTED_MOBILE].
Please check and call me back. My email is [REDACTED_EMAIL].
My PAN is [REDACTED_PAN] for billing verification.
Agent: Thank you Rajesh. I will escalate this immediately.
---

Status: NOT safe to log in original form. Redacted version is safe for storage.

Run this agent as a pre-processing step before writing any customer data to logs, databases, or analytics pipelines. Extend PII_PATTERNS to cover additional Indian identifiers such as voter ID or GSTIN as needed. Pattern matching alone misses freeform PII (the customer's name survives in the redacted transcript above), so for production, combine it with Gemini-based entity extraction for higher recall on freeform text.
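Extending the pattern table could look roughly like the sketch below. The PII_PATTERNS dictionary here is a one-entry stand-in for the article's table, the redact helper is hypothetical, and the GSTIN and voter ID expressions follow the commonly described formats (15-character GSTIN embedding a PAN; three letters plus seven digits for voter ID). Treat them as assumptions and check them against the official specifications before relying on them.

```python
import re

# Stand-in for the article's PII_PATTERNS table (assumed contents).
PII_PATTERNS = {
    "pan": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),
}

# Assumed formats for the additional identifiers.
PII_PATTERNS.update({
    # 2-digit state code + 10-char PAN + entity code + 'Z' + check character
    "gstin": re.compile(r"\b\d{2}[A-Z]{5}\d{4}[A-Z][1-9A-Z]Z[0-9A-Z]\b"),
    # 3 letters followed by 7 digits
    "voter_id": re.compile(r"\b[A-Z]{3}\d{7}\b"),
})


def redact(text: str) -> str:
    """Replace every match of every registered pattern with a placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label.upper()}]", text)
    return text


print(redact("GSTIN 27ABCDE1234F1Z5, voter ID ABC1234567, PAN ABCDE1234F"))
```

Because the word boundaries stop the PAN pattern from matching the PAN embedded inside a GSTIN, the two entries can coexist without one partially redacting the other.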


 
  


  