Adversarial Input Sanitization for AI Chatbots

Author: Venkata Sudhakar

AI chatbots face a constant stream of adversarial inputs - messages deliberately crafted to bypass content policies, extract hidden system instructions, or cause the model to behave in ways that violate business rules. For ShopMax India's customer support chatbot, adversarial inputs might include jailbreak attempts to make the bot say competitors are better, Unicode tricks that visually mimic safe text but confuse the model, or excessive repetition designed to stress-test response generation. Sanitizing inputs before they reach the LLM is a critical first line of defense.

Input sanitization for AI chatbots involves multiple steps: length enforcement to prevent excessively long inputs that waste tokens or attempt to flood the context window; Unicode normalization to fold compatibility variants (e.g., fullwidth 'ａ' or the ligature 'ﬁ') into their plain forms so they cannot slip past keyword filters; HTML and script tag stripping to reduce the risk of cross-site scripting if user text is echoed back in a browser; repetition detection to block inputs that repeat the same phrase dozens of times; and keyword-based content policy filtering for language that violates the platform's terms of service. Note that NFKC normalization does not map cross-script lookalikes such as Cyrillic 'а' to ASCII 'a'; catching those requires a separate confusables mapping.
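To make the normalization step concrete, the sketch below shows what NFKC does and does not fold. The `CONFUSABLES` table and the `fold_confusables` helper are hypothetical names introduced here for illustration, and the three-entry table is nowhere near complete; a production system would more likely draw on the Unicode confusables data.

```python
import unicodedata

# Hypothetical, deliberately tiny map of Cyrillic lookalikes to ASCII letters.
CONFUSABLES = str.maketrans({"\u0430": "a", "\u0435": "e", "\u043e": "o"})

def fold_confusables(text: str) -> str:
    """Apply NFKC, then replace the few lookalikes listed in CONFUSABLES."""
    return unicodedata.normalize("NFKC", text).translate(CONFUSABLES)

fullwidth = "\uff41\uff4d\uff41\uff5a\uff4f\uff4e"  # "amazon" in fullwidth letters
cyrillic = "\u0430m\u0430z\u043en"                  # "amazon" with Cyrillic a and o

print(unicodedata.normalize("NFKC", fullwidth) == "amazon")  # True: fullwidth forms are folded
print(unicodedata.normalize("NFKC", cyrillic) == "amazon")   # False: Cyrillic letters are untouched
print(fold_confusables(cyrillic) == "amazon")                # True once the map is applied
```

The practical consequence is that a keyword filter running on NFKC output alone can still be bypassed with mixed-script spellings.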

The following sanitization pipeline for ShopMax India's chatbot applies all these checks in sequence. Each step either cleans the input or returns a rejection with an informative reason so the user can correct their query.

import re
import unicodedata
from openai import OpenAI

# Placeholder key; in practice load it from an environment variable or secret store.
client = OpenAI(api_key="sk-...")

MAX_LENGTH = 500
MIN_UNIQUE_RATIO = 0.4  # reject when fewer than 40% of the words are distinct

# Substrings matched against the lowercased input. Both word orders are listed
# so that "Is Amazon better than ShopMax?" is caught as well as "Amazon is better".
BLOCKED_TERMS = [
    "competitor",
    "flipkart is better", "flipkart better",
    "amazon is better", "amazon better",
    "shut down",
    "illegal",
]

def sanitize(user_input: str) -> tuple:
    """Return (clean_text, None) on success or (None, reason) on rejection."""
    # Length check
    if len(user_input) > MAX_LENGTH:
        return None, f"Input too long. Please keep your message under {MAX_LENGTH} characters."

    # Unicode normalization (folds compatibility forms such as fullwidth letters)
    normalized = unicodedata.normalize("NFKC", user_input)

    # Strip HTML tags (the text between the tags is kept)
    clean = re.sub(r"<[^>]+>", "", normalized)

    # Repetition detection: share of distinct words in the message
    words = clean.lower().split()
    if len(words) > 5:
        unique_ratio = len(set(words)) / len(words)
        if unique_ratio < MIN_UNIQUE_RATIO:
            return None, "Your message appears to contain excessive repetition. Please rephrase."

    # Content policy
    lower = clean.lower()
    for term in BLOCKED_TERMS:
        if term in lower:
            return None, "That topic is outside the scope of ShopMax India support."

    return clean.strip(), None

def chat(user_input: str) -> str:
    clean_input, error = sanitize(user_input)
    if error is not None:
        return "Sorry: " + error

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful assistant for ShopMax India."},
            {"role": "user", "content": clean_input}
        ]
    )
    return response.choices[0].message.content

tests = [
    "What is the warranty on LG OLED TVs?",
    "buy buy buy buy buy buy buy buy buy buy buy buy",
    "<script>alert(1)</script> What is the return policy?",
    "Is Amazon better than ShopMax?",
    "A" * 600
]

for t in tests:
    print("Input:", t[:60], "...")
    print("Response:", chat(t))
    print()

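It produces output like the following. The rejection messages are fixed strings, while the model's answers to accepted inputs will vary from run to run: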

Input: What is the warranty on LG OLED TVs? ...
Response: LG OLED TVs come with a 1-year manufacturer warranty at ShopMax India...

Input: buy buy buy buy buy buy buy buy buy buy buy buy ...
Response: Sorry: Your message appears to contain excessive repetition. Please rephrase.

Input: <script>alert(1)</script> What is the return policy? ...
Response: ShopMax India offers a 10-day return policy on all electronics...

Input: Is Amazon better than ShopMax? ...
Response: Sorry: That topic is outside the scope of ShopMax India support.

Input: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA ...
Response: Sorry: Input too long. Please keep your message under 500 characters.
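The third test deserves a closer look: the regex removes the tags but keeps the text between them, so the model still receives `alert(1)` as part of the question. The sketch below reproduces just that step and shows `html.escape` from the Python standard library as one way to neutralize markup at the point where text is rendered in a browser. The helper names `strip_tags` and `render_safe` are illustrative and not part of the pipeline above.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")

def strip_tags(text: str) -> str:
    """Remove anything that looks like an HTML tag; the inner text is kept."""
    return TAG_RE.sub("", text)

def render_safe(text: str) -> str:
    """Escape &, <, > and quotes so a browser shows markup as plain text."""
    return html.escape(text)

raw = "<script>alert(1)</script> What is the return policy?"
print(strip_tags(raw))   # alert(1) What is the return policy?
print(render_safe(raw))  # &lt;script&gt;alert(1)&lt;/script&gt; What is the return policy?
```

Escaping at render time also covers markup that appears in the model's own response, which input-side stripping cannot see.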

In production, run sanitization as a FastAPI middleware layer so it applies to every route automatically without cluttering business logic. Keep the blocked terms list in a database so the support team can update it without a code deployment. For ShopMax India's multilingual user base spanning Hindi, Tamil, and Telugu speakers, apply Unicode normalization before keyword matching to ensure Devanagari or Tamil script inputs are correctly evaluated. Log all sanitization rejections with the original input (hashed for privacy) to track attack patterns over time.
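Two of these recommendations can be sketched with the standard library alone. The example below assumes a `blocked_terms` table with a single `term` column and uses an in-memory SQLite database as a stand-in for the real store; the table layout, the logger name, and the helper names are all assumptions made for illustration.

```python
import hashlib
import logging
import sqlite3

logger = logging.getLogger("sanitizer")  # hypothetical logger name

def load_blocked_terms(conn: sqlite3.Connection) -> list[str]:
    """Read the current blocked terms; table and column names are assumptions."""
    rows = conn.execute("SELECT term FROM blocked_terms ORDER BY term").fetchall()
    return [row[0].lower() for row in rows]

def hash_input(text: str) -> str:
    """SHA-256 hex digest of the raw input, so logs never store the text itself."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def log_rejection(text: str, reason: str) -> str:
    """Log the rejection reason alongside the hashed input and return the digest."""
    digest = hash_input(text)
    logger.warning("rejected input sha256=%s reason=%s", digest, reason)
    return digest

# Demo with an in-memory database standing in for the production one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blocked_terms (term TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO blocked_terms VALUES (?)", [("competitor",), ("illegal",)])

terms = load_blocked_terms(conn)
message = "Is this illegal?"
if any(term in message.lower() for term in terms):
    log_rejection(message, "blocked term")
```

Identical inputs hash to the same digest, so repeated attacks remain countable in the logs. A plain hash of a short message can be guessed by brute force, so a keyed hash such as HMAC may be a better fit where privacy requirements are strict.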
