Gemini Safety Settings and Content Filtering

Author: Venkata Sudhakar

Gemini has built-in safety filters that evaluate content across four harm categories: harassment, hate speech, sexually explicit content, and dangerous content. Each category can be configured with a threshold that determines how aggressively content is blocked. For business applications this matters enormously: a children's education platform needs the strictest filtering, a medical information service needs lighter filtering so that clinical detail is not blocked, and a general enterprise chatbot needs a balanced middle ground. Getting safety settings wrong either makes your agent refuse legitimate business queries or lets inappropriate content through.

Safety settings are configured per-request via GenerateContentConfig. Each HarmCategory gets a HarmBlockThreshold: BLOCK_NONE (allow everything), BLOCK_ONLY_HIGH (block only high-probability content), BLOCK_MEDIUM_AND_ABOVE (block medium and high probability), or BLOCK_LOW_AND_ABOVE (the most restrictive: block anything rated low probability or higher). When a response is blocked, the candidate has a finish_reason of SAFETY and safety_ratings showing which category triggered and at what probability. A prompt that is itself blocked can come back with no candidates at all, with the reason reported in prompt_feedback, so check for an empty candidate list before indexing into it. Always handle safety blocks gracefully in your application with a user-friendly message rather than exposing the raw API error.
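As a mental model of how the four thresholds relate to the probability levels reported in safety_ratings (NEGLIGIBLE, LOW, MEDIUM, HIGH), the sketch below encodes each threshold as the lowest level at which it starts blocking. The names Probability, BLOCK_FROM and would_block are hypothetical and exist only for illustration; the real filtering decision is made by the Gemini service, not by client code like this.

```python
from enum import IntEnum


class Probability(IntEnum):
    """Harm probability levels, ordered from least to most likely."""
    NEGLIGIBLE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Lowest probability level at which each threshold starts blocking.
# None means the threshold never blocks. (Illustrative table, not SDK code.)
BLOCK_FROM = {
    "BLOCK_NONE": None,
    "BLOCK_ONLY_HIGH": Probability.HIGH,
    "BLOCK_MEDIUM_AND_ABOVE": Probability.MEDIUM,
    "BLOCK_LOW_AND_ABOVE": Probability.LOW,
}


def would_block(probability: Probability, threshold: str) -> bool:
    """Return True if content rated `probability` is blocked under `threshold`."""
    floor = BLOCK_FROM[threshold]
    return floor is not None and probability >= floor


if __name__ == "__main__":
    # Print which probability levels each threshold would block.
    for name in BLOCK_FROM:
        blocked = [p.name for p in Probability if would_block(p, name)]
        print(f"{name:24} blocks: {blocked}")
```

Reading the table from top to bottom shows why BLOCK_LOW_AND_ABOVE is the strictest choice: only content rated NEGLIGIBLE passes.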

The example below shows three safety profiles for different business contexts (a strict children's education platform, a balanced enterprise support agent, and a clinical medical information service) and demonstrates how to handle safety blocks gracefully in production.

from google import genai
from google.genai import types

client = genai.Client(api_key="your-gemini-api-key")

# Profile 1: Children education platform - maximum safety
CHILDREN_SAFETY = [
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_HARASSMENT,
                        threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE),
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_HATE_SPEECH,
                        threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE),
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
                        threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE),
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
                        threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE),
]

# Profile 2: Enterprise support agent - balanced
ENTERPRISE_SAFETY = [
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_HARASSMENT,
                        threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE),
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_HATE_SPEECH,
                        threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE),
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
                        threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE),
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
                        threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE),
]

# Profile 3: Clinical medical service - minimal filtering for accuracy
MEDICAL_SAFETY = [
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_HARASSMENT,
                        threshold=types.HarmBlockThreshold.BLOCK_ONLY_HIGH),
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_HATE_SPEECH,
                        threshold=types.HarmBlockThreshold.BLOCK_ONLY_HIGH),
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
                        threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE),
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
                        threshold=types.HarmBlockThreshold.BLOCK_ONLY_HIGH),
]

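def ask_with_safety(question: str, safety_settings: list,
                    system: str = "You are a helpful assistant.") -> str:
    """Ask one question; return the answer text or a BLOCKED/ERROR marker string."""
    try:
        resp = client.models.generate_content(
            model="gemini-2.0-flash",
            config=types.GenerateContentConfig(
                system_instruction=system,
                safety_settings=safety_settings,
                max_output_tokens=200
            ),
            contents=[question]
        )
        # A blocked prompt can come back with no candidates at all
        if not resp.candidates:
            return "BLOCKED | Prompt rejected before generation"
        candidate = resp.candidates[0]
        # finish_reason and safety_ratings may be unset, so guard both
        if candidate.finish_reason and candidate.finish_reason.name == "SAFETY":
            triggered = [r.category.name for r in (candidate.safety_ratings or [])
                         if r.blocked]
            return "BLOCKED | Categories: " + ", ".join(triggered)
        if not candidate.content or not candidate.content.parts:
            return "ERROR: empty response"
        return candidate.content.parts[0].text
    except Exception as e:
        return "ERROR: " + str(e)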
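The three profile lists above repeat the same four categories and differ only in thresholds. A minimal sketch of one way to cut that repetition is to describe each profile as a default threshold plus per-category overrides. The helper profile_spec and the constants CATEGORIES and THRESHOLDS are hypothetical names introduced here; the sketch works on plain strings so it runs without the SDK, and the conversion to types.SafetySetting shown in the trailing comment assumes the enum members can be looked up by name.

```python
# The four category names and four threshold names used in this article.
CATEGORIES = (
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
)
THRESHOLDS = {
    "BLOCK_NONE",
    "BLOCK_ONLY_HIGH",
    "BLOCK_MEDIUM_AND_ABOVE",
    "BLOCK_LOW_AND_ABOVE",
}


def profile_spec(default: str, overrides: dict = None) -> dict:
    """Return {category: threshold} covering all four categories.

    Hypothetical helper: `default` applies to every category unless
    `overrides` names a different threshold for it.
    """
    overrides = overrides or {}
    unknown = set(overrides) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    spec = {c: overrides.get(c, default) for c in CATEGORIES}
    bad = set(spec.values()) - THRESHOLDS
    if bad:
        raise ValueError(f"unknown thresholds: {sorted(bad)}")
    return spec


# Same shape as ENTERPRISE_SAFETY: medium everywhere, stricter on explicit content.
ENTERPRISE_SPEC = profile_spec(
    "BLOCK_MEDIUM_AND_ABOVE",
    {"HARM_CATEGORY_SEXUALLY_EXPLICIT": "BLOCK_LOW_AND_ABOVE"},
)

# Assumed conversion to SDK objects (requires google-genai to be installed):
# settings = [types.SafetySetting(category=types.HarmCategory[c],
#                                 threshold=types.HarmBlockThreshold[t])
#             for c, t in ENTERPRISE_SPEC.items()]
```

Validating names up front means a typo in a category or threshold fails when the profile is built rather than on the first API request.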

Testing the three safety profiles against medical queries:

med_system = "You are a clinical medical information assistant. Provide accurate medical facts."

test_queries = [
    ("What is the maximum safe dose of paracetamol for adults?", "medical"),
    ("Explain how antidepressants work in the brain.",            "medical"),
    ("What are common side effects of chemotherapy?",            "medical"),
]

print("=== SAFETY PROFILE COMPARISON ===")
for query, qtype in test_queries:
    print("\nQuery:", query[:60])
    print("Children profile: ", ask_with_safety(query, CHILDREN_SAFETY,  med_system)[:100])
    print("Enterprise profile:", ask_with_safety(query, ENTERPRISE_SAFETY, med_system)[:100])
    print("Medical profile:  ", ask_with_safety(query, MEDICAL_SAFETY,   med_system)[:100])

# Handle safety blocks gracefully in a real app: never show raw BLOCKED/ERROR strings to users
def safe_ask(question: str, profile: list, fallback_msg: str) -> str:
    result = ask_with_safety(question, profile)
    if result.startswith(("BLOCKED", "ERROR")):
        return fallback_msg
    return result

print("\n=== GRACEFUL BLOCK HANDLING ===")
print(safe_ask(
    "What is a safe paracetamol dose?",
    CHILDREN_SAFETY,
    "This topic is outside what I can help with here. Please ask a teacher or parent."
))

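It gives output along the following lines, showing how the same query behaves across profiles. The listing is abridged to two of the three queries and wrapped for readability; exact wording and blocking decisions can vary between runs and model versions.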

=== SAFETY PROFILE COMPARISON ===

Query: What is the maximum safe dose of paracetamol for adults?
Children profile:  BLOCKED | Categories: HARM_CATEGORY_DANGEROUS_CONTENT
Enterprise profile: The maximum recommended dose of paracetamol for adults is
                    4,000mg (4g) per day, typically 1,000mg every 4-6 hours...
Medical profile:   Adults: max 4g/day (1g per dose, every 4-6 hours). Reduce to
                   2g/day for patients with hepatic impairment or alcohol use...

Query: What are common side effects of chemotherapy?
Children profile:  BLOCKED | Categories: HARM_CATEGORY_DANGEROUS_CONTENT
Enterprise profile: Common side effects include nausea, fatigue, hair loss...
Medical profile:   Cytotoxic effects: myelosuppression (neutropenia, anemia),
                   GI toxicity (N/V/D), alopecia, mucositis, peripheral...

=== GRACEFUL BLOCK HANDLING ===
This topic is outside what I can help with here. Please ask a teacher or parent.

# Children profile blocks medical content correctly - too risky for that context
# Enterprise profile allows factual medical answers at consumer level
# Medical profile gives clinical-grade detail with technical terminology
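
One caveat of the listing above is that ask_with_safety signals a block through a string prefix, so a legitimate answer that happens to start with the word BLOCKED would be mistaken for a block. A possible alternative, sketched here with hypothetical names (AskResult, render), is to return a small structured result and let the caller decide what to show. This is an illustration of the design choice, not part of the Gemini SDK.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AskResult:
    """Outcome of one request: answer text, blocked categories, or an error."""
    text: Optional[str] = None
    blocked_categories: List[str] = field(default_factory=list)
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        # Usable only if there is text, nothing was blocked, and no error occurred.
        return (self.text is not None
                and not self.blocked_categories
                and self.error is None)


def render(result: AskResult, fallback_msg: str) -> str:
    """Return the answer for the user, or the friendly fallback message."""
    return result.text if result.ok else fallback_msg


if __name__ == "__main__":
    blocked = AskResult(blocked_categories=["HARM_CATEGORY_DANGEROUS_CONTENT"])
    print(render(blocked, "Please ask a teacher or parent."))
    print(render(AskResult(text="BLOCKED shots are a basketball statistic."), "n/a"))
```

With this shape the blocked categories stay available for logging while the user only ever sees either the answer or the fallback message.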

Safety setting selection guide: start with BLOCK_MEDIUM_AND_ABOVE for all categories as your enterprise default, since it blocks clearly harmful content while allowing normal business conversations. Lower to BLOCK_ONLY_HIGH only for professional contexts (medical, legal, security research) where technical accuracy requires discussing sensitive topics. Never use BLOCK_NONE in production consumer applications. Before launch, test your chosen settings against 20-30 queries from your own domain, because safety blocks on legitimate business queries are a significant UX problem. Log every safety block with its category and the user message so you can tune thresholds from real traffic patterns rather than guesses.
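
The pre-launch test and the block logging recommended above can be combined in a small harness. The sketch below is one possible shape, not a prescribed method: block_rate, fake_ask and the logger name safety_blocks are hypothetical, and fake_ask is a stand-in that imitates the "BLOCKED | Categories: ..." strings returned by ask_with_safety so the example runs without an API key.

```python
import logging
from typing import Callable, Iterable, List, Tuple

logger = logging.getLogger("safety_blocks")


def block_rate(queries: Iterable[str],
               ask: Callable[[str], str]) -> Tuple[float, List[str]]:
    """Run each query through `ask`; return (fraction blocked, blocked queries).

    Every block is logged with the marker string (which carries the
    category names) and the query that triggered it.
    """
    blocked: List[str] = []
    total = 0
    for query in queries:
        total += 1
        result = ask(query)
        if result.startswith("BLOCKED"):
            logger.warning("safety block | %s | query=%r", result, query)
            blocked.append(query)
    rate = len(blocked) / total if total else 0.0
    return rate, blocked


def fake_ask(query: str) -> str:
    """Stand-in for a real API call: blocks any query mentioning a dose."""
    if "dose" in query.lower():
        return "BLOCKED | Categories: HARM_CATEGORY_DANGEROUS_CONTENT"
    return "Some answer"


if __name__ == "__main__":
    sample = [
        "How do I reset my password?",
        "What is a safe paracetamol dose?",
        "Where can I download my invoice?",
        "What are your support hours?",
    ]
    rate, blocked = block_rate(sample, fake_ask)
    print(f"block rate: {rate:.0%}, blocked: {blocked}")
```

In a real run, fake_ask would be replaced with something like lambda q: ask_with_safety(q, ENTERPRISE_SAFETY), and the query list with the 20-30 domain queries chosen for the launch test.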
