
OpenAI Moderation API - Content Safety Filtering

Author: Venkata Sudhakar

The OpenAI Moderation API detects harmful content in text across categories such as hate speech, harassment, self-harm, sexual content, and violence. ShopMax India uses the Moderation API to automatically screen customer-submitted product reviews and Q&A posts before they appear on the website, preventing abusive or off-topic content from reaching other shoppers.

The API is called via client.moderations.create(), which accepts input (a string or a list of strings) and an optional model (omni-moderation-latest for the most accurate results, or text-moderation-latest for faster, lower-cost screening). Each result in the response includes a flagged boolean, a categories object of per-category booleans showing which categories were triggered, and category_scores with confidence values between 0 and 1 for each category.

The example below shows ShopMax India screening a batch of customer-submitted product reviews, flagging harmful ones, and logging the results for a content moderation queue.
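The original code listing was not preserved, so the following is an illustrative reconstruction that matches the sample output shown below. The review IDs and texts are made up, and the call to model_dump(by_alias=True) assumes a recent openai Python SDK where category names use the API's JSON keys (e.g. "harassment/threatening"):

```python
def triggered_categories(categories):
    """Names of the categories flagged True, in sorted order."""
    return sorted(name for name, hit in categories.items() if hit)

def report(review_id, flagged, categories):
    """Format one moderation verdict as log lines for the review queue."""
    lines = [f"{review_id}: {'FLAGGED' if flagged else 'APPROVED'}"]
    if flagged:
        lines.append(f"  Categories: {triggered_categories(categories)}")
    return "\n".join(lines)

def screen_reviews(reviews):
    """Send a batch of review texts to the Moderation API and print verdicts."""
    from openai import OpenAI  # local import keeps the helpers above usable offline
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    print(f"Screening {len(reviews)} reviews...\n")
    response = client.moderations.create(
        model="omni-moderation-latest",
        input=[r["text"] for r in reviews],
    )
    for review, result in zip(reviews, response.results):
        # by_alias=True is assumed to yield the API's JSON key names
        # (e.g. "harassment/threatening"); adjust for your SDK version.
        print(report(review["id"], result.flagged,
                     result.categories.model_dump(by_alias=True)))

reviews = [
    {"id": "REV-1001", "text": "Great phone, the battery easily lasts two days."},
    {"id": "REV-1002", "text": "(abusive review text)"},
    {"id": "REV-1003", "text": "Delivery was late but the product works fine."},
    {"id": "REV-1004", "text": "(threatening review text)"},
]
# screen_reviews(reviews)  # requires a valid OpenAI API key
```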


It gives the following output:

Screening 4 reviews...

REV-1001: APPROVED
REV-1002: FLAGGED
  Categories: ['harassment']
REV-1003: APPROVED
REV-1004: FLAGGED
  Categories: ['harassment', 'harassment/threatening']

Some best practices for using the Moderation API:

- Use omni-moderation-latest for user-generated content where accuracy matters, and text-moderation-latest for high-volume pre-screening to reduce latency and cost.
- Never block content based on moderation scores alone: route flagged items to a human review queue for final decisions to avoid false positives.
- Store category_scores alongside flagged content so moderators can see confidence levels.
- For multilingual platforms, pass content in the original language, as the API handles multiple languages.
- Set score thresholds per category based on your platform policy rather than relying solely on the flagged boolean.
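Per-category thresholds might be applied as in the sketch below. The threshold values are illustrative policy choices, not OpenAI recommendations, and the route function deliberately never auto-blocks, returning items for human review instead:

```python
# Hypothetical per-category thresholds set by platform policy.
THRESHOLDS = {
    "harassment": 0.5,
    "harassment/threatening": 0.3,
    "hate": 0.4,
    "violence": 0.6,
}

def route(category_scores, thresholds=THRESHOLDS):
    """Return ('publish', {}) or ('review', offending_scores).

    category_scores: dict of category name -> confidence (0 to 1), as
    returned in the API's category_scores object. Categories without a
    configured threshold default to 1.0 and therefore never trigger.
    """
    over = {cat: score for cat, score in category_scores.items()
            if score >= thresholds.get(cat, 1.0)}
    if not over:
        return "publish", {}
    return "review", over  # humans make the final call, never auto-block
```

For example, route({"harassment": 0.9, "hate": 0.1}) sends the item to review because the harassment score exceeds its 0.5 threshold, while route({"harassment": 0.1}) publishes directly.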
