OpenAI Vision API - Analysing Images with GPT-4o
Author: Venkata Sudhakar
GPT-4o and GPT-4o-mini support vision input: you can pass images directly in the API request alongside text. This opens up powerful use cases for e-commerce, such as automated product listing from photos, damage assessment for return processing, label reading for inventory management, and visual quality control.

The OpenAI Vision API accepts images either as URLs or as base64-encoded data. Vision requests use the same Chat Completions API endpoint as text requests: you add image_url content blocks to the user message alongside the text. The model processes both simultaneously, enabling questions like "What is the condition of this product?" with an attached photo, without any custom computer vision pipeline.

The example below uses the OpenAI Vision API to automate two ShopMax India workflows: extracting product details from a product image URL and assessing return eligibility from a damage photo.
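A minimal sketch of these two workflows, assuming the official openai Python SDK (v1.x) and an OPENAI_API_KEY in the environment. The image URLs and prompt wording here are placeholders, not the original code:

```python
# Sketch of the two ShopMax workflows: one helper builds a vision message,
# a second sends it through the Chat Completions endpoint.

def build_vision_message(prompt: str, image_url: str, detail: str = "auto") -> dict:
    """Combine a text block and an image_url block in one user message."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url, "detail": detail}},
        ],
    }

def analyse(prompt: str, image_url: str, model: str = "gpt-4o-mini") -> str:
    """Send a vision request and return the model's text reply."""
    from openai import OpenAI  # lazy import; requires `pip install openai`
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=[build_vision_message(prompt, image_url)],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # Placeholder URLs -- substitute real product and damage photos.
    print("Product Details:")
    print(analyse(
        "Extract the product name, category, key features, and a suggested "
        "price range in INR from this product photo.",
        "https://example.com/product.jpg",
    ))
    print("Return Assessment:")
    print(analyse(
        "Assess this returned item's condition (MINT/GOOD/DAMAGED) and state "
        "whether it qualifies for a full return, with a short justification.",
        "https://example.com/damage.jpg",
    ))
```

The same build_vision_message helper serves both workflows; only the prompt text and image URL change between the product-listing call and the return-assessment call.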
Running the example produces output like the following:
Product Details:
Product: Decorative graphic icon
Category: Digital asset / sticker
Features: Circular design, yellow colour, simple expression
Suggested price range: Rs 0 - Rs 50 (digital download)
Return Assessment:
Condition: MINT
Justification: The item shows no visible damage, scratches,
or wear - it appears to be in perfect original condition
and qualifies for a full return under ShopMax policy.
Replace the Wikipedia image URLs with real product and damage photos from your ShopMax inventory system or customer uploads. For production use, pass base64-encoded images for private product photos that are not publicly accessible. Use detail="high" for images requiring fine-grained analysis like serial numbers or labels, and detail="low" for general product categorisation to reduce token costs.
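For the base64 path, one possible sketch is below. It encodes a local file as a data URL in the same image_url block shape; the file path, prompt text, and helper names are illustrative, not from the original:

```python
import base64
import mimetypes

def image_to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL the Vision API accepts."""
    mime = mimetypes.guess_type(path)[0] or "image/jpeg"
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"

def damage_message(path: str, detail: str = "high") -> dict:
    """Build a return-assessment message; detail='high' suits close inspection."""
    return {
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Assess the condition of this returned item for ShopMax."},
            {"type": "image_url",
             "image_url": {"url": image_to_data_url(path), "detail": detail}},
        ],
    }
```

Because the encoded image travels inside the request body, this works for private customer uploads with no public URL; note that detail="high" images consume more tokens than detail="low".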