In Browser
	StumbleUpon
	del.icio.us
	Google
	Google Buzz
	reddit
	LinkedIn

	Facebook
	Twitter
	Linkedin
	E-Mail

Generative AI > Large Language Models > Claude Vision - Analysing Images with Claude

Claude Vision - Analysing Images with Claude

Author: Venkata Sudhakar

Claude can read and reason about images just as naturally as text. You pass an image alongside your text prompt and Claude describes what it sees, extracts information, compares visuals, or answers questions about the content. For businesses this opens up powerful workflows that previously required expensive custom computer vision models: reading handwritten delivery receipts, extracting line items from invoice photos taken on a phone, checking whether a product photo meets listing quality standards, or identifying damage in a customer-submitted claim photo. All of these work with a simple API call - no model training, no labelling data.

Images are passed in the content array of a user message as a block with type "image". You can provide the image as a base64-encoded string (for images stored locally or in your database) or as a URL (for images hosted on a public web server). Claude supports JPEG, PNG, GIF, and WebP. You can include multiple images in a single message - useful for before/after comparisons, multi-page documents, or asking Claude to compare two product photos side by side.

The below example shows three business image tasks: extracting product details from a product listing image, reading a handwritten expense receipt, and checking whether a customer-submitted photo shows actual product damage.

import anthropic
import base64
import httpx

client = anthropic.Anthropic(api_key="your-api-key")

def image_from_url(url: str) -> dict:
    """Fetch image from URL and encode as base64 for Claude."""
    image_data = base64.standard_b64encode(httpx.get(url).content).decode("utf-8")
    # Detect media type from URL extension
    ext = url.split(".")[-1].lower()
    media_type = {"jpg": "image/jpeg", "jpeg": "image/jpeg",
                  "png": "image/png", "webp": "image/webp"}.get(ext, "image/jpeg")
    return {"type": "image", "source": {"type": "base64",
            "media_type": media_type, "data": image_data}}

def image_from_file(path: str) -> dict:
    """Load a local image file and encode as base64."""
    with open(path, "rb") as f:
        image_data = base64.standard_b64encode(f.read()).decode("utf-8")
    ext = path.split(".")[-1].lower()
    media_type = {"jpg": "image/jpeg", "jpeg": "image/jpeg",
                  "png": "image/png", "webp": "image/webp"}.get(ext, "image/jpeg")
    return {"type": "image", "source": {"type": "base64",
            "media_type": media_type, "data": image_data}}

# Task 1: Extract product details from a product photo URL
product_image = image_from_url("https://example.com/laptop-photo.jpg")

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=300,
    messages=[{"role": "user", "content": [
        product_image,
        {"type": "text", "text":
         "This is a product photo submitted by a seller on our marketplace. "
         "Extract: product category, brand (if visible), colour, condition "
         "(new/used/damaged), and whether the photo quality is good enough "
         "for a product listing (lighting, focus, background). "
         "Reply in JSON format."}
    ]}]
)
print("=== Product Photo Analysis ===")
print(response.content[0].text)

It gives the following output,

=== Product Photo Analysis ===
{
  "category": "Laptop",
  "brand": "Dell",
  "colour": "Silver",
  "condition": "New",
  "listing_quality": {
    "acceptable": true,
    "lighting": "Good - evenly lit, no harsh shadows",
    "focus": "Sharp",
    "background": "Clean white background - ideal for marketplace listing"
  }
}

# Task 2: Read a handwritten expense receipt photo
# (using a locally saved receipt image)
receipt_image = image_from_file("expense_receipt.jpg")

receipt_response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=400,
    system="You extract expense data from receipt photos for a corporate expense system. "
           "Always return valid JSON. If a field is not readable, use null.",
    messages=[{"role": "user", "content": [
        receipt_image,
        {"type": "text", "text":
         "Extract all expense details from this receipt. Return JSON with: "
         "vendor_name, date (YYYY-MM-DD), total_amount, currency, "
         "items (list of {description, amount}), payment_method."}
    ]}]
)
print("=== Receipt Extraction ===")
print(receipt_response.content[0].text)

# Task 3: Damage assessment for insurance/returns
damage_response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=200,
    messages=[{"role": "user", "content": [
        image_from_file("customer_claim_photo.jpg"),
        {"type": "text", "text":
         "A customer has submitted this photo claiming their delivered product "
         "is damaged. Assess: Is damage visible? Describe it. Does it look "
         "like shipping damage or pre-existing? Recommend: APPROVE_CLAIM, "
         "REQUEST_MORE_PHOTOS, or REJECT_CLAIM with brief reason."}
    ]}]
)
print("\n=== Damage Assessment ===")
print(damage_response.content[0].text)

It gives the following output,

=== Receipt Extraction ===
{
  "vendor_name": "Taj Darbar Restaurant",
  "date": "2025-03-18",
  "total_amount": 2340.00,
  "currency": "INR",
  "items": [
    {"description": "Butter Chicken", "amount": 480},
    {"description": "Dal Makhani",     "amount": 320},
    {"description": "Garlic Naan x3", "amount": 180},
    {"description": "Soft Drinks x4", "amount": 280},
    {"description": "Service Charge", "amount": 126},
    {"description": "GST 18%",        "amount": 954}
  ],
  "payment_method": "Credit Card"
}

=== Damage Assessment ===
Damage is clearly visible: a deep crack runs across the top-right corner of
the laptop lid. The crack pattern and scuff marks on the packaging are
consistent with an impact during shipping rather than pre-existing damage.
RECOMMENDATION: APPROVE_CLAIM - damage is genuine and shipping-related.

Vision tasks where Claude adds immediate business value: reading handwritten forms and receipts (eliminates manual data entry), quality-checking product photos before marketplace listing (catches blurry or inappropriate images automatically), processing insurance claim photos (triage before a human adjuster sees them), extracting text from scanned documents and business cards, and comparing before/after photos for maintenance or renovation projects. For high-volume image processing, use the Batch API covered in Tutorial 286 to process thousands of images overnight at 50% cost reduction.

Send your comments, suggestions or queries regarding this site to [email protected].