Claude Batch API

Author: Venkata Sudhakar

The Claude Batch API (Message Batches) processes large volumes of requests asynchronously at half the cost of real-time API calls. Instead of sending requests one at a time and waiting for each response, you submit many requests in a single call, Anthropic processes them in the background, and you retrieve all the results once the batch has ended. At the time of writing, Anthropic's documentation puts the limit at 100,000 requests or 256 MB per batch, whichever is reached first. Most batches finish in under an hour, and any request still unprocessed after 24 hours expires. This is ideal for any workload where you do not need an immediate response: processing every invoice received today, analysing every customer review from the past week, generating product descriptions for your entire catalogue, or running a quality check on all support tickets.

Each request in the batch has a custom_id you choose (your database record ID works well, provided it is unique within the batch), plus the same parameters as a regular messages call. The batch returns a results file in which each line is a JSON object containing your custom_id and a result: succeeded with the model response, errored with the error details, or canceled or expired if the request was never processed. Results are not guaranteed to come back in submission order, so always match responses to your original records by custom_id. The same ID tells you which requests failed and need a retry.
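To make these shapes concrete, here is a minimal offline sketch. The field names follow the description above; the two result lines are hand-written samples for illustration, not real API output, and the helper name index_results is hypothetical.

```python
import json

# One batch entry: your own ID plus ordinary Messages parameters.
request_entry = {
    "custom_id": "INV-001",
    "params": {
        "model": "claude-haiku-4-5",
        "max_tokens": 300,
        "messages": [{"role": "user", "content": "Extract the invoice fields..."}],
    },
}

# Hand-written sample result lines (illustrative only, not real API output).
SAMPLE_RESULT_LINES = [
    '{"custom_id": "INV-001", "result": {"type": "succeeded", "message": '
    '{"content": [{"type": "text", "text": "{\\"total_inr\\": 10325}"}]}}}',
    '{"custom_id": "INV-002", "result": {"type": "errored", "error": '
    '{"type": "error", "error": {"type": "invalid_request_error", '
    '"message": "example failure"}}}}',
]


def index_results(lines):
    """Split result lines into successes (custom_id -> text) and IDs to retry."""
    succeeded, retry_ids = {}, []
    for line in lines:
        record = json.loads(line)
        result = record["result"]
        if result["type"] == "succeeded":
            succeeded[record["custom_id"]] = result["message"]["content"][0]["text"]
        else:
            # errored, canceled or expired: no usable response for this record
            retry_ids.append(record["custom_id"])
    return succeeded, retry_ids


succeeded, retry_ids = index_results(SAMPLE_RESULT_LINES)
print(succeeded)
print(retry_ids)
```

Keying everything on custom_id means the order of the result lines does not matter, and the retry list can be fed straight into a follow-up batch.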

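The example below shows an accounts payable team processing the vendor invoices received that day (around 500 on a typical day), extracting key fields from each invoice overnight so the finance team has a clean structured report waiting in the morning. To keep the listing short, the script uses three sample invoices; the cost comparison at the end assumes the full 500.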

import anthropic
import json
import time

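# With no arguments, the client reads the ANTHROPIC_API_KEY environment variable,
# which keeps the key out of source control.
client = anthropic.Anthropic()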

# Simulated vendor invoices received today (in production: loaded from email/storage)
INVOICES = [
    {"id": "INV-001", "text": "Invoice from TechSupply Co. Date: 15 Mar 2025. "
     "Items: 10x USB-C cables Rs 450 each, 5x HDMI adapters Rs 850 each. "
     "Total: Rs 8,750. Due: 15 Apr 2025. GST 18%: Rs 1,575. Grand total: Rs 10,325."},
    {"id": "INV-002", "text": "Canteen Services Invoice. Month: March 2025. "
     "Daily meals 22 working days x 45 staff x Rs 120 = Rs 118,800. "
     "GST 5%: Rs 5,940. Net payable: Rs 124,740. Due on receipt."},
    {"id": "INV-003", "text": "CloudHost India Pvt Ltd. Invoice #CH-2025-0341. "
     "Annual server hosting plan. Period: Apr 2025 - Mar 2026. "
     "Amount: Rs 2,40,000. GST 18%: Rs 43,200. Total: Rs 2,83,200. "
     "Payment terms: 30 days."},
]

EXTRACT_PROMPT = """Extract invoice fields as JSON. Return only valid JSON, no other text.
Required fields: vendor_name, invoice_date (YYYY-MM-DD or null), 
description (brief), subtotal_inr, gst_inr, total_inr, due_date (YYYY-MM-DD or null), 
payment_terms.
Invoice text: """

# Build batch requests - one per invoice.
# Each entry pairs your custom_id with ordinary Messages API parameters.
batch_requests = [
    {
        "custom_id": inv["id"],        # Your ID - returned in results
        "params": {
            "model":      "claude-haiku-4-5",
            "max_tokens": 300,
            "system":     "You extract structured data from invoices. Return only valid JSON.",
            "messages":   [{"role": "user", "content": EXTRACT_PROMPT + inv["text"]}],
        },
    }
    for inv in INVOICES
]

print(f"Submitting batch of {len(batch_requests)} invoices...")
batch = client.messages.batches.create(requests=batch_requests)
print(f"Batch ID: {batch.id}")
print(f"Status: {batch.processing_status}")
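Because results are matched on custom_id, a duplicate ID in the batch would make that matching ambiguous. A small pre-submission check, sketched below, catches this early. It assumes the entries are dictionaries with a custom_id key; the helper name find_duplicate_ids and the sample data are hypothetical.

```python
from collections import Counter


def find_duplicate_ids(requests):
    """Return custom_id values that appear more than once, in sorted order."""
    counts = Counter(req["custom_id"] for req in requests)
    return sorted(cid for cid, n in counts.items() if n > 1)


# Hypothetical sample with one repeated ID.
sample = [
    {"custom_id": "INV-001"},
    {"custom_id": "INV-002"},
    {"custom_id": "INV-001"},
]
duplicates = find_duplicate_ids(sample)
print(duplicates)
```

If the list is non-empty, fix the source records before calling client.messages.batches.create rather than after the batch has been paid for.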

Polling for completion and retrieving results:

# Poll until the batch ends (in production, check from a scheduled job
# rather than blocking in a loop, and use a longer interval)
while True:
    batch = client.messages.batches.retrieve(batch.id)
    print(f"Status: {batch.processing_status} | "
          f"Completed: {batch.request_counts.succeeded} | "
          f"Errored: {batch.request_counts.errored}")
    if batch.processing_status == "ended":
        break
    time.sleep(10)  # poll every 10 seconds

# Process results - each result matches back to your original invoice by custom_id
print("\n=== INVOICE EXTRACTION RESULTS ===")
total_payable = 0

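for result in client.messages.batches.results(batch.id):
    if result.result.type == "succeeded":
        raw = result.result.message.content[0].text
        try:
            data = json.loads(raw)
            # Read every field first so a bad record never half-updates the total
            vendor, desc = data["vendor_name"], data["description"]
            total, due = data["total_inr"], data["due_date"]
            total_payable += total
            print(f"[{result.custom_id}] {vendor}")
            print(f"  Description: {desc}")
            print(f"  Total: Rs {total:,} | Due: {due or 'On receipt'}")
        except (json.JSONDecodeError, KeyError, TypeError, ValueError):
            # Malformed JSON, a missing field, or a non-numeric total
            print(f"[{result.custom_id}] Parse error - needs manual review")
    elif result.result.type == "errored":
        print(f"[{result.custom_id}] ERROR: {result.result.error.error.message}")
    else:
        # "canceled" or "expired": no response and no error details
        print(f"[{result.custom_id}] {result.result.type.upper()} - resubmit")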

print(f"\nTotal payable this batch: Rs {total_payable:,}")

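# Cost comparison (rough estimate)
NUM_INVOICES     = 500   # full day volume
TOKENS_PER_INV   = 400   # assumed average tokens per invoice
PRICE_PER_MTOK   = 0.80  # illustrative blended USD rate per million tokens; check current pricing
REAL_TIME_COST   = (NUM_INVOICES * TOKENS_PER_INV / 1_000_000) * PRICE_PER_MTOK
BATCH_COST       = REAL_TIME_COST * 0.50  # 50% discount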
print(f"\nCost to process {NUM_INVOICES} invoices:")
print(f"Real-time API: ${REAL_TIME_COST:.2f}")
print(f"Batch API:     ${BATCH_COST:.2f} (50% saving)")

The script produces output similar to the following (the extracted values depend on the model's response):

Submitting batch of 3 invoices...
Batch ID: msgbatch_01XyZ...
Status: in_progress

Status: in_progress | Completed: 0 | Errored: 0
Status: in_progress | Completed: 2 | Errored: 0
Status: ended | Completed: 3 | Errored: 0

=== INVOICE EXTRACTION RESULTS ===
[INV-001] TechSupply Co.
  Description: USB-C cables and HDMI adapters
  Total: Rs 10,325 | Due: 2025-04-15

[INV-002] Canteen Services
  Description: March 2025 employee canteen meals
  Total: Rs 124,740 | Due: On receipt

[INV-003] CloudHost India Pvt Ltd
  Description: Annual server hosting Apr 2025 - Mar 2026
  Total: Rs 283,200 | Due: 2025-04-14

Total payable this batch: Rs 418,265

Cost to process 500 invoices:
Real-time API: $0.16
Batch API:     $0.08 (50% saving)
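The estimate above uses a single rate for all tokens. Input and output tokens are usually priced differently, so a slightly more careful version, sketched below, takes both rates as parameters. The token counts and the rates passed in are assumptions chosen for illustration, not quoted prices, and the function name batch_cost_usd is hypothetical.

```python
def batch_cost_usd(num_requests, input_tokens, output_tokens,
                   input_rate, output_rate, batch_discount=0.50):
    """Estimate (real_time, batch) cost in USD.

    Token counts are per request; rates are USD per million tokens.
    """
    real_time = num_requests * (
        input_tokens * input_rate + output_tokens * output_rate
    ) / 1_000_000
    return real_time, real_time * (1 - batch_discount)


# Assumed figures for illustration only; check the current pricing page.
real_time, batch = batch_cost_usd(
    num_requests=500, input_tokens=300, output_tokens=100,
    input_rate=1.00, output_rate=5.00,
)
print(f"Real-time: ${real_time:.2f} | Batch: ${batch:.2f}")
```

With these assumed inputs the arithmetic is 500 x (300 x 1.00 + 100 x 5.00) / 1,000,000 = $0.40 real-time and $0.20 batched. The absolute figures change with the rates, but the 50% ratio does not.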

Batch processing is the right choice whenever immediacy is not required: nightly processing of all orders received that day, weekly sentiment analysis of all customer reviews, monthly processing of expense reports, and any ETL pipeline where AI enriches records before they go into a data warehouse. The 24-hour processing window is more than adequate for most business batch jobs. For workloads that exceed the per-batch limit (100,000 requests or 256 MB at the time of writing), split the records into multiple batches and submit them separately. Several batches can be in progress at once, so total wall-clock time stays manageable.
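Splitting a large workload comes down to slicing the request list into groups no larger than whatever per-batch limit applies. A minimal sketch follows; the helper name chunk is hypothetical, and the tiny size of 10 is used only so the result is easy to read.

```python
def chunk(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 25 placeholder requests split into groups of at most 10.
all_requests = [{"custom_id": f"REC-{i:06d}"} for i in range(25)]
groups = list(chunk(all_requests, 10))
print([len(g) for g in groups])
```

Each group would then be passed to client.messages.batches.create(requests=group) as in the example above, and the returned batch IDs stored so that every batch can be polled and its results collected.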
