
LLM Structured Output with Instructor and Pydantic

Author: Venkata Sudhakar

ShopMax India extracts structured data from customer reviews, support tickets, and product descriptions using LLMs. Raw LLM text output is unreliable for downstream processing. The Instructor library wraps the OpenAI and Anthropic SDKs to guarantee Pydantic-validated JSON output, making structured extraction production-ready without custom parsing logic.

Instructor patches the LLM client to accept a response_model parameter: a Pydantic class defining the expected schema. It constructs the function-call schema automatically, parses the response, and retries on validation errors. This removes the need for custom JSON parsing and transparently handles edge cases such as missing fields or wrong types.

The example below shows how ShopMax India extracts structured return request details from customer support messages using Instructor and Pydantic.


Running the extraction on a sample support message gives the following output:

{
  "order_id": "ORD-28847",
  "product_name": "Sony WH-1000XM5",
  "reason": "Left ear cup stopped working after 3 days",
  "city": "Delhi",
  "refund_amount_rs": 28000,
  "priority": "high"
}

Instructor automatically retries on validation errors up to a configurable max_retries limit; at ShopMax India, set max_retries=2 and log failed extractions for manual review. Use Optional fields for data that may not always be present in customer messages. Instructor supports Anthropic Claude and Google Gemini in addition to OpenAI, so you can switch providers without changing your Pydantic schema definitions.
