
Gemini Structured Output with JSON Schema

Author: Venkata Sudhakar

Gemini structured output lets you supply a JSON schema that constrains the model's response. Instead of hoping the model returns valid JSON and writing fragile parsing code, you declare the schema upfront and Gemini constrains generation so the output matches it: correct field names, correct types, no extra text, no markdown fences. This is the right approach whenever the output feeds a downstream system: data extraction pipelines, database insertion, API responses, or any business automation workflow that depends on consistent structure.

Structured output is configured via response_schema in GenerateContentConfig. You define the schema using google.genai.types Schema objects or, in the Python SDK, Pydantic models and standard type annotations. The model returns a JSON string shaped by the schema, so json.loads() normally succeeds without any cleanup. A response that is cut off by the output token limit or blocked by a safety filter can still be incomplete or empty, so keep minimal error handling in production code. Fields can be required or optional, typed as STRING, INTEGER, NUMBER, BOOLEAN, ARRAY, or OBJECT, and nested, although very large or deeply nested schemas may be rejected by the API. Set response_mime_type to application/json alongside the schema to get clean JSON output.
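As a rough illustration of the configuration described above, the schema can be written as a plain dict in the OpenAPI-style form that response_schema accepts. This is a sketch under assumptions: the field names follow the sample output shown later in this article, while the sentiment enum values, the field descriptions, and the helper name build_config are hypothetical choices made for illustration.

```python
# Sketch of a response schema as a plain dict. Field names follow the
# article's sample output; the enum values and descriptions are assumptions.
REVIEW_SCHEMA = {
    "type": "OBJECT",
    "properties": {
        "rating": {"type": "INTEGER", "description": "Star rating out of 5"},
        "sentiment": {
            "type": "STRING",
            "enum": ["positive", "negative", "mixed"],  # assumed label set
        },
        "summary": {"type": "STRING"},
        "pros": {"type": "ARRAY", "items": {"type": "STRING"}},
        "cons": {"type": "ARRAY", "items": {"type": "STRING"}},
        "recommend": {"type": "BOOLEAN"},
        "product_category": {"type": "STRING"},
    },
    # Every field is listed as required so that none may be omitted.
    "required": [
        "rating", "sentiment", "summary", "pros",
        "cons", "recommend", "product_category",
    ],
}


def build_config(schema):
    """Hypothetical helper: pair the schema with the JSON MIME type."""
    return {
        "response_mime_type": "application/json",
        "response_schema": schema,
    }


GENERATION_CONFIG = build_config(REVIEW_SCHEMA)
```

The resulting dict is intended to be passed as the config argument of generate_content, which accepts either a GenerateContentConfig object or an equivalent dict.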

The example below builds a product review extraction pipeline that takes unstructured customer review text and extracts the rating, sentiment, product features mentioned, and recommended improvements as schema-conforming JSON.


Extracting structured data from unstructured review text,
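A minimal sketch of such a pipeline, assuming the google-genai Python SDK. The function name extract_reviews, the prompt wording, and the default model name are assumptions made for illustration, not values taken from this article. The client is passed in as an argument, and the schema is any dict that response_schema accepts.

```python
import json


def extract_reviews(client, reviews, schema, model="gemini-2.0-flash"):
    """Return one parsed dict per review text.

    client is expected to be a google.genai Client; the model name is an
    assumption and should be replaced with one available to your project.
    """
    config = {
        "response_mime_type": "application/json",
        "response_schema": schema,
    }
    results = []
    for text in reviews:
        response = client.models.generate_content(
            model=model,
            # Assumed prompt wording; adjust to your data.
            contents="Extract structured data from this product review:\n\n" + text,
            config=config,
        )
        # response.text holds the JSON string produced under the schema.
        # Error handling is omitted here to keep the sketch short.
        results.append(json.loads(response.text))
    return results


# Usage (requires the google-genai package and an API key in the environment):
#   from google import genai
#   client = genai.Client()
#   records = extract_reviews(client, ["The headphones are brilliant ..."], schema)
```

Passing the client in keeps the function easy to exercise with a stub object before any real API calls are made.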


It produces the following structured output, which conforms to the schema,

Review excerpt: The Sony WH-1000XM5 headphones are absolutely brilliant! ...
Rating:     5 / 5
Sentiment:  positive
Summary:    Exceptional noise-cancelling headphones with outstanding battery life.
Pros:       ["Best-in-class noise cancellation", "30+ hour battery life",
             "Excellent value for Rs 29,990"]
Cons:       ["Bulky carrying case"]
Recommend:  True
Category:   Audio / Headphones

Review excerpt: Samsung Galaxy Tab A9+ is decent but disappointing for the price. ...
Rating:     3 / 5
Sentiment:  mixed
Summary:    A decent tablet for basic use but underperforms for multitasking and gaming.
Pros:       ["Bright and clear display", "Good for video streaming"]
Cons:       ["Sluggish multitasking", "4GB RAM insufficient", "Gaming frame drops"]
Recommend:  False
Category:   Tablets

All 2 reviews extracted - ready for database insertion
Fields guaranteed present: ["rating", "sentiment", "summary", "pros",
                            "cons", "recommend", "product_category"]

# Schema-enforced output: all required fields present, correct types, no markdown
# json.loads() parses each response directly; truncated or empty responses still need handling

Structured output use cases: product data extraction from unstructured supplier sheets, customer feedback categorisation at scale, invoice parsing into structured records, document classification pipelines, and any ETL workflow where LLM output feeds directly into a database or API. Always mark the fields you depend on as required in the schema, because optional fields may be omitted by the model. For nested objects, define a full Schema hierarchy rather than using generic OBJECT types. Test your schema against 10 to 20 real examples before production deployment to verify the model produces sensible values for inferred fields like rating and sentiment.
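The pre-production testing suggested above could be supported by a small checker run over each raw response. The following is only a sketch: the function name check_review is hypothetical, the field list is taken from the sample output in this article, and the 1 to 5 rating range is an assumption based on the "5 / 5" display.

```python
import json

REQUIRED_FIELDS = (
    "rating", "sentiment", "summary", "pros",
    "cons", "recommend", "product_category",
)


def check_review(raw_text):
    """Return (record, problems); record is None if raw_text is not a JSON object."""
    try:
        record = json.loads(raw_text)
    except (TypeError, ValueError) as exc:
        # TypeError covers a missing (None) response body,
        # ValueError covers malformed or truncated JSON.
        return None, [f"not valid JSON: {exc}"]
    if not isinstance(record, dict):
        return None, ["top-level value is not an object"]

    problems = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if name not in record
    ]
    if "rating" in record:
        rating = record["rating"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(rating, int) and not isinstance(rating, bool)
        if not is_int or not 1 <= rating <= 5:  # assumed valid range
            problems.append(f"rating out of range: {rating!r}")
    return record, problems
```

Running this over a batch of real reviews and collecting the non-empty problem lists gives a quick view of where the schema or the prompt needs tightening.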


 
  


  