Document Intelligence Agent - Q and A over PDFs
Author: Venkata Sudhakar
Enterprise teams spend hours reading through lengthy reports, contracts, and policy documents to find specific information. A document intelligence agent powered by Gemini can ingest an entire PDF, understand its structure, and answer precise questions with direct citations, functioning like a personal research assistant for any document.
In this tutorial, we build a document Q and A agent for ShopMax India. The agent ingests a business document (such as an annual report or vendor contract) using the Gemini Files API, then answers questions about it with page references and quoted evidence.
The example below shows the agent processing a vendor contract PDF and answering compliance questions with citations.
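A minimal sketch of such an agent, assuming the google-genai Python SDK (`pip install google-genai`) and an API key in the `GEMINI_API_KEY` environment variable. The file name `vendor_contract.pdf`, the model name, the prompt wording, and the helper names `build_contents`, `ask`, and `main` are illustrative assumptions, not part of the Gemini API.

```python
import os

# Assumed model name; substitute any Gemini model that accepts PDF input.
MODEL = "gemini-2.0-flash"

# Assumed instruction text that asks for clause citations and quoted evidence.
INSTRUCTIONS = (
    "You are a document analyst. Answer only from the attached document. "
    "Cite the clause or page number and quote the supporting text. "
    "If the document does not contain the answer, say so."
)


def build_contents(doc_part, question):
    """Pair the uploaded document with one question for a single request."""
    return [doc_part, f"{INSTRUCTIONS}\n\nQuestion: {question}"]


def ask(client, doc_part, question, model=MODEL):
    """Send the document and the question to Gemini and return the answer text."""
    response = client.models.generate_content(
        model=model,
        contents=build_contents(doc_part, question),
    )
    return response.text


def main(pdf_path="vendor_contract.pdf"):
    # Imported here so the helpers above can be used without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

    print("Uploading document to Gemini Files API...")
    uploaded = client.files.upload(file=pdf_path)
    print(f"File uploaded: {uploaded.name}")
    print(f"File URI: {uploaded.uri}")

    # Reference the uploaded file by URI so it can be reused across questions.
    doc_part = types.Part.from_uri(file_uri=uploaded.uri, mime_type="application/pdf")

    questions = [
        "What is the contract value and payment terms?",
        "What are the penalty clauses for delivery delays?",
        "What is the dispute resolution mechanism?",
    ]
    for question in questions:
        print(f"Q: {question}")
        print(f"A: {ask(client, doc_part, question)}")
```

Calling `main()` uploads the PDF once and reuses the same `doc_part` for every question, so the document is not re-sent as raw bytes on each request.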
It produces the following output:
Uploading document to Gemini Files API...
File uploaded: files/abc123xyz
File URI: https://generativelanguage.googleapis.com/v1beta/files/abc123xyz
Q: What is the contract value and payment terms?
A: Per Clause 4.1 (Financial Terms): The total contract value is Rs 48,00,000
(Forty-Eight Lakhs) for Q1 2025. Payment terms are net-30 from invoice date.
Clause 4.3 specifies: "Invoices must be submitted within 7 days of delivery
acceptance. Payment shall be made within 30 calendar days of invoice receipt."
Q: What are the penalty clauses for delivery delays?
A: Clause 8.2 (Liquidated Damages) states: "In the event of delivery delay
beyond the agreed schedule, the Vendor shall be liable for liquidated damages
at the rate of 0.5% of the delayed order value per day, subject to a maximum
cap of 10% of the total contract value."
For this contract, the maximum penalty is Rs 4,80,000.
Q: What is the dispute resolution mechanism?
A: Clause 14 (Dispute Resolution) specifies a three-tier process:
1. Amicable settlement within 30 days of written notice
2. Mediation through Mumbai Centre for International Arbitration (MCIA)
3. Arbitration under Indian Arbitration and Conciliation Act 1996
Jurisdiction: Mumbai High Court. Governing law: Laws of India.
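When the session is over, the uploaded file can be deleted explicitly rather than left to expire. One way to make sure this happens even if a question fails is a context manager; the sketch below assumes the google-genai SDK's `client.files.upload` and `client.files.delete` calls, and the name `uploaded_document` is a hypothetical helper.

```python
from contextlib import contextmanager


@contextmanager
def uploaded_document(client, pdf_path):
    """Upload a PDF, yield the file handle, and always delete it afterwards."""
    uploaded = client.files.upload(file=pdf_path)
    try:
        yield uploaded
    finally:
        # Runs on normal exit and on exceptions, so no file is left behind.
        client.files.delete(name=uploaded.name)
        print(f"File {uploaded.name} deleted from Files API")
```

Inside `with uploaded_document(client, "vendor_contract.pdf") as uploaded:` the file is available for questions, and it is removed as soon as the block exits.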
The Files API cleanup step prints the following:
File files/abc123xyz deleted from Files API
The Gemini Files API accepts files up to 2 GB each and retains them for 48 hours, after which they are deleted automatically. For persistent document stores, re-upload files at session start or use Vertex AI RAG Engine for long-term indexed document retrieval. To handle multi-document questions (for example, comparing two contracts), pass multiple doc_part objects in the same contents list so that Gemini can reason across all of them in a single request.
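The multi-document case can be sketched as a small helper that places every document part ahead of the question. The function name `build_comparison_contents` and the prompt wording are assumptions for illustration; the returned list is intended to be passed as `contents` to `client.models.generate_content`.

```python
def build_comparison_contents(doc_parts, question):
    """Place every document part before the question so one request covers all of them."""
    if len(doc_parts) < 2:
        raise ValueError("comparison needs at least two documents")
    # Label the documents by position so the answer can say which one it is citing.
    labels = ", ".join(f"Document {i}" for i in range(1, len(doc_parts) + 1))
    prompt = (
        f"The attached files are, in order: {labels}. "
        "Answer by comparing them and cite the document and clause for each point.\n\n"
        f"Question: {question}"
    )
    return [*doc_parts, prompt]
```

For two contracts uploaded as `doc_part_a` and `doc_part_b`, the call would be `build_comparison_contents([doc_part_a, doc_part_b], "Which contract has the stricter delay penalty?")`.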