Gemini Batch Processing
Author: Venkata Sudhakar
Gemini Batch Processing allows you to submit large volumes of requests asynchronously and retrieve the results once the job completes. Instead of sending thousands of individual API calls, batch mode accepts a JSONL input file, processes all requests in the background, and writes the results to Cloud Storage. ShopMax India uses batch processing to generate product descriptions for its entire 10,000-item catalogue overnight.

Batch jobs are submitted via the Vertex AI Batch Prediction API using a Gemini model endpoint, and each request in the JSONL file is processed independently of the others. Batch mode is priced up to 50% lower than online inference and is not subject to the same rate-limit pressure, making it ideal for large-scale content generation, classification, and analysis tasks. The example below shows how to prepare a JSONL batch input file.
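A minimal sketch of the input-preparation step, assuming the Vertex AI Gemini batch input format in which each JSONL line wraps a GenerateContent-style body in a "request" field; the product names and the prompt wording are illustrative:

```python
# Build a JSONL batch input file for Vertex AI Gemini batch prediction.
# Each line is one independent request in the "request" wrapper format.
import json

# Illustrative catalogue items (real runs would pull from the product database).
products = [
    "Wireless Bluetooth Earbuds",
    "Stainless Steel Water Bottle 1L",
    "Cotton Bedsheet Set, King Size",
]

with open("batch_input.jsonl", "w") as f:
    for name in products:
        request = {
            "request": {
                "contents": [
                    {
                        "role": "user",
                        "parts": [
                            {"text": f"Write a 50-word product description for: {name}"}
                        ],
                    }
                ]
            }
        }
        f.write(json.dumps(request) + "\n")

print(f"Batch input file created with {len(products)} requests")
```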
It gives the following output:
Batch input file created with 3 requests
The example below uploads the input file to Cloud Storage, submits the batch job, and polls until the results are available.
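One way to sketch this with the vertexai SDK's BatchPredictionJob.submit helper. The bucket, project, region, and model version below are assumptions, and the cloud calls require the google-cloud-storage and google-cloud-aiplatform packages plus valid GCP credentials, so they are deferred into a function:

```python
# Sketch: upload the JSONL input, submit a Gemini batch job, and poll to
# completion. Bucket/project/model names are illustrative assumptions.
import time

BUCKET = "shopmax-batch-jobs"
INPUT_URI = f"gs://{BUCKET}/inputs/batch_input.jsonl"
OUTPUT_PREFIX = f"gs://{BUCKET}/outputs"


def run_batch_job() -> None:
    # Imports are deferred so the module loads without GCP libraries installed.
    import vertexai
    from google.cloud import storage
    from vertexai.batch_prediction import BatchPredictionJob

    vertexai.init(project="shopmax-prod", location="us-central1")  # assumed project

    # 1. Upload the prepared JSONL input file to Cloud Storage.
    storage.Client().bucket(BUCKET).blob(
        "inputs/batch_input.jsonl"
    ).upload_from_filename("batch_input.jsonl")

    # 2. Submit the batch prediction job against a Gemini model.
    job = BatchPredictionJob.submit(
        source_model="gemini-1.5-flash-002",
        input_dataset=INPUT_URI,
        output_uri_prefix=OUTPUT_PREFIX,
    )
    print(f"Job submitted: {job.name}")  # resource name of the job

    # 3. Poll until the job reaches a terminal state.
    while not job.has_ended:
        print(f"State: {job.state.name}")
        time.sleep(60)
        job.refresh()

    if job.has_succeeded:
        print(f"Completed. Output: {job.output_location}/")
    else:
        print(f"Failed: {job.error}")


if __name__ == "__main__":
    run_batch_job()
```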
It gives the following output:
Job submitted: shopmax-product-descriptions
State: JOB_STATE_RUNNING
...
Completed. Output: gs://shopmax-batch-jobs/outputs/prediction-shopmax-product-descriptions/
ShopMax India runs the batch job nightly at 2 AM to generate descriptions for new catalogue additions and refresh existing ones. Processing 10,000 products takes under 90 minutes and, at batch pricing, costs 50% less than the equivalent online API calls. Results are written directly to Cloud Storage and picked up by the product database pipeline at 4 AM, before the business day begins.
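The 4 AM pipeline step amounts to a small parser over the batch output JSONL, which pairs each original request with the model response on one line. The sample line below is fabricated for illustration, and the exact response shape should be checked against the Vertex AI batch output documentation:

```python
# Sketch: extract generated descriptions from one line of the batch output
# JSONL. The sample line is fabricated; real output comes from the job's
# Cloud Storage output prefix.
import json

sample_line = json.dumps({
    "request": {
        "contents": [
            {"role": "user", "parts": [{"text": "Describe: Water Bottle"}]}
        ]
    },
    "response": {
        "candidates": [
            {
                "content": {
                    "role": "model",
                    "parts": [{"text": "A durable 1L stainless steel bottle."}],
                }
            }
        ]
    },
})


def extract_description(line: str) -> str:
    """Pull the generated text out of one output JSONL line."""
    record = json.loads(line)
    candidate = record["response"]["candidates"][0]
    return candidate["content"]["parts"][0]["text"]


print(extract_description(sample_line))
```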