Vertex AI Model Garden with ADK
Author: Venkata Sudhakar
Vertex AI Model Garden is a curated catalogue of foundation models available on Google Cloud - not just Gemini, but also Meta Llama, Mistral, Anthropic Claude, and many others. For ADK agents this means you are not locked to Gemini: you can build an ADK agent that uses Llama 3 for one task and Gemini for another, or swap the underlying model without changing any tool or session code. Model Garden provides unified endpoints for all models, IAM-controlled access, and enterprise billing through your GCP project.

ADK agents accept any model string that the Vertex AI SDK supports. For Model Garden models you use the full Vertex AI endpoint string or the model publisher path. LiteLLM is the recommended adapter for using non-Gemini models in ADK - it provides a unified interface, so ADK can call Llama, Mistral, or Claude through the same generate_content pattern. Once configured, swapping models is a one-line change in the Agent definition, with no changes to tools, sessions, callbacks, or deployment code.

The example below builds an ADK agent comparison framework that runs the same customer service query through Gemini, Llama 3, and Claude on Vertex AI, measuring response quality and latency to help you choose the right model for your production use case.
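A minimal sketch of the one-line model swap, assuming the google-adk package and its LiteLlm wrapper (the import path follows the ADK docs; the Llama and Claude model identifiers are illustrative and should be checked against your project's Model Garden catalogue):

```python
# Sketch: the underlying model is a one-line change in the Agent definition.
# Gemini models are passed as plain strings; non-Gemini Model Garden models
# go through the LiteLlm wrapper, which gives ADK a unified interface.
# Model identifiers below are assumptions - verify them in your project.
GEMINI = "gemini-2.0-flash"
LLAMA = "vertex_ai/meta/llama3-70b-instruct-maas"
CLAUDE = "vertex_ai/claude-3-5-sonnet@20241022"

def resolve_model(name: str):
    """Return a model value ADK's Agent accepts: a plain string for
    Gemini, or a LiteLlm wrapper for everything else."""
    if name.startswith("gemini"):
        return name
    from google.adk.models.lite_llm import LiteLlm  # assumed import path
    return LiteLlm(model=name)

# agent = Agent(
#     name="support_agent",
#     model=resolve_model(LLAMA),   # <- the only line that changes
#     tools=[get_order_status],     # tools, sessions, callbacks unchanged
# )
```

Because only the `model` argument changes, the same tool and session code runs against any of the three models.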
Benchmarking all three models on the same queries gives the following comparison output:
=== MODEL: gemini-2.0-flash ===
Q: Where is my order ORD-88421?
A: Your order ORD-88421 is out for delivery and expected today by 7pm!
Latency: 842 ms
Q: I cannot find my order ORD-99999 in the system.
A: I checked and ORD-99999 was not found. Could you double-check
the order ID from your confirmation email?
Latency: 761 ms
=== MODEL: meta/llama3-70b-instruct-maas ===
Q: Where is my order ORD-88421?
A: ORD-88421 is currently out for delivery with an ETA of today by 7pm.
Latency: 1240 ms
Q: I cannot find my order ORD-99999 in the system.
A: I was unable to locate order ORD-99999. Please verify the order ID.
Latency: 1180 ms
=== MODEL: claude-3-5-sonnet@20241022 ===
Q: Where is my order ORD-88421?
A: Great news! Order ORD-88421 is out for delivery and should arrive
today by 7pm. Is there anything else I can help you with?
Latency: 1050 ms
# All three models called the get_order_status tool correctly
# Gemini fastest for this use case; Claude most conversational tone
# Model swap = one line change in Agent definition
# Same tools, sessions, callbacks work with all three models
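Output like the transcript above can be produced by a simple timing harness. A sketch with a stubbed model call (in the real framework, `ask` would invoke each ADK agent and return its final response):

```python
import time
from typing import Callable

def benchmark(models: dict[str, Callable[[str], str]],
              queries: list[str]) -> list[dict]:
    """Run every query through every model callable, recording the
    answer and wall-clock latency in milliseconds."""
    results = []
    for name, ask in models.items():
        print(f"=== MODEL: {name} ===")
        for q in queries:
            start = time.perf_counter()
            answer = ask(q)
            latency_ms = (time.perf_counter() - start) * 1000
            print(f"Q: {q}\nA: {answer}\nLatency: {latency_ms:.0f} ms")
            results.append({"model": name, "query": q,
                            "answer": answer, "latency_ms": latency_ms})
    return results

# Stub in place of a real ADK agent call, so the harness shape is clear.
def fake_model(query: str) -> str:
    return f"(stub answer for: {query})"

results = benchmark({"gemini-2.0-flash": fake_model},
                    ["Where is my order ORD-88421?"])
```

The harness treats each model as an opaque callable, which is exactly what the one-line model swap enables: the same loop runs Gemini, Llama, or Claude without modification.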
Model routing selects the right Model Garden model per use case:
Customer service model: gemini-2.0-flash
Analysis model: claude-3-5-sonnet@20241022
Open source model: meta/llama3-70b-instruct-maas
# All models accessed through the same Vertex AI endpoint
# Billing consolidated in one GCP project
# IAM controls which service accounts can use which models
# No separate API keys needed for Llama or Claude on Vertex AI
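The routing above can be captured in a small lookup table; a sketch, where the task categories and the default are illustrative choices rather than anything prescribed by ADK:

```python
# Map each task category to a Model Garden model string.
# Assignments mirror the routing above; adjust per project.
MODEL_ROUTES = {
    "customer_service": "gemini-2.0-flash",
    "analysis": "claude-3-5-sonnet@20241022",
    "open_source": "meta/llama3-70b-instruct-maas",
}

def pick_model(task: str) -> str:
    """Return the model string for a task category, defaulting to
    Gemini 2.0 Flash for unknown categories."""
    return MODEL_ROUTES.get(task, "gemini-2.0-flash")
```

Keeping the routing in data rather than code means adding a new use case, or reassigning one to a different model, never touches the agent, tool, or session definitions.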
Model Garden selection criteria: use Gemini 2.0 Flash as your default for customer-facing agents - it is the fastest, cheapest, and best integrated with ADK features such as built-in search and code execution. Use Claude Sonnet for complex reasoning tasks such as contract analysis, financial modelling, or multi-step planning, where response quality matters more than latency. Use Llama 3 when you have regulatory or contractual requirements to use open-source models, or when you need to fine-tune the model on proprietary data.

Enable Model Garden access in the GCP Console under Vertex AI before using non-Gemini models - some models require accepting the publisher's terms of service before the first call.