ADK Agent Monitoring with Custom Metrics

Author: Venkata Sudhakar

Production ADK agents need observability beyond basic logging. Custom Cloud Monitoring metrics let you track what matters to the business: how many tool calls succeed, how often the agent escalates to a human, and what the average response latency is for each agent type. These metrics feed into dashboards and alerting policies.

ShopMax India runs five specialised agents in production: a product recommendation agent, a returns agent, a pricing agent, a logistics agent, and an escalation agent. Custom metrics help the operations team see which agents are under load, which are failing silently, and where to focus optimisation effort.
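To make the three business figures concrete, here is a minimal sketch of how they could be derived from per-request records before any of them is sent to Cloud Monitoring. The RequestRecord class, its field names, and the sample values are illustrative assumptions, not part of ADK or of the ShopMax India system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """One handled request. All field names are illustrative assumptions."""
    agent_name: str
    latency_seconds: float
    tool_calls: int
    tool_failures: int
    escalated: bool


def summarise(records):
    """Group records by agent and derive the three business-level figures."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec.agent_name].append(rec)

    summary = {}
    for agent, recs in grouped.items():
        calls = sum(r.tool_calls for r in recs)
        failures = sum(r.tool_failures for r in recs)
        summary[agent] = {
            "requests": len(recs),
            "avg_latency_seconds": sum(r.latency_seconds for r in recs) / len(recs),
            # With no tool calls there is nothing to fail, so report 1.0.
            "tool_success_rate": (calls - failures) / calls if calls else 1.0,
            # Booleans sum as 0/1, giving the share of escalated requests.
            "escalation_rate": sum(r.escalated for r in recs) / len(recs),
        }
    return summary


# Made-up sample data for two of the five agents.
records = [
    RequestRecord("returns_agent", 2.0, 2, 0, False),
    RequestRecord("returns_agent", 4.0, 2, 1, True),
    RequestRecord("pricing_agent", 1.0, 0, 0, False),
]
summary = summarise(records)
print(summary["returns_agent"])
```

In production these aggregates would normally be computed by Cloud Monitoring itself from the raw points, but a local version like this is handy for checking that the numbers on a dashboard mean what you expect.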

The example below shows how to wrap an ADK agent runner with Cloud Monitoring metric instrumentation.

import asyncio
import time
from typing import Optional

import vertexai
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.cloud import monitoring_v3
from google.genai import types

PROJECT_ID = "shopmax-india"
vertexai.init(project=PROJECT_ID, location="asia-south1")

client = monitoring_v3.MetricServiceClient()
project_name = f"projects/{PROJECT_ID}"


def write_metric(metric_type: str, value: float, labels: dict) -> None:
    """Write one data point to a custom metric under custom.googleapis.com/adk/."""
    series = monitoring_v3.TimeSeries()
    series.metric.type = f"custom.googleapis.com/adk/{metric_type}"
    for k, v in labels.items():
        series.metric.labels[k] = str(v)  # label values must be strings
    series.resource.type = "global"
    series.resource.labels["project_id"] = PROJECT_ID

    # Split the current time into whole seconds and nanoseconds for the interval.
    now = time.time()
    seconds = int(now)
    nanos = int((now - seconds) * 10**9)
    interval = monitoring_v3.TimeInterval(
        {"end_time": {"seconds": seconds, "nanos": nanos}}
    )
    point = monitoring_v3.Point(
        {"interval": interval, "value": {"double_value": float(value)}}
    )
    series.points = [point]

    client.create_time_series(name=project_name, time_series=[series])


def run_with_metrics(agent_name: str, user_id: str, message: str) -> Optional[str]:
    """Run one request through the agent and record latency and tool usage."""
    start = time.time()
    result = None
    tool_calls = 0

    agent = LlmAgent(
        model="gemini-2.0-flash",
        name=agent_name,
        instruction="You are a ShopMax India customer service agent."
    )
    session_service = InMemorySessionService()
    runner = Runner(
        agent=agent,
        app_name="shopmax_agents",
        session_service=session_service
    )
    # Recent ADK releases make create_session a coroutine. On older releases
    # where it is synchronous, call it directly without asyncio.run.
    session = asyncio.run(
        session_service.create_session(app_name="shopmax_agents", user_id=user_id)
    )

    # The runner expects the user message as a google.genai Content object.
    new_message = types.Content(role="user", parts=[types.Part(text=message)])

    for event in runner.run(
        user_id=user_id,
        session_id=session.id,
        new_message=new_message
    ):
        # Count every function (tool) call the model requests in this event.
        tool_calls += len(event.get_function_calls())
        if event.is_final_response() and event.content and event.content.parts:
            result = event.content.parts[0].text

    latency = time.time() - start
    # user_id is deliberately left out of the labels: one time series per
    # customer would give the metric very high label cardinality.
    labels = {"agent_name": agent_name}

    write_metric("response_latency_seconds", latency, labels)
    write_metric("tool_calls_per_request", tool_calls, labels)
    write_metric("requests_total", 1.0, labels)

    print(f"Agent: {agent_name}, Latency: {latency:.2f}s, Tools: {tool_calls}")
    return result


response = run_with_metrics(
    agent_name="returns_agent",
    user_id="CUST-5821",
    message="I want to return my order ORD-9182 placed last week"
)
print(response)

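It gives output similar to the following. The exact wording depends on the model, and a non-zero tool count assumes the agent has tools registered, which the minimal agent above does not.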

Agent: returns_agent, Latency: 1.84s, Tools: 2
I have initiated the return for order ORD-9182. A pickup will be scheduled
within 48 hours. You will receive Rs 4,999 refund within 5-7 business days.
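Calling write_metric on every request issues one API call per metric, and Cloud Monitoring limits how often points can be written to a single time series (check the current quota documentation for the exact interval). One way to stay within that limit is to buffer samples in memory and flush one aggregated point per series at a fixed interval. The sketch below is a minimal illustration of that idea. The MetricBuffer class, the 10-second default, and the injected writer and clock are all assumptions made for this example, not part of ADK or the Cloud Monitoring client.

```python
import time
from collections import defaultdict


class MetricBuffer:
    """Accumulates samples per (metric, labels) and flushes one mean per series.

    Hypothetical helper: `writer` is any callable taking
    (metric_type, value, labels), for example a function like write_metric.
    """

    def __init__(self, writer, min_interval_seconds=10.0, clock=time.monotonic):
        self._writer = writer
        self._min_interval = min_interval_seconds  # assumed value, tune to quota
        self._clock = clock
        self._samples = defaultdict(list)
        self._last_flush = clock()

    def record(self, metric_type, value, labels):
        # Labels are sorted so the same label set always maps to the same key.
        key = (metric_type, tuple(sorted(labels.items())))
        self._samples[key].append(float(value))
        if self._clock() - self._last_flush >= self._min_interval:
            self.flush()

    def flush(self):
        # Emit the mean of the buffered samples for each series, then reset.
        for (metric_type, label_items), values in self._samples.items():
            self._writer(metric_type, sum(values) / len(values), dict(label_items))
        self._samples.clear()
        self._last_flush = self._clock()


# Demonstration with a fake clock and a writer that only collects the points.
written = []
fake_now = [0.0]
buffer = MetricBuffer(
    writer=lambda metric, value, labels: written.append((metric, value, labels)),
    min_interval_seconds=10.0,
    clock=lambda: fake_now[0],
)
labels = {"agent_name": "returns_agent"}
buffer.record("response_latency_seconds", 1.0, labels)  # buffered
fake_now[0] = 5.0
buffer.record("response_latency_seconds", 3.0, labels)  # still buffered
fake_now[0] = 12.0
buffer.record("response_latency_seconds", 2.0, labels)  # interval elapsed, flush
print(written)
```

Averaging suits a latency metric. For a count such as requests_total a sum would be the more natural aggregate, so a real implementation would likely choose the aggregation per metric and also flush on shutdown.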

The example below shows how to create a Cloud Monitoring alerting policy that fires when agent latency exceeds a threshold.

from google.cloud import monitoring_v3


def create_latency_alert(project_id: str, agent_name: str, threshold_seconds: float):
    alert_client = monitoring_v3.AlertPolicyServiceClient()
    project_name = f"projects/{project_id}"

    # Condition: mean latency per 60-second window above the threshold for 60 seconds.
    condition = monitoring_v3.AlertPolicy.Condition(
        display_name=f"{agent_name} high latency",
        condition_threshold=monitoring_v3.AlertPolicy.Condition.MetricThreshold(
            filter=(
                f'metric.type="custom.googleapis.com/adk/response_latency_seconds" '
                f'AND metric.labels.agent_name="{agent_name}"'
            ),
            comparison=monitoring_v3.ComparisonType.COMPARISON_GT,
            threshold_value=threshold_seconds,
            duration={"seconds": 60},
            aggregations=[
                monitoring_v3.Aggregation(
                    alignment_period={"seconds": 60},
                    per_series_aligner=monitoring_v3.Aggregation.Aligner.ALIGN_MEAN
                )
            ]
        )
    )

    # No notification channels are attached here; add channel names to be notified.
    policy = monitoring_v3.AlertPolicy(
        display_name=f"ADK {agent_name} SLA breach",
        conditions=[condition],
        combiner=monitoring_v3.AlertPolicy.ConditionCombinerType.AND,
        notification_channels=[]
    )

    created = alert_client.create_alert_policy(
        name=project_name, alert_policy=policy
    )
    print(f"Alert policy created: {created.name}")


create_latency_alert("shopmax-india", "returns_agent", threshold_seconds=3.0)

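It gives output similar to the following (the policy ID will differ):

Alert policy created: projects/shopmax-india/alertPolicies/1234567890

The policy fires when the mean latency of returns_agent, aligned over 60-second periods, stays above 3.0 seconds for 60 seconds. Because the aligner is ALIGN_MEAN, the condition evaluates the average, not the median (p50).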
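To see why a single slow request does not necessarily trigger this policy, it can help to model the condition locally. The sketch below is a deliberately simplified approximation of a mean aligner followed by a threshold with a duration. It is not how Cloud Monitoring is implemented, and the function names and sample points are assumptions made for illustration.

```python
from collections import defaultdict


def aligned_means(points, period_seconds=60):
    """Bucket (timestamp, value) points into fixed windows and average each."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        buckets[int(timestamp // period_seconds)].append(value)
    return [(idx, sum(vals) / len(vals)) for idx, vals in sorted(buckets.items())]


def would_fire(points, threshold, period_seconds=60, duration_seconds=60):
    """Return True once the windowed mean stays above threshold long enough.

    Simplified model: the condition must hold for enough consecutive
    windows to cover duration_seconds, and a window with no data
    breaks the streak.
    """
    needed = max(1, duration_seconds // period_seconds)
    streak = 0
    previous_idx = None
    for idx, mean in aligned_means(points, period_seconds):
        if mean <= threshold:
            streak = 0
        elif streak and previous_idx == idx - 1:
            streak += 1
        else:
            streak = 1
        previous_idx = idx
        if streak >= needed:
            return True
    return False


# Latency samples as (seconds since start, latency in seconds); made-up data.
points = [(0, 2.0), (30, 5.0), (70, 2.5), (100, 2.5)]
one_spike = [(0, 2.0), (30, 3.5)]

print(aligned_means(points))
print(would_fire(points, threshold=3.0))
print(would_fire(points, threshold=3.0, duration_seconds=120))
print(would_fire(one_spike, threshold=3.0))
```

In this model the first window averages 3.5 seconds and breaches the 3.0 second threshold, while the second window recovers, so a 120-second duration would not fire. The one_spike series contains a 3.5 second request, yet its window mean stays below the threshold and no alert results.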

Custom metrics give ShopMax India visibility into agent performance at the business level. When the returns agent starts taking longer than 3 seconds on average, the operations team is alerted before customers notice. The tool_calls_per_request metric also helps identify when agents are over-calling external services, which directly impacts API costs and response quality.
