ADK Health Checks and Readiness Probes

Author: Venkata Sudhakar

Cloud Run health probes determine whether an ADK agent instance is ready to serve traffic and whether it remains alive during operation. Without explicitly configured probes, Cloud Run falls back to a default TCP startup probe that only checks whether the container port is listening, so it can send traffic to instances whose agent is still loading (causing errors) and has no way to detect instances that have entered a broken state (causing silent failures). ShopMax India configures startup, liveness, and readiness probes on all production agents to support zero-downtime deployments and automatic recovery from hung instances.

Cloud Run supports three probe types: startup probes (hold off the other probes until a slow-starting agent has finished booting), liveness probes (restart the instance after repeated failures), and readiness probes (stop sending new requests to the instance without killing it). Startup probes can use an HTTP endpoint, a TCP port check, or a gRPC health check; liveness probes support HTTP and gRPC but not TCP. For ADK agents, HTTP endpoints under /healthz/ are the simplest and most reliable approach.
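To make the liveness behaviour concrete, the following minimal sketch simulates how a failure threshold turns individual check results into a restart decision. It assumes the usual probe semantics in which only consecutive failures count and a single success resets the counter. The function name should_restart is hypothetical, and this is an illustration rather than Cloud Run's actual implementation.

```python
from typing import Iterable


def should_restart(results: Iterable[bool], failure_threshold: int = 3) -> bool:
    """Return True once `failure_threshold` consecutive checks have failed.

    Illustrative model only (an assumption, not Cloud Run's own code).
    """
    consecutive_failures = 0
    for ok in results:
        if ok:
            # Any successful check resets the failure counter.
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= failure_threshold:
                return True
    return False


# Two missed checks followed by a recovery: no restart.
print(should_restart([True, False, False, True, True]))   # False
# Three misses in a row: the instance would be restarted.
print(should_restart([True, False, False, False]))        # True
```

With a threshold of 3, an isolated slow response is absorbed, while a genuinely hung instance is still replaced after three consecutive misses.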

The example below shows an ADK agent wrapped in a FastAPI server with liveness and readiness endpoints; the output from the Cloud Run deployment follows.

import time

from fastapi import FastAPI, HTTPException
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

# The Runner needs an application name; this value is an illustrative choice.
APP_NAME = "shopmax_support"
USER_ID = "user"

app = FastAPI()
_startup_time = time.time()
_ready = False

# --- Health endpoints ---

@app.get("/healthz/live")
def liveness():
    """Liveness probe: returns 200 while the process can still answer requests."""
    return {"status": "alive", "uptime_seconds": int(time.time() - _startup_time)}

@app.get("/healthz/ready")
def readiness():
    """Readiness probe: returns 200 only after agent initialisation completes."""
    if not _ready:
        raise HTTPException(status_code=503, detail="Agent not yet initialised")
    return {"status": "ready"}

# --- Agent initialisation ---

agent = LlmAgent(
    name="shopmax_support_agent",
    model="gemini-2.0-flash",
    instruction="You are a ShopMax India support agent. Help customers with orders and products.",
)
session_service = InMemorySessionService()
runner = Runner(agent=agent, app_name=APP_NAME, session_service=session_service)
_ready = True  # signal readiness after all init is complete

# --- Inference endpoint ---

@app.post("/agent")
async def handle(body: dict):
    user_msg = body.get("message", "")
    session_id = body.get("session_id", "default")

    # The runner expects the session to exist, so create it on first use.
    # The session service methods are awaited here, as in ADK 1.x.
    session = await session_service.get_session(
        app_name=APP_NAME, user_id=USER_ID, session_id=session_id
    )
    if session is None:
        await session_service.create_session(
            app_name=APP_NAME, user_id=USER_ID, session_id=session_id
        )

    # Use the async runner so the event loop stays free to answer health probes.
    new_message = types.Content(role="user", parts=[types.Part(text=user_msg)])
    reply = ""
    async for event in runner.run_async(
        user_id=USER_ID, session_id=session_id, new_message=new_message
    ):
        if event.is_final_response() and event.content and event.content.parts:
            reply = event.content.parts[0].text or ""
    return {"reply": reply}

Requests to the health endpoints return the following output:

# GET /healthz/live
{"status": "alive", "uptime_seconds": 142}

# GET /healthz/ready  (before init complete)
HTTP 503: {"detail": "Agent not yet initialised"}

# GET /healthz/ready  (after init complete)
{"status": "ready"}

Applying the probe configuration to the Cloud Run service gives the following output:

Applying new configuration to Cloud Run service [shopmax-support-agent]...
Probes configured:
  Startup  probe: GET /healthz/ready  (max wait 60s)
  Liveness probe: GET /healthz/live   (every 30s, fails after 3 misses -> restart)

Service is healthy and serving traffic.
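The probe settings reported above correspond to the startupProbe and livenessProbe fields of a container entry in a Cloud Run service manifest. As a minimal sketch, the Python below builds that fragment and prints it as JSON, which is also valid YAML. The helper names http_probe and max_wait_seconds are hypothetical, and the split of the 60-second startup budget into 12 attempts 5 seconds apart, the 5-second timeout, and port 8080 are assumptions, since the output only states the total wait.

```python
import json


def http_probe(path: str, period_seconds: int, failure_threshold: int,
               timeout_seconds: int = 5, port: int = 8080) -> dict:
    """Build one HTTP probe entry using the manifest's probe field names."""
    return {
        "httpGet": {"path": path, "port": port},
        "periodSeconds": period_seconds,
        "timeoutSeconds": timeout_seconds,  # kept <= periodSeconds
        "failureThreshold": failure_threshold,
    }


def max_wait_seconds(probe: dict) -> int:
    """Approximate time before the probe is declared failed."""
    initial_delay = probe.get("initialDelaySeconds", 0)
    return initial_delay + probe["periodSeconds"] * probe["failureThreshold"]


# Assumed split: 12 attempts, 5 s apart, for the 60 s startup budget.
probes = {
    "startupProbe": http_probe("/healthz/ready", period_seconds=5, failure_threshold=12),
    "livenessProbe": http_probe("/healthz/live", period_seconds=30, failure_threshold=3),
}

print(json.dumps(probes, indent=2))
print(max_wait_seconds(probes["startupProbe"]))   # 60
print(max_wait_seconds(probes["livenessProbe"]))  # 90
```

The second figure shows the trade-off in the liveness settings: a hung instance can keep running for roughly 90 seconds before it is restarted.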

For ADK agents that connect to Firestore or external APIs during startup, add those connectivity checks to the /healthz/ready endpoint so traffic is only routed to fully initialised instances. Set the liveness probe period to 30 seconds and failure threshold to 3 so that a temporarily slow Gemini API response does not incorrectly trigger a restart. Monitor probe failure counts in Cloud Monitoring to detect recurring startup issues early.
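One way to add dependency checks to the readiness endpoint is to run a set of small check functions and report 503 if any of them fails. The sketch below uses only the standard library; evaluate_readiness, firestore_ok, and catalogue_api_ok are hypothetical names, and the two check functions are stand-ins for real connectivity tests against Firestore or an external API.

```python
from typing import Callable, Dict, Tuple


def evaluate_readiness(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run each dependency check and return an HTTP status code and body.

    A check that raises is treated as a failure, so one broken dependency
    cannot crash the probe handler itself.
    """
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    ready = all(results.values())
    body = {"status": "ready" if ready else "not_ready", "checks": results}
    return (200 if ready else 503), body


# Hypothetical stand-ins for real connectivity checks.
def firestore_ok() -> bool:
    return True


def catalogue_api_ok() -> bool:
    raise ConnectionError("catalogue API unreachable")


status, body = evaluate_readiness(
    {"firestore": firestore_ok, "catalogue_api": catalogue_api_ok}
)
print(status, body)
```

Inside the /healthz/ready handler, the body can be returned directly when the status is 200 and raised as an HTTPException with status 503 otherwise. Each check should finish well within the probe timeout so that a slow dependency is reported as a failure instead of stalling the probe.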
