
DevOps Pipeline Agent

Author: Venkata Sudhakar

DevOps teams spend significant time diagnosing pipeline failures, parsing cryptic build logs, and deciding whether to retry, fix, or escalate. A pipeline agent can watch for failures, analyse logs using Gemini, identify the likely root cause, and take automated remediation steps, which can shorten mean time to recovery (MTTR).

In this tutorial, we build a ShopMax India CI/CD pipeline agent that analyses a failed GitHub Actions log, classifies the failure type, suggests the fix, and decides whether to auto-retry or page the on-call engineer.

The example below shows the agent processing a failed build log and producing a structured diagnosis with remediation steps.

import os
from google.adk.agents import LlmAgent
from google.adk.tools import FunctionTool
from google.adk.sessions import InMemorySessionService
from google.adk.runners import Runner
from google.genai import types

# Simulated CI failure log
FAILED_LOG = """
Run pytest tests/ --cov=app --cov-report=xml
FAILED tests/test_orders.py::test_create_order_razorpay - ConnectionError:
HTTPSConnectionPool(host='api.razorpay.com', port=443): Max retries exceeded
with url: /v1/payment_links (Caused by NewConnectionError(
<urllib3.connection.HTTPSConnection>: Failed to establish a new connection:
[Errno -2] Name or service not known))
FAILED tests/test_inventory.py::test_stock_deduction - AssertionError:
assert 14 == 13  # Stock not deducted after order
2 failed, 47 passed in 34.21s
"""

def classify_failure(log: str) -> dict:
    """Classify a CI build failure into categories and recommend action."""
    categories = {
        "network": ["ConnectionError", "HTTPSConnectionPool", "Name or service not known", "timeout"],
        "assertion": ["AssertionError", "assert", "Expected", "!="],
        "import": ["ImportError", "ModuleNotFoundError", "No module named"],
        "syntax": ["SyntaxError", "IndentationError"]
    }
    found = []
    for cat, keywords in categories.items():
        if any(k in log for k in keywords):
            found.append(cat)
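    # Retry only when "network" is the sole category found. Import, syntax,
    # or assertion failures will not be fixed by re-running the job.
    auto_retry = found == ["network"]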
    return {
        "failure_types": found,
        "auto_retry_safe": auto_retry,
        "action": "Auto-retry" if auto_retry else "Requires developer fix"
    }

def send_alert(channel: str, message: str) -> dict:
    """Send an alert to the engineering team channel."""
    print(f"[ALERT -> {channel}]: {message}")
    return {"sent": True, "channel": channel}

pipeline_agent = LlmAgent(
    name="devops_pipeline_agent",
    model="gemini-2.0-flash",
    instruction="""You are the ShopMax India DevOps pipeline agent.
When given a CI failure log:
1. Use classify_failure to categorise the failures.
2. Explain each failure in plain English with the likely root cause.
3. Recommend the exact fix for each failure.
4. If auto_retry_safe, recommend retry. Otherwise use send_alert to page the team.
Be concise and actionable.""",
    tools=[FunctionTool(classify_failure), FunctionTool(send_alert)]
)
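Because classify_failure is a plain Python function, its routing rule can be checked locally before any call to Gemini. The snippet below is a minimal sketch: it restates the keyword table from the listing in compact form so that it runs standalone, and it treats a run as retry-safe only when network is the sole category found, in line with the rule that only network-only failures are retried. The three sample logs are hypothetical and exist only for illustration.

```python
# Keyword table restated from the tutorial listing so this snippet runs standalone.
CATEGORIES = {
    "network": ["ConnectionError", "HTTPSConnectionPool", "Name or service not known", "timeout"],
    "assertion": ["AssertionError", "assert", "Expected", "!="],
    "import": ["ImportError", "ModuleNotFoundError", "No module named"],
    "syntax": ["SyntaxError", "IndentationError"],
}


def classify_failure(log: str) -> dict:
    """Return the matched categories and whether a blind retry is reasonable."""
    # Case-sensitive substring match, one hit per category is enough.
    found = [cat for cat, keywords in CATEGORIES.items()
             if any(k in log for k in keywords)]
    # Retry-safe only when network is the sole category (stricter-rule assumption).
    auto_retry = found == ["network"]
    return {
        "failure_types": found,
        "auto_retry_safe": auto_retry,
        "action": "Auto-retry" if auto_retry else "Requires developer fix",
    }


# Hypothetical sample logs, for illustration only.
MIXED_LOG = (
    "FAILED tests/test_orders.py::test_create_order_razorpay - ConnectionError\n"
    "FAILED tests/test_inventory.py::test_stock_deduction - AssertionError\n"
)
NETWORK_ONLY_LOG = (
    "FAILED tests/test_orders.py::test_create_order_razorpay - ConnectionError\n"
)
NETWORK_AND_IMPORT_LOG = (
    "FAILED tests/test_orders.py::test_create_order_razorpay - ConnectionError\n"
    "ModuleNotFoundError: No module named 'razorpay'\n"
)

mixed = classify_failure(MIXED_LOG)
network_only = classify_failure(NETWORK_ONLY_LOG)
network_and_import = classify_failure(NETWORK_AND_IMPORT_LOG)

print(mixed["failure_types"], mixed["action"])
print(network_only["failure_types"], network_only["action"])
print(network_and_import["failure_types"], network_and_import["action"])
```

Note that the matching is case-sensitive, so a log line containing only "TimeoutError" would not match the lowercase "timeout" keyword. Whether to normalise case is a design choice that depends on how noisy your logs are.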

Now run the agent with the failed log. It gives the following output:

[ALERT -> #engineering-oncall]: ShopMax CI failure requires fix.

CI Pipeline Diagnosis - ShopMax India

Failure 1: NETWORK + ASSERTION failures detected
Action: Requires developer fix (assertion failure present - retry will not help)

--- Failure Details ---

1. test_create_order_razorpay - ConnectionError (Network)
   Root cause: CI runner could not reach api.razorpay.com.
   Fix: This test requires network access to Razorpay sandbox.
   Either mock the Razorpay client in tests or allowlist api.razorpay.com
   in the CI network policy. Recommended: use pytest-mock to stub the client.

2. test_stock_deduction - AssertionError (Logic Bug)
   Root cause: Stock count is 14 instead of expected 13 after order creation.
   Fix: The deduct_stock() function is not being called after order commit.
   Check the order creation transaction - ensure stock deduction runs
   within the same DB transaction as the order insert.

Alert sent to #engineering-oncall.
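The per-test breakdown in this output relies on the model reading the raw log. A deterministic pre-processing step could extract the failing tests first and hand the agent a cleaner input. The sketch below assumes pytest summary lines of the form "FAILED <node id> - <ErrorType>", as in the sample log. The helper name extract_failed_tests and the returned keys are hypothetical, not part of the tutorial's agent.

```python
import re

# Assumes pytest summary lines shaped like:
#   FAILED tests/test_x.py::test_name - SomeError: message
FAILED_LINE = re.compile(
    r"^FAILED\s+(?P<node>\S+)\s+-\s+(?P<error>\w+)", re.MULTILINE
)


def extract_failed_tests(log: str) -> list:
    """Return one dict per FAILED line: file, test name, and error type."""
    results = []
    for match in FAILED_LINE.finditer(log):
        # A pytest node id is "<file path>::<test name>".
        file_path, _, test_name = match.group("node").partition("::")
        results.append({
            "file": file_path,
            "test": test_name,
            "error": match.group("error"),
        })
    return results


# Trimmed copy of the tutorial's sample log.
SAMPLE_LOG = """\
Run pytest tests/ --cov=app --cov-report=xml
FAILED tests/test_orders.py::test_create_order_razorpay - ConnectionError:
FAILED tests/test_inventory.py::test_stock_deduction - AssertionError:
2 failed, 47 passed in 34.21s
"""

failed_tests = extract_failed_tests(SAMPLE_LOG)
for item in failed_tests:
    print(f"{item['file']} :: {item['test']} -> {item['error']}")
```

Registering a function like this as an additional tool would let the agent classify each failing test separately instead of classifying the log as a whole.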

Connect this agent to your GitHub Actions webhook or GitLab pipeline events to trigger automatic analysis on every failure. For production pipelines, add tools for git blame (to find who last touched the failing file), Jira ticket creation, and Slack notifications. Network-only failures can be auto-retried; logic failures always need a developer.
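As a rough illustration of the webhook hookup, the sketch below decides what to do with an incoming GitHub workflow_run event. It assumes the payload carries action, workflow_run.conclusion, and workflow_run.run_attempt, which is my reading of GitHub's webhook payload and should be verified against the official documentation. The names should_analyse, next_step, and MAX_AUTO_RETRIES are hypothetical, and the cap of two retries is an arbitrary choice for the example.

```python
# Assumed cap on automatic re-runs; tune this for your own pipeline.
MAX_AUTO_RETRIES = 2


def should_analyse(payload: dict) -> bool:
    """True only for workflow runs that have finished with a failure."""
    run = payload.get("workflow_run", {})
    return payload.get("action") == "completed" and run.get("conclusion") == "failure"


def next_step(payload: dict, auto_retry_safe: bool) -> str:
    """Map a webhook event plus the classifier verdict to 'ignore', 'retry', or 'alert'."""
    if not should_analyse(payload):
        return "ignore"
    # run_attempt is assumed to start at 1 and grow with each re-run.
    attempt = payload["workflow_run"].get("run_attempt", 1)
    if auto_retry_safe and attempt <= MAX_AUTO_RETRIES:
        return "retry"
    return "alert"


# Hypothetical, heavily trimmed payloads for illustration.
first_failure = {"action": "completed",
                 "workflow_run": {"conclusion": "failure", "run_attempt": 1}}
third_failure = {"action": "completed",
                 "workflow_run": {"conclusion": "failure", "run_attempt": 3}}
success = {"action": "completed",
           "workflow_run": {"conclusion": "success", "run_attempt": 1}}

print(next_step(first_failure, auto_retry_safe=True))   # retry
print(next_step(first_failure, auto_retry_safe=False))  # alert
print(next_step(third_failure, auto_retry_safe=True))   # alert
print(next_step(success, auto_retry_safe=True))         # ignore
```

Capping retries keeps a persistent outage from looping forever: once the cap is exceeded, even a network-only failure is escalated to the on-call engineer.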
