
Deploying LangGraph Agents to Production

Author: Venkata Sudhakar

Deploying a LangGraph agent to production means wrapping it in a web server so other services and UIs can call it over HTTP. The most common pattern is to expose the agent as a FastAPI endpoint: receive a request, run the graph with app.invoke() or app.astream(), and return the result. For ShopMax India, the customer support agent becomes a REST service that the mobile app and website call whenever a user opens a chat.

For persistence across requests, attach a SqliteSaver or PostgresSaver checkpointer so conversation threads survive server restarts. Pass a unique thread_id per user session in the config to maintain separate conversation histories. Use AsyncSqliteSaver with async FastAPI routes to avoid blocking the event loop under concurrent load.

The following example wraps a ShopMax India LangGraph agent in a FastAPI application with persistent SQLite checkpointing, ready to run with uvicorn.


A sample request and response look like this:

# POST /chat
# Request: {"thread_id": "user-C001", "message": "I have a question about my order"}
# Response:
{
  "thread_id": "user-C001",
  "response": "I can help with your ShopMax India order. Please share your order ID."
}

For high-traffic production deployments at ShopMax India, replace SQLite with PostgresSaver for concurrent writes and use a process manager like gunicorn with multiple uvicorn workers. Add a Redis layer for caching frequent responses. Monitor agent latency and error rates using standard FastAPI middleware, and set recursion_limit in the graph config to guard against runaway loops.
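A sketch of the production-side configuration. The PostgresSaver wiring is commented out to stay self-contained, the connection string and gunicorn invocation are placeholders, and 25 is LangGraph's documented default recursion limit:

```python
# Sketch: production hardening for the ShopMax India deployment.
# Hypothetical launch command, 4 uvicorn workers under gunicorn:
#   gunicorn main:app -k uvicorn.workers.UvicornWorker -w 4
#
# Postgres checkpointer (requires langgraph-checkpoint-postgres);
# the connection string below is a placeholder:
# from langgraph.checkpoint.postgres import PostgresSaver
# with PostgresSaver.from_conn_string(
#     "postgresql://shopmax:secret@db:5432/checkpoints") as checkpointer:
#     checkpointer.setup()  # create checkpoint tables on first run
#     app = builder.compile(checkpointer=checkpointer)

def production_config(thread_id: str, recursion_limit: int = 25) -> dict:
    # recursion_limit caps graph super-steps per invocation, guarding
    # against runaway agent loops; LangGraph's default is 25.
    return {
        "recursion_limit": recursion_limit,
        "configurable": {"thread_id": thread_id},
    }
```

Pass the resulting dict as the config argument to invoke() or ainvoke(); a hit on the limit raises a GraphRecursionError that the endpoint can catch and report.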
