Deploying CrewAI Agents to Production
Author: Venkata Sudhakar
Deploying a CrewAI crew to production means making it accessible as a service that other applications can trigger on demand. The standard pattern is to wrap the crew in a FastAPI endpoint: receive a request with the task inputs, run crew.kickoff(inputs=...), and return the result. For ShopMax India, this turns the multi-agent content pipeline into an API that the marketing dashboard can call whenever new product descriptions or reports are needed.
For long-running crews, use kickoff_async() with an async FastAPI route to avoid blocking the server while the crew processes. Return a job ID immediately, then let the caller poll a status endpoint. Use CrewAI Flow for complex orchestration with state management across multiple crew runs. Containerize the service with Docker and deploy to any cloud platform that supports Python containers.
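The job-ID-plus-polling pattern described above is framework-agnostic. Here is a minimal, runnable sketch using only the standard library; the in-memory dict stands in for a real job store (Redis or a database in production), and fake_crew_run is a hypothetical stand-in for crew.kickoff_async(...):

```python
# Sketch of the submit-then-poll pattern. The dict-based job store and the
# fake_crew_run stand-in are illustrative assumptions, not CrewAI APIs.
import asyncio
import uuid
from typing import Any, Awaitable

jobs: dict[str, dict[str, Any]] = {}  # job_id -> {"status", "result"}


def submit(work: Awaitable[Any]) -> str:
    """Register a job, start it in the background, return its ID immediately."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"status": "queued", "result": None}

    async def runner() -> None:
        jobs[job_id]["status"] = "running"
        try:
            jobs[job_id]["result"] = await work
            jobs[job_id]["status"] = "done"
        except Exception as exc:  # surface failures to the poller
            jobs[job_id]["result"] = str(exc)
            jobs[job_id]["status"] = "failed"

    asyncio.get_running_loop().create_task(runner())
    return job_id


async def demo() -> dict[str, Any]:
    async def fake_crew_run() -> str:  # stand-in for crew.kickoff_async(...)
        await asyncio.sleep(0.01)
        return "generated description"

    job_id = submit(fake_crew_run())
    # A real caller would poll a /status endpoint; here we poll the store.
    while jobs[job_id]["status"] not in ("done", "failed"):
        await asyncio.sleep(0.005)
    return jobs[job_id]
```

The caller gets the job ID back instantly; the crew run completes in the background and the poller sees the status flip from "running" to "done".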
The following example wraps a ShopMax India product description crew in a FastAPI service with both synchronous and asynchronous endpoints.
A sample request and response look like this:
# POST /generate
# Request: {"product_name": "Samsung 65-inch QLED TV", "city": "Delhi"}
# Response:
{
  "product": "Samsung 65-inch QLED TV",
  "city": "Delhi",
  "description": "The Samsung 65-inch QLED TV brings cinema-quality colour to your Delhi home with Quantum Dot technology and a 120Hz refresh rate. Its built-in SmartThings hub connects seamlessly with other devices, making it ideal for modern Indian households. Available at ShopMax India with no-cost EMI and same-day delivery in Delhi."
}
For production at ShopMax India, add request authentication with API keys, rate limiting per client, and structured logging so you can trace each crew run. Store crew outputs in a database keyed by request ID so you can replay or audit any generation. Package the FastAPI app in a Docker container and deploy behind a load balancer to handle parallel requests from multiple teams running the content pipeline simultaneously.
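The containerization step might look like the following Dockerfile; the file paths, module name (main:app), and port are assumptions to adapt to your project layout:

```dockerfile
# Illustrative Dockerfile for the FastAPI + CrewAI service.
FROM python:3.12-slim

WORKDIR /app

# Install dependencies first so this layer is cached across code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Behind a load balancer, run several replicas of this container; because each replica is stateless apart from the job store, moving job state to Redis or a database is what makes horizontal scaling safe.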