OpenAI Responses API - Stateful Conversations with Built-in Tools
Author: Venkata Sudhakar
The OpenAI Responses API is the modern successor to the Chat Completions API, offering stateful multi-turn conversations with built-in tools like file search, web search, and code interpreter, without requiring you to manage conversation history manually. ShopMax India uses the Responses API to build a product advisor that can search internal documents, run price calculations, and maintain context across a customer session through a single unified API.
The key difference from Chat Completions is that the Responses API stores conversation state server-side: you pass a previous_response_id parameter on each follow-up call, so you do not need to resend the full message history on every turn. Built-in tools are enabled by passing a tools array with objects like {"type": "file_search", "vector_store_ids": [...]} or {"type": "web_search_preview"}. The response object contains a sequence of output items covering text, tool calls, and tool results.
The example below shows ShopMax India building a two-turn stateful conversation, where a customer first asks about a product and then asks a follow-up without resending context.
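A minimal sketch of the two-turn flow, assuming the official openai Python SDK with an OPENAI_API_KEY set in the environment; the vector store ID vs_shopmax_catalog is a hypothetical placeholder for ShopMax's own store:

```python
from openai import OpenAI

client = OpenAI()

# Turn 1: the customer's opening question. file_search lets the model
# consult ShopMax's internal catalogue (hypothetical vector store ID),
# while web_search_preview lets it check current market prices.
first = client.responses.create(
    model="gpt-4o",
    tools=[
        {"type": "file_search", "vector_store_ids": ["vs_shopmax_catalog"]},
        {"type": "web_search_preview"},
    ],
    input="I need a laptop for college under Rs 60,000 in Bangalore.",
)
print("Turn 1:", first.output_text)
print("Response ID:", first.id)

# Turn 2: chain to the stored conversation via previous_response_id
# instead of resending the full message history.
second = client.responses.create(
    model="gpt-4o",
    previous_response_id=first.id,
    input="Which of those two has better battery life?",
)
print("Turn 2:", second.output_text)
print("Total tokens used:", second.usage.total_tokens)
```

Note that turn 2 sends only the new question; the server resolves "those two" from the stored state of turn 1.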
It produces output along these lines:
Turn 1: For college use under Rs 60,000 in Bangalore, I recommend the ASUS VivoBook 15 (AMD Ryzen 5, 8GB RAM, 512GB SSD) at around Rs 52,000, or the Lenovo IdeaPad Slim 3 (Intel Core i5, 16GB RAM) at Rs 58,000. Both offer good performance for coding, documents, and light multitasking.
Response ID: resp_abc123xyz
Turn 2: Of the two recommendations, the ASUS VivoBook 15 has the better battery life at approximately 8-9 hours on a single charge, making it ideal for a full day of college classes without needing to carry your charger.
Total tokens used: 412
- Use previous_response_id to chain turns instead of resending history; this reduces token costs significantly for long conversations.
- Set store=false on the Responses API call if you do not want the conversation persisted server-side for privacy reasons.
- The Responses API supports the same model list as Chat Completions, so you can swap in gpt-4o-mini for cost-sensitive workloads.
- For production, handle response.status values (completed, in_progress, failed) and implement retry logic on failed responses.
- Combine built-in tools in a single call: pass both file_search and web_search_preview together so the model can choose the right source for each query.
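The status-handling tip above can be sketched as a small retry wrapper. This helper is a hypothetical illustration, not part of the SDK; it takes any zero-argument callable that returns a response object with a status attribute, so it can wrap client.responses.create via a lambda:

```python
import time

def create_with_retry(create_fn, max_attempts=3, backoff_s=1.0):
    """Call create_fn() and retry with linear backoff while the
    returned response reports status == "failed". Hypothetical helper."""
    for attempt in range(1, max_attempts + 1):
        response = create_fn()
        if getattr(response, "status", None) != "failed":
            return response  # completed or in_progress
        if attempt < max_attempts:
            time.sleep(backoff_s * attempt)  # back off before retrying
    raise RuntimeError(f"Response still failed after {max_attempts} attempts")
```

Usage would look like create_with_retry(lambda: client.responses.create(model="gpt-4o", input=question)). Separating retry policy from the API call keeps the backoff logic testable without network access.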