OpenAI Responses API - Stateful Conversations with Built-in Tools
Author: Venkata Sudhakar
The OpenAI Responses API is the modern successor to the Chat Completions API, offering stateful multi-turn conversations with built-in tools like file search, web search, and code interpreter, without requiring you to manage conversation history manually. ShopMax India uses the Responses API to build a product advisor that can search internal documents, run price calculations, and maintain context across a customer session through a single unified API.
The key difference from Chat Completions is that the Responses API stores conversation state server-side: you pass a previous_response_id parameter on each follow-up call, so you do not need to resend the full message history on every turn. Built-in tools are enabled by passing a tools array with objects like {"type": "file_search", "vector_store_ids": [...]} or {"type": "web_search_preview"}. The response object contains a sequence of output items covering text, tool calls, and tool results.
The example below shows ShopMax India building a two-turn stateful conversation, where a customer first asks about a product and then asks a follow-up without resending context.
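A minimal sketch of the two-turn flow, assuming the official openai Python SDK with an OPENAI_API_KEY set in the environment; the vector store ID vs_shopmax_catalog is a hypothetical placeholder for ShopMax's own store:

```python
from openai import OpenAI

client = OpenAI()

# Turn 1: the customer's opening question. file_search lets the model
# consult ShopMax's internal catalogue (hypothetical vector store ID),
# while web_search_preview lets it check current market prices.
first = client.responses.create(
    model="gpt-4o",
    tools=[
        {"type": "file_search", "vector_store_ids": ["vs_shopmax_catalog"]},
        {"type": "web_search_preview"},
    ],
    input="I need a laptop for college under Rs 60,000 in Bangalore.",
)
print("Turn 1:", first.output_text)
print("Response ID:", first.id)

# Turn 2: chain to the stored conversation via previous_response_id
# instead of resending the full message history.
second = client.responses.create(
    model="gpt-4o",
    previous_response_id=first.id,
    input="Which of those two has better battery life?",
)
print("Turn 2:", second.output_text)
print("Total tokens used:", second.usage.total_tokens)
```

Note that turn 2 sends only the new question; the server resolves "those two" from the stored state of turn 1.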
It produces output along these lines:
Turn 1: For college use under Rs 60,000 in Bangalore, I recommend the ASUS VivoBook 15 (AMD Ryzen 5, 8GB RAM, 512GB SSD) at around Rs 52,000, or the Lenovo IdeaPad Slim 3 (Intel Core i5, 16GB RAM) at Rs 58,000. Both offer good performance for coding, documents, and light multitasking.
Response ID: resp_abc123xyz
Turn 2: Of the two recommendations, the ASUS VivoBook 15 has the better battery life at approximately 8-9 hours on a single charge, making it ideal for a full day of college classes without needing to carry your charger.
Total tokens used: 412
- Use previous_response_id to chain turns instead of resending history; this reduces token costs significantly for long conversations.
- Set store=false on the Responses API call if you do not want the conversation persisted server-side for privacy reasons.
- The Responses API supports the same model list as Chat Completions, so you can swap in gpt-4o-mini for cost-sensitive workloads.
- For production, handle response.status values (completed, in_progress, failed) and implement retry logic on failed responses.
- Combine built-in tools in a single call: pass both file_search and web_search_preview together so the model can choose the right source for each query.
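The status-handling tip above can be sketched as a small retry wrapper. This helper is a hypothetical illustration, not part of the SDK; it takes any zero-argument callable that returns a response object with a status attribute, so it can wrap client.responses.create via a lambda:

```python
import time

def create_with_retry(create_fn, max_attempts=3, backoff_s=1.0):
    """Call create_fn() and retry with linear backoff while the
    returned response reports status == "failed". Hypothetical helper."""
    for attempt in range(1, max_attempts + 1):
        response = create_fn()
        if getattr(response, "status", None) != "failed":
            return response  # completed or in_progress
        if attempt < max_attempts:
            time.sleep(backoff_s * attempt)  # back off before retrying
    raise RuntimeError(f"Response still failed after {max_attempts} attempts")
```

Usage would look like create_with_retry(lambda: client.responses.create(model="gpt-4o", input=question)). Separating retry policy from the API call keeps the backoff logic testable without network access.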