
OpenAI Parallel Tool Calling - Running Multiple Tools Simultaneously

Author: Venkata Sudhakar

OpenAI parallel tool calling allows GPT models to invoke multiple tools simultaneously in a single response when the queries are independent, rather than making them sequentially. This cuts latency significantly for workflows that require data from several sources at once. ShopMax India uses parallel tool calling in its order processing agent to check inventory availability, validate payment status, and calculate shipping cost for an order all at the same time before confirming the purchase.

Parallel tool calling is enabled by default when you pass a tools list to client.chat.completions.create(). When the model decides multiple tools can be called simultaneously, it returns a single assistant message with multiple tool_calls entries, each with a unique id. Your code executes all the tool calls, then sends back all results in a single follow-up message array - one tool role message per tool_call_id. The model then synthesises all results into a final response.

The example below shows ShopMax India processing an order by checking inventory, payment status, and shipping cost in parallel, using three custom tools that GPT-4o calls simultaneously.
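The flow can be sketched as follows. This is a minimal illustration, not ShopMax India's production code: the three tool functions return hard-coded stand-in values, and the prompt, SKUs, and transaction IDs are the sample values from the output shown below.

```python
import json


# Stand-in tool implementations. In production these would query real
# inventory, payment, and logistics services; the returned values here
# are illustrative.
def check_inventory(product_id: str, warehouse: str) -> dict:
    return {"product_id": product_id, "warehouse": warehouse,
            "in_stock": True, "units_available": 23}


def validate_payment(transaction_id: str, amount: float) -> dict:
    return {"transaction_id": transaction_id, "amount": amount,
            "status": "valid"}


def get_shipping_cost(warehouse: str, destination: str) -> dict:
    return {"warehouse": warehouse, "destination": destination,
            "cost": 99, "courier": "Delhivery", "delivery_days": 2}


TOOL_FUNCTIONS = {
    "check_inventory": check_inventory,
    "validate_payment": validate_payment,
    "get_shipping_cost": get_shipping_cost,
}


def tool_spec(name: str, description: str, properties: dict) -> dict:
    # Helper that builds a Chat Completions tool definition.
    return {"type": "function",
            "function": {"name": name, "description": description,
                         "parameters": {"type": "object",
                                        "properties": properties,
                                        "required": list(properties)}}}


TOOLS = [
    tool_spec("check_inventory", "Check stock for a product in a warehouse",
              {"product_id": {"type": "string"},
               "warehouse": {"type": "string"}}),
    tool_spec("validate_payment", "Validate a payment transaction",
              {"transaction_id": {"type": "string"},
               "amount": {"type": "number"}}),
    tool_spec("get_shipping_cost", "Get shipping cost between two cities",
              {"warehouse": {"type": "string"},
               "destination": {"type": "string"}}),
]


def process_order() -> str:
    # Import deferred so the tool stubs above stay usable offline.
    from openai import OpenAI
    client = OpenAI()
    messages = [{"role": "user", "content":
        "Confirm order ORD-MUM-4491: product SKU-OP13 from the Mumbai "
        "warehouse, payment TXN-88821 for Rs 69,999, shipping to Bangalore."}]
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=TOOLS)
    message = response.choices[0].message
    if message.tool_calls:
        print(f"Model requested {len(message.tool_calls)} tool calls in parallel:")
        messages.append(message)  # keep the assistant turn with its tool_calls
        for call in message.tool_calls:
            print(f"  - {call.function.name}({call.function.arguments})")
            args = json.loads(call.function.arguments)
            result = TOOL_FUNCTIONS[call.function.name](**args)
            # One tool-role message per tool_call_id, all in one follow-up turn.
            messages.append({"role": "tool", "tool_call_id": call.id,
                             "content": json.dumps(result)})
        final = client.chat.completions.create(model="gpt-4o", messages=messages)
        return final.choices[0].message.content
    return message.content


if __name__ == "__main__":
    print("Final response:", process_order())
```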


It gives the following output,

Model requested 3 tool calls in parallel:
  - check_inventory({"product_id": "SKU-OP13", "warehouse": "Mumbai",...)
  - validate_payment({"transaction_id": "TXN-88821", "amount": 69999...)
  - get_shipping_cost({"warehouse": "Mumbai", "destination": "Bangalo...)

Final response: Order ORD-MUM-4491 is confirmed. Inventory check passed (23 units available in Mumbai), payment TXN-88821 for Rs 69,999 is valid, and shipping to Bangalore will cost Rs 99 via Delhivery with delivery in 2 business days.

Parallel tool calling reduces wall-clock latency in proportion to the number of independent calls, provided your code actually executes the returned tool calls concurrently (e.g. with asyncio or a thread pool) - three 200ms database queries finish in roughly 200ms total instead of 600ms. Set parallel_tool_calls=False in the API call to force sequential tool calls when order matters (e.g. confirm payment before reserving inventory). Always match each tool result back to its tool_call_id precisely - mismatched IDs cause the model to synthesise an incorrect final answer. For tools with side effects (sending an SMS, charging a card), consider running them sequentially despite the latency cost to preserve rollback options, and log all tool call arguments and results for audit trails in financial workflows.
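One way to toggle between the two modes is to build the request arguments in a small helper. This is a sketch - build_chat_request is a hypothetical helper, not part of the OpenAI SDK - but parallel_tool_calls is a real Chat Completions parameter:

```python
def build_chat_request(messages: list, tools: list, sequential: bool = False) -> dict:
    # Assemble keyword arguments for client.chat.completions.create().
    # With sequential=True, parallel tool calling is disabled so the model
    # emits at most one tool call per turn - e.g. confirm payment before
    # reserving inventory.
    kwargs = {"model": "gpt-4o", "messages": messages, "tools": tools}
    if sequential:
        kwargs["parallel_tool_calls"] = False
    return kwargs


# Usage (assumes client, messages, and tools from the earlier example):
# response = client.chat.completions.create(
#     **build_chat_request(messages, tools, sequential=True))
```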
