
Ollama with OpenAI-Compatible API - Drop-in LLM Replacement

Author: Venkata Sudhakar

ShopMax India runs Python services that use the OpenAI SDK for product description generation and customer query handling. Ollama provides an OpenAI-compatible API endpoint, allowing you to switch from cloud models to local models by changing just the base URL - no code rewrite needed.

When Ollama is running locally, it exposes an endpoint at http://localhost:11434/v1 that mirrors the OpenAI chat completions API. You pass this base URL to the OpenAI Python client with any non-empty string as the API key. All existing OpenAI SDK code then routes to your local Ollama models instead of the cloud.

The example below shows how ShopMax India migrates its product description service from the OpenAI cloud API to a local Ollama model using the OpenAI-compatible endpoint.
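Since the original listing is not reproduced here, the following is a minimal sketch of such a migration. The model name (llama3.2), the prompt wording, and the build_prompt helper are illustrative assumptions, not ShopMax India's actual service code; only the base_url and the non-empty api_key are what Ollama's compatibility layer requires.

```python
# Sketch: point the existing OpenAI SDK code at a local Ollama server.
# The helper and product details below are illustrative assumptions.

def build_prompt(name: str, specs: str, price_inr: int) -> str:
    """Compose the user prompt sent to the model."""
    return (
        f"Write a two-sentence product description for the {name}: "
        f"{specs}. Price: Rs {price_inr}, for Indian professionals."
    )

if __name__ == "__main__":
    from openai import OpenAI  # pip install openai; Ollama must be running

    # Same SDK as the cloud service - only base_url and api_key change.
    client = OpenAI(
        base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible API
        api_key="ollama",  # any non-empty string; Ollama ignores the value
    )

    response = client.chat.completions.create(
        model="llama3.2",  # any model pulled with "ollama pull"
        messages=[
            {"role": "system",
             "content": "You are a product copywriter for ShopMax India."},
            {"role": "user",
             "content": build_prompt(
                 "Samsung Galaxy Tab S9",
                 "11-inch AMOLED display, 256GB storage, 5G connectivity",
                 72000,
             )},
        ],
    )
    print(response.choices[0].message.content)
```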


It gives output similar to the following (generated text varies between runs),

The Samsung Galaxy Tab S9 features an 11-inch AMOLED display with vibrant
colours, 256GB of storage, and 5G connectivity ideal for professionals
across Mumbai and Bangalore. Available at ShopMax India for Rs 72000,
it delivers premium performance for work and entertainment.

This approach also works for streaming, embeddings, and tool (function) calling - Ollama's OpenAI-compatible layer covers the commonly used chat completions, completions, and embeddings endpoints. Verify model availability with "ollama list" before deployment. For ShopMax India production, point OLLAMA_HOST at a dedicated server so all services share a single inference node rather than each running their own instance.
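As a sketch of that shared-host setup, the helper below derives the SDK's base URL from the OLLAMA_HOST environment variable and then streams a response through the same endpoint. Reusing OLLAMA_HOST on the client side, the assumed "host:port" format, and the model name are conventions chosen for this example, not something Ollama prescribes.

```python
import os

# Hypothetical helper: resolve the shared inference host from OLLAMA_HOST
# (assumed to be in "host:port" form) instead of hard-coding localhost.
def ollama_base_url(default_host: str = "localhost:11434") -> str:
    """Build the OpenAI-compatible base URL for the configured Ollama host."""
    host = os.environ.get("OLLAMA_HOST", default_host)
    return f"http://{host}/v1"

if __name__ == "__main__":
    from openai import OpenAI  # requires a reachable Ollama server

    client = OpenAI(base_url=ollama_base_url(), api_key="ollama")

    # Streaming works through the same endpoint: pass stream=True and
    # print each chunk's content delta as it arrives.
    stream = client.chat.completions.create(
        model="llama3.2",  # illustrative; confirm with "ollama list"
        messages=[{"role": "user",
                   "content": "One-line tagline for a premium tablet."}],
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
    print()
```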


 
  


  