Building a Local RAG System with Ollama in Python
Author: Venkata Sudhakar
Retrieval-Augmented Generation (RAG) combines a retrieval step with LLM generation to answer questions grounded in a specific document set. Instead of relying on the model's knowledge alone, RAG first retrieves the most relevant passages from a document store and then passes them as context to the LLM. With Ollama, the entire pipeline can run locally: no cloud API, no data leaving the machine. At ShopMax India, a local RAG system allows staff to query internal policy documents, product manuals, and pricing sheets using natural language.

A minimal RAG pipeline needs three components: a document store (a list of text chunks), an embedding model to vectorize documents and queries, and a generative model to produce the final answer. Ollama provides both the embedding model (nomic-embed-text) and the generative model (llama3.2) locally, making it a self-contained solution. The example below shows how to build a simple local RAG system using Ollama for ShopMax India's internal policy queries.
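The original code listing is not reproduced here, so the following is a minimal sketch of the three-component pipeline described above. It assumes the `ollama` Python package (with `ollama.embeddings` and `ollama.chat`) and a local Ollama server that has pulled nomic-embed-text and llama3.2; the function names (`index_documents`, `retrieve`, `answer`) are illustrative, not from the original.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def embed(text):
    """Embed one text with nomic-embed-text via the local Ollama server."""
    # Import deferred so the pure retrieval math above can run
    # without the ollama package or a running server.
    import ollama
    return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

def index_documents(documents):
    """Embed every document once, keeping (text, vector) pairs in memory."""
    print("Indexing documents...")
    return [(doc, embed(doc)) for doc in documents]

def retrieve(query, index, top_k=1):
    """Return the top_k documents most similar to the query."""
    q_vec = embed(query)
    ranked = sorted(index,
                    key=lambda pair: cosine_similarity(q_vec, pair[1]),
                    reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

def answer(query, index):
    """Generate an answer with llama3.2, grounded in retrieved context."""
    import ollama
    context = "\n".join(retrieve(query, index))
    prompt = ("Answer the question using only the context below.\n\n"
              f"Context:\n{context}\n\nQuestion: {query}")
    response = ollama.chat(model="llama3.2",
                           messages=[{"role": "user", "content": prompt}])
    return response["message"]["content"]
```

With a small document list such as `docs = ["ShopMax India offers a 7-day return policy for all electronics sold in stores.", ...]`, the pipeline is driven by `index = index_documents(docs)` followed by `answer("What is the return policy for electronics?", index)`. Keeping the embedding calls behind small functions makes it easy to later swap the in-memory cosine search for a vector database without touching the generation step.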
Running the script gives the following output:
Indexing documents...
Q: What is the return policy for electronics?
A: ShopMax India offers a 7-day return policy for all electronics
sold in stores.
Q: How much travel allowance do Mumbai employees get?
A: Employees in Mumbai and Bangalore are eligible for a Rs 5,000
monthly travel allowance.
Q: When are performance reviews held?
A: Annual performance reviews are conducted every March for all
full-time employees.
This minimal RAG pattern is effective for small document sets. For larger document collections at ShopMax India, replace the in-memory cosine search with a vector database like ChromaDB or Qdrant running locally. Chunk long documents into segments of roughly 200 to 400 words before embedding, so that each chunk covers a single focused topic; this improves retrieval precision. The combination of nomic-embed-text and llama3.2 via Ollama provides a completely offline, cost-free RAG solution suitable for internal business tools.
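The chunking advice above can be sketched as a simple word-count splitter. This is a minimal illustration, not code from the original; the `chunk_words` name and the 300-word default are our own choices within the suggested 200-to-400-word range.

```python
def chunk_words(text, max_words=300):
    """Split text into consecutive segments of at most max_words words.

    A fixed word budget keeps each chunk short enough that its embedding
    reflects a single topic; a production chunker might also respect
    paragraph boundaries so topics are not cut mid-thought.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]
```

Each chunk is then embedded and indexed individually, so a query matches the specific passage that answers it rather than a whole multi-topic document.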