Production RAG with ChromaDB
Author: Venkata Sudhakar
Retrieval-Augmented Generation (RAG) addresses the problem of asking an LLM questions about your own private documents. The LLM does not know your HR policies, product manuals, or internal procedures. With RAG you split those documents into chunks, embed the chunks as vectors in a database, and at query time retrieve the most relevant chunks to include in the prompt. The LLM then answers from the retrieved chunks rather than from its general training knowledge, producing grounded answers about your own content without fine-tuning any model.

ChromaDB is a lightweight open-source vector database that runs locally or as a server. The RAG pipeline has two phases: indexing (split documents into chunks, embed each chunk, store it in ChromaDB with metadata) and querying (embed the question, retrieve the most similar chunks, pass them to the LLM). ChromaDB can persist to disk between sessions, so you index once and query many times. For most company knowledge bases under 10,000 documents, local ChromaDB is sufficient and costs nothing beyond the embedding API calls.

The example below builds a complete HR policy Q&A chatbot: it indexes five HR policy sections, then answers employee questions with answers grounded in the actual policy text, citing the source document.
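To make the two phases concrete before the chatbot itself, here is a minimal, dependency-free sketch of indexing and querying. It is an illustration only, not the chatbot's code: `ToyIndex`, `embed`, and `build_prompt` are hypothetical names, the word-count "embedding" is a stand-in for a real embedding model, and the in-memory list is a stand-in for a ChromaDB collection. The three policy snippets are paraphrased from the answers shown later in this article.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase word counts instead of a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyIndex:
    """In-memory stand-in for a vector store (hypothetical, not the ChromaDB API)."""

    def __init__(self):
        self.rows = []  # each row: (vector, chunk text, metadata)

    def add(self, text, metadata):
        # Indexing phase: embed the chunk and store it with its metadata.
        self.rows.append((embed(text), text, metadata))

    def query(self, question, n_results=2):
        # Querying phase: embed the question and rank chunks by similarity.
        q = embed(question)
        ranked = sorted(self.rows, key=lambda r: cosine(q, r[0]), reverse=True)
        return [(text, meta) for _, text, meta in ranked[:n_results]]


def build_prompt(question, hits):
    """Put the retrieved chunks, labelled with their source, into the LLM prompt."""
    context = "\n".join(f"[{meta['source']}] {text}" for text, meta in hits)
    return (
        "Answer using only the context below and cite the source.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


index = ToyIndex()
index.add("Casual Leave cannot be carried forward and is non-encashable.",
          {"source": "Leave Policy 2025"})
index.add("Hotel reimbursement limit in metro cities is Rs 5,000 per night.",
          {"source": "Expense Reimbursement Policy"})
index.add("Work from home is not permitted during the 6-month probation period.",
          {"source": "Work From Home Policy 2025"})

question = "Can I carry forward unused casual leave?"
hits = index.query(question, n_results=1)
prompt = build_prompt(question, hits)
print(prompt)
```

In a real deployment the prompt would be sent to the LLM, and the store and embedding calls would be replaced by ChromaDB and an embedding API. The shape of the flow (add with metadata, query by similarity, build a source-labelled prompt) is the part this sketch is meant to show.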
Querying the RAG pipeline with real employee questions:
It gives the following output:
Indexed 14 chunks into ChromaDB - ready to query
Employee: Can I carry forward unused casual leave to next year?
HR Bot: No, Casual Leave cannot be carried forward to the next year. According
to the Leave Policy 2025, Casual Leave is non-carry forward and also
non-encashable, unlike Earned Leave which can be carried forward up
to a maximum of 30 days.
Employee: What is the hotel limit for a business trip to Bangalore?
HR Bot: Bangalore is a metro city, so the hotel reimbursement limit is
Rs 5,000 per night, as stated in the Expense Reimbursement Policy.
Remember to book through the corporate travel portal and submit
your claim within 30 days with original receipts.
Employee: I am in my 3rd month of probation. Can I work from home?
HR Bot: No, work from home is not permitted during the probation period
(first 6 months), as per the Work From Home Policy 2025. You
will be eligible to apply for WFH days after completing 6 months.
Employee: What increment can I expect with a Meets Expectations rating?
HR Bot: According to the Performance Review Policy, a Meets Expectations
(ME) rating entitles you to an increment of 8-12%. The exact
percentage within this band depends on your department budget
and manager recommendation.
# All answers grounded in retrieved policy text - no hallucination
# Source documents cited in each answer for employee trust
# ChromaDB persisted to disk - index once, query forever
Production RAG checklist: keep chunk size between 200 and 400 tokens with 10-15% overlap as a starting point for good retrieval quality. Store meaningful metadata (source document, section, date) with each chunk so answers can cite sources. Before deploying, test retrieval quality by checking that the right chunks are returned for questions with known answers. Refresh the index whenever source documents are updated; a document hash is a simple way to detect changes. For a company knowledge base of 50-100 policy documents, ChromaDB typically handles queries in under 200 ms. For higher-stakes use cases where retrieval precision is critical, consider adding a reranking step (cross-encoder) after retrieval.
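Two of the checklist items, overlapping chunks and hash-based change detection, can be sketched with the standard library alone. This is a rough sketch under stated assumptions: whitespace-separated words stand in for model tokens (a real pipeline would count tokens with the embedding model's tokenizer), and `chunk_tokens`, `doc_hash`, and `needs_reindex` are hypothetical helper names, not part of ChromaDB.

```python
import hashlib


def chunk_tokens(text, chunk_size=300, overlap=40):
    """Split text into overlapping chunks.

    Assumption: whitespace words approximate tokens. 40 of 300 is about
    13% overlap, inside the 10-15% range suggested above.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the document
    return chunks


def doc_hash(text):
    """Stable fingerprint of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def needs_reindex(name, text, seen_hashes):
    """Return True, and record the new hash, when a document is new or changed."""
    digest = doc_hash(text)
    if seen_hashes.get(name) == digest:
        return False
    seen_hashes[name] = digest
    return True


# A synthetic 700-word document, used only to show the chunk boundaries.
policy = " ".join(f"w{i}" for i in range(700))
chunks = chunk_tokens(policy, chunk_size=300, overlap=40)
print(len(chunks), "chunks")

seen = {}
print(needs_reindex("leave_policy.txt", "version 1", seen))  # new document
print(needs_reindex("leave_policy.txt", "version 1", seen))  # unchanged
print(needs_reindex("leave_policy.txt", "version 2", seen))  # edited
```

In practice the `seen_hashes` mapping would be persisted alongside the index (for example as chunk metadata or a small JSON file), so that only new or changed documents are re-chunked and re-embedded on each refresh.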