GraphRAG - Knowledge Graph Enhanced Retrieval
Author: Venkata Sudhakar
ShopMax India's support team needs to answer complex questions that span multiple documents, such as which suppliers provide the products with the most warranty claims in Mumbai. Standard vector RAG retrieves isolated chunks but misses the relationships between entities. GraphRAG builds a knowledge graph of entities and relationships from the documents, which enables multi-hop reasoning across connected information.
GraphRAG works in two phases. First, an LLM extracts entities (products, suppliers, cities, claims) and their relationships from documents and stores them as nodes and edges. Second, at query time, it traverses the graph to gather connected context, then passes the enriched context to the LLM for a final answer. Microsoft's GraphRAG library automates both phases using a local document folder and a settings YAML file.
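To make the two phases concrete, here is a minimal sketch in plain Python. It does not use the Microsoft GraphRAG library. The triples are hand-written stand-ins for what an LLM extractor might produce, and every name in it (`TRIPLES`, `build_graph`, `gather_context`, the supplier and product labels) is hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque

# Phase 1 (stand-in): triples that an LLM extractor might emit.
# These are invented for illustration, not real ShopMax data.
TRIPLES = [
    ("Supplier A", "supplies", "TV Model X"),
    ("Supplier B", "supplies", "Laptop Model Y"),
    ("TV Model X", "has_claim", "Claim 101"),
    ("Laptop Model Y", "has_claim", "Claim 102"),
    ("Claim 101", "filed_in", "Mumbai"),
    ("Claim 102", "filed_in", "Delhi"),
]


def build_graph(triples):
    """Store each triple on both endpoints so traversal can go either way."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        fact = (subject, relation, obj)
        graph[subject].append((obj, fact))
        graph[obj].append((subject, fact))
    return graph


def gather_context(graph, start, max_hops):
    """Phase 2: breadth-first traversal collecting facts within max_hops of start."""
    seen = {start}
    facts = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for neighbor, fact in graph.get(node, []):
            if fact not in facts:
                facts.append(fact)
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return facts


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    # Three hops connect Mumbai -> claim -> product -> supplier.
    context = gather_context(graph, "Mumbai", max_hops=3)
    prompt_context = "\n".join(f"{s} {r} {o}" for s, r, o in context)
    print(prompt_context)  # this text would be passed to the LLM with the question
```

Starting from the "Mumbai" node, the traversal reaches the supplier only by passing through the claim and the product. A chunk-level vector search has no such chain to follow.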
The example below shows how ShopMax India builds a GraphRAG pipeline using the Microsoft GraphRAG library to query product warranty and supplier relationship data.
It gives the following output:
Graph index built successfully
Televisions and laptops account for 68% of warranty claims in Mumbai,
primarily linked to transit damage and display defects. Samsung and LG
televisions show the highest claim rate at 12% of units sold, with most
claims filed within 30 days of purchase. Suppliers Harshad Electronics
and Mehta Distributors are associated with the highest defect rates
in the Mumbai region.
Sources used: 47
GraphRAG has a significant upfront indexing cost because it uses an LLM to extract entities from every document chunk. For ShopMax India, run indexing as an overnight batch job and rebuild only when the documents change significantly. Use global_search for cross-document questions and local_search for questions about a specific entity or document; local search is faster and cheaper. Start with community_level=1 for broad summaries and increase it to 2 or 3 for more detailed graph traversal.
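One way to decide when documents have changed "significantly" is to compare content hashes against a manifest saved after the last indexing run. The sketch below is an assumption about how such a check could look and is not part of the GraphRAG library. The function names, the `.txt` filter, the manifest file, and the 10% threshold are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def hash_documents(doc_dir):
    """Map each .txt file's relative path to a SHA-256 digest of its contents."""
    root = Path(doc_dir)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*.txt"))
    }


def changed_fraction(old, new):
    """Fraction of documents added, removed, or modified between two manifests."""
    all_paths = set(old) | set(new)
    if not all_paths:
        return 0.0
    changed = sum(1 for p in all_paths if old.get(p) != new.get(p))
    return changed / len(all_paths)


def should_reindex(doc_dir, manifest_path, threshold=0.1):
    """Return (rebuild, new_manifest); rebuild is True when enough files changed.

    The threshold of 0.1 is an arbitrary illustrative default.
    """
    manifest = Path(manifest_path)
    old = json.loads(manifest.read_text()) if manifest.exists() else {}
    new = hash_documents(doc_dir)
    return changed_fraction(old, new) >= threshold, new
```

The overnight job would call `should_reindex`, run the indexing step only when the first return value is true, and write the new manifest to disk with `json.dumps` after a successful rebuild. With no saved manifest, every file counts as changed, so the first run always rebuilds.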