When AI pessimists talk about the apocalypse and AI taking over, they often overlook that even the most advanced language models struggle with reasoning and drawing conclusions based on intricate connections. Another problem is the prohibitively high costs of training or fine-tuning (adapting to your data) large language models (LLMs).
The Graph RAG (retrieval-augmented generation) approach addresses both issues and upgrades the original RAG technique we discussed previously. Let’s explore those graphs!
In today’s episode, we will cover:
Original RAG - revisiting the basics
What are the limitations of original RAG?
Here comes the Graph RAG approach
What is Graph RAG especially good at?
Clarification of terms: “Graph RAG” vs “Knowledge Graph RAG”
Bonus: Resources
Revisiting original RAG
Let’s briefly revisit the key concepts behind RAG. This approach allows using LLMs on previously unseen data without the need for fine-tuning. In the RAG setup, the data is stored in vector form within an external database. Using RAG, the LLM retrieves necessary information from it and bases its answer to the user query on retrieved facts.

Source: RAG original paper
RAG conserves resources by avoiding continuous fine-tuning as data updates, while also enabling easy modification of external databases for dynamic data control.
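The retrieve-then-generate loop described above can be sketched in a few lines. This is a minimal illustration, not a production setup: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory `store` list stands in for a vector database, and all function names are assumptions made for this example.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, store, k=1):
    """Return the k stored passages most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(query, passages):
    """Assemble the prompt an LLM would receive (the LLM call itself is omitted)."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only these facts:\n{context}\n\nQuestion: {query}"

documents = [
    "The Leiden algorithm detects communities in graphs.",
    "Paris is the capital of France.",
    "Fine-tuning updates the weights of a language model.",
]
# The "external database": each passage stored next to its vector.
store = [(doc, embed(doc)) for doc in documents]

query = "What is the capital of France?"
top = retrieve(query, store, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

Updating the system's knowledge here only means editing `documents` and re-embedding them; no model weights change.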
Limitations of original RAG
Original RAG approaches use vector similarity as the search technique. It is a powerful technology that has transformed how we access information and is a valuable upgrade to traditional search engines. However, it has limitations, which become apparent once we understand the nature of vector similarity.
You probably remember the concept of vector or word embeddings from our previous explorations into transformers. These embeddings are dense vector representations of words that capture semantic and syntactic relationships. They enable language models to learn the relationships between words by quantifying the semantic similarity when they are represented as points in vector spaces.
Vector similarity is a metric used to measure relationships, computed using methods like Euclidean distance, cosine similarity, and dot product similarity, each with its own pros and cons. However, vector similarity only finds answers based on their resemblance to the user query. This approach falls short in more sophisticated systems where we need to combine various pieces of information or where the answer isn't clearly present in a single document, causing limitations in the original RAG model.
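To make the three metrics concrete, here is a small sketch on hand-picked 2D vectors. The vectors are made up for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math

def dot(a, b):
    """Dot product: grows with both alignment and vector length."""
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    """Straight-line distance between two points; 0 means identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; ignores their lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

query = [1.0, 0.0]
same_direction = [3.0, 0.0]   # same direction as the query, three times longer
orthogonal = [0.0, 1.0]       # unrelated direction

for name, vec in [("same_direction", same_direction), ("orthogonal", orthogonal)]:
    print(
        name,
        "cosine:", cosine_similarity(query, vec),
        "euclidean:", euclidean_distance(query, vec),
        "dot:", dot(query, vec),
    )
```

Cosine similarity treats `same_direction` as a perfect match because it ignores magnitude, while Euclidean distance and the dot product are both sensitive to vector length. All three only compare one query vector against one stored vector at a time, which is why they cannot combine evidence spread across many documents.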
As the authors of the Graph RAG approach write: “RAG fails on global questions directed at an entire text corpus, such as ‘What are the main themes in the dataset?’, since this is inherently a query-focused summarization (QFS) task, rather than an explicit retrieval task.” At the same time, prior QFS methods fail to scale to the quantities of text indexed by typical RAG systems. The Graph RAG approach is designed to handle these more complex queries.
How Graph RAG Works
The Graph RAG approach was presented by Microsoft researchers in April 2024. Unlike original RAG, it organizes data into a graph structure that represents the text data and its interrelations.
Here’s how Graph RAG works:
Source Documents → Text Chunks: The raw text from the external database is first segmented into smaller, manageable chunks.
Text Chunks → Element Instances: Using an LLM and a prompt tailored to the database’s domain, Graph RAG identifies and extracts entities (such as people, places, and organizations) and the relationships between them from each text chunk.
Element Instances → Element Summaries: Another set of LLM prompts is used to generate short descriptions of each entity and relationship to summarize the initial, raw text data.
Element Summaries → Graph Communities: The summarized entities and relationships are used to build a knowledge graph where nodes represent entities, and edges represent relationships. Community detection algorithms, like the Leiden algorithm, are then applied to this graph to identify communities of closely related nodes.
Graph Communities → Community Summaries: Each detected community is then independently summarized to produce a comprehensive overview of the themes and information it represents.
Community Summaries → Community Answers → Global Answer: When a user submits a query:
The relevant community summaries are first identified based on their content and relation to the query.
Intermediate answers are generated independently for each relevant community summary using the LLM.
These intermediate answers are then compiled, evaluated for their relevance and helpfulness (sometimes scored by the LLM), and synthesized into a final, cohesive global answer returned to the user.
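The indexing half of the steps above (documents to chunks, chunks to entities, entities to a graph, graph to communities) can be sketched as follows. This is a simplified illustration under loud assumptions: entity extraction is a lookup against a hypothetical `KNOWN_ENTITIES` list rather than an LLM prompt, two entities are linked whenever they share a chunk, and connected components stand in for the Leiden algorithm. None of the function names come from the original implementation.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical entity list; in Graph RAG an LLM prompt extracts entities instead.
KNOWN_ENTITIES = {"Alice", "Bob", "Acme", "Carol", "Dave", "Globex"}

def chunk_text(text, max_words=6):
    """Source Documents -> Text Chunks: split into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def extract_entities(chunk):
    """Text Chunks -> Element Instances (stand-in for the LLM extraction step)."""
    tokens = {w.strip(".,") for w in chunk.split()}
    return sorted(tokens & KNOWN_ENTITIES)

def build_graph(chunks):
    """Nodes are entities; an edge links entities that co-occur in a chunk."""
    graph = defaultdict(set)
    for chunk in chunks:
        entities = extract_entities(chunk)
        for entity in entities:
            graph[entity]  # register the node even if it has no neighbors
        for a, b in combinations(entities, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def detect_communities(graph):
    """Connected components, a much simpler stand-in for Leiden."""
    seen, communities = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(graph[node] - members)
        seen |= members
        communities.append(sorted(members))
    return communities

def summarize_community(members):
    """Graph Communities -> Community Summaries (stand-in for an LLM summary)."""
    return "Community of closely related entities: " + ", ".join(members)

text = "Alice works with Bob at Acme. Carol works with Dave at Globex."
chunks = chunk_text(text)
communities = detect_communities(build_graph(chunks))
summaries = [summarize_community(c) for c in communities]
for summary in summaries:
    print(summary)
```

Connected components only separate groups that share no edges at all; Leiden goes further and splits a densely connected graph into tighter sub-communities, which matters on real corpora where almost everything is linked to something.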
With graph-based indexing and community-focused summarization, Graph RAG is a valuable addition to RAG systems to handle query-focused summarization at scale.
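The query-time stage, where each community summary yields an intermediate answer and the useful ones are merged into a global answer, follows a map-reduce pattern. Below is a minimal sketch of that pattern, assuming keyword overlap as a stand-in for the LLM's helpfulness score and simple concatenation as a stand-in for the final LLM synthesis. The summaries and function names are invented for illustration.

```python
import re

def tokenize(text):
    """Lowercase word set used by the toy relevance score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def partial_answer(query, summary):
    """Map step: one intermediate answer per community summary.

    Stand-in for an LLM call that returns an answer plus a helpfulness
    score; here the score is just the number of shared words.
    """
    score = len(tokenize(query) & tokenize(summary))
    return score, summary

def global_answer(query, community_summaries, top_k=2):
    """Reduce step: drop unhelpful answers, rank the rest, and combine them."""
    scored = [partial_answer(query, s) for s in community_summaries]
    useful = [(score, answer) for score, answer in scored if score > 0]
    useful.sort(key=lambda pair: pair[0], reverse=True)
    return " ".join(answer for _, answer in useful[:top_k])

summaries = [
    "Community A covers battery research and electric vehicles.",
    "Community B covers medieval poetry.",
    "Community C covers charging networks for electric vehicles.",
]
answer = global_answer("What themes relate to electric vehicles?", summaries)
print(answer)
```

Because each intermediate answer is produced independently, the map step can run in parallel across communities, and the `top_k` cut-off plays the role of a context-size limit on what reaches the final synthesis.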
Graph RAG Use Cases
The main benefits of using Graph RAG include:
Enhanced relevance: Graph RAG identifies the most relevant information clusters connected to the user query by structuring data as a knowledge graph.
Efficiency: Graph RAG searches relevant portions of the data based on the graph structure. This reduces the computational workload compared to processing the entire dataset for every query.
Comprehensive responses: The system can synthesize information from multiple documents. This allows the creation of more well-rounded and contextually rich answers that are more informative than responses derived from a single document.
Scalability: It can efficiently handle large datasets by leveraging graph-based structures, making it scalable and effective for extensive information repositories.
Dynamic learning: As more data is added or updated in the graph, Graph RAG can adapt and refine its responses, making it suitable for dynamic and evolving datasets.
Graph RAG vs Knowledge Graph RAG
The terms “Graph RAG” and “Knowledge Graph RAG” are often used interchangeably, as both refer to a retrieval-augmented generation (RAG) approach that leverages knowledge graphs to enhance the accuracy and relevance of AI responses. In practice, most RAG systems that use graph structures for retrieval rely on some form of knowledge graph, even if it is not explicitly labeled as such. Where a distinction is drawn, it lies in the level of sophistication and formality of the graph representation being used.
Bonus: Resources
Original resources
Graph RAG was published in April 2024, and the authors have promised to release it soon as an open-source project on the official website and on GitHub.
Implementations
While the original implementation is not yet available, there are some tutorials on how to use the idea of knowledge graphs with LLMs:
Course: Knowledge Graphs for RAG
Turing Post