GraphRAG auto-tuning provides rapid adaptation to new domains


By: Sr. Software Engineer, Data Scientist II, Senior Data Scientist, Senior Director, Senior Data Scientist, Senior Program Manager, Senior Director, Principal Program Manager, Senior Principal Data Architect


GraphRAG uses large language models (LLMs) to create a comprehensive knowledge graph that details entities and their relationships from any collection of text documents. This graph enables GraphRAG to leverage the semantic structure of the data and generate responses to complex queries that require a broad understanding of the entire text. In previous blog posts, we introduced GraphRAG and demonstrated how it could be applied to news articles. In this blog post, we show that it can also be tuned to any domain to enhance the quality of the results.

The knowledge graph creation process is called indexing. An LLM, guided by a set of domain-specific prompts, reads all the source content and extracts the relevant information, including entities and relationships, which are then used to construct the graph. For example, when analyzing news articles, entities like people, places, and organizations are important. Here, relationship types might include “lives in,” “leads,” and “owns.” 

However, each domain has a different set of entity and relationship types. In the field of chemistry, for instance, entity types include molecules, enzymes, and reactions, while relationship types include “catalyzes” and “reduces.” Although our default news domain prompts in GraphRAG can produce a graph when applied to chemistry, they don’t capture the specific content a chemist would expect. 

Manually creating and tuning a set of domain-specific prompts is time-consuming. We know, as all the prompts used for news articles were generated manually. To streamline this process, we developed an automated tool that generates domain-specific prompts that are tuned and ready to use. The tool follows a human-like approach: we provide an LLM with a sample of the text data (e.g., 1% of 10,000 chemistry papers) and instruct it to produce the prompts it deems most applicable to the content. With these automatically generated and tuned prompts, we can immediately apply GraphRAG to a new domain of our choosing, confident that we’ll get high-quality results.
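The sampling step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's actual implementation: the function name `sample_documents`, the `fraction` and `seed` parameters, and the uniform random selection are all assumptions made for the example.

```python
import random


def sample_documents(documents, fraction=0.01, seed=0):
    """Return a uniform random sample of documents to show the LLM.

    Hypothetical helper: the name, parameters, and sampling strategy
    are illustrative assumptions, not part of GraphRAG.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not documents:
        return []
    # Always keep at least one document, and never more than we have.
    k = min(len(documents), max(1, round(len(documents) * fraction)))
    rng = random.Random(seed)  # seeded so the sample is reproducible
    return rng.sample(documents, k)


# Stand-in corpus matching the example size in the text.
papers = [f"paper-{i}" for i in range(10_000)]
sample = sample_documents(papers)
print(len(sample))  # 100, i.e. 1% of 10,000
```

A fixed seed keeps the sample stable between runs, which makes it easier to compare prompts generated from the same input.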

Indexing prompts in GraphRAG

During the indexing process, GraphRAG uses a set of prompts to instruct the LLM as it reads through the source content, extracting and organizing relevant information to construct the knowledge graph. Three of GraphRAG’s main indexing prompts are:

  1. Entity and relationship extraction: Identifies all the entities present and establishes relationships among them.
  2. Entity and relationship summarization: Consolidates instances of entities and their relationships into a single, concise description. 
  3. Community report generation: Generates a summary report for each community within the constructed knowledge graph. 

These prompts work best when tuned to the domain of the source content. In the rest of this blog post, we focus on domain tuning of the first prompt, “Entity and relationship extraction,” but similar methods apply to the second and third prompts. 

Below, Code Sample 1 shows the default few-shot prompt for entity and relationship extraction. This prompt was originally created for news articles and is the default form found in the GraphRAG GitHub repository. The extraction prompt comprises four sections:

  • Extraction instructions: Provide the LLM with guidance on how to perform extraction. 
  • Few-shot examples: Supply the LLM real examples of the types of entities and relationships worth extracting.
  • Real data: Serves as a placeholder that is replaced by chunks of source content. 
  • Gleanings: Encourage the LLM, over multiple turns, to extract additional information. 
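To make the four sections concrete, here is a minimal sketch of how such a prompt could be assembled and used. It is not GraphRAG's own code: `ExtractionPrompt`, `extract_with_gleanings`, the wording of `GLEANING_PROMPT`, and the `llm` callable (which takes the message history and returns text) are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical follow-up message for the gleaning turns.
GLEANING_PROMPT = "Some entities may have been missed. Add any that remain."


@dataclass
class ExtractionPrompt:
    instructions: str                                   # extraction instructions
    examples: List[str] = field(default_factory=list)   # few-shot examples

    def render(self, chunk: str) -> str:
        """Join the sections and put the real data chunk in the placeholder slot."""
        shots = "\n\n".join(self.examples)
        return f"{self.instructions}\n\nExamples:\n{shots}\n\nReal data:\n{chunk}"


def extract_with_gleanings(
    prompt: ExtractionPrompt,
    chunk: str,
    llm: Callable[[List[str]], str],
    max_gleanings: int = 1,
) -> List[str]:
    """Run one extraction turn, then up to max_gleanings follow-up turns."""
    messages = [prompt.render(chunk)]
    outputs = [llm(messages)]
    for _ in range(max_gleanings):
        # Feed the previous answer back and ask for anything that was missed.
        messages += [outputs[-1], GLEANING_PROMPT]
        extra = llm(messages)
        if not extra.strip():  # the model had nothing more to add
            break
        outputs.append(extra)
    return outputs
```

Concatenating the sections, rather than calling `str.format` on a template, avoids errors when a source chunk happens to contain braces.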

The goal of auto-tuning is to create customized few-shot examples that are appropriate for the given domain. The default prompt, shown in Code Sample 1, provides the LLM with fifteen entity examples and twelve relationship examples, but it is notably restricted to just a few specific entity types: organization, geography, and person. These samples were invented by our team and do not represent real entities.


Customization can be difficult and time-consuming, both in determining the right set of entities and relationships and in carefully constructing all the prompts for a specific domain. We address these challenges with auto-tuning.

Auto-tuning architecture

Auto-tuning takes source content and produces an automatically generated set of domain-specific prompts. Figure 1 shows the architecture of the auto-tuning process for our three main indexing prompts.

Figure 1. Conceptual diagram of the algorithm

We start by sending a sample of the source content to the LLM, which first identifies the domain and then creates an appropriate persona that is used with downstream agents to tune the extraction process. Once the domain and persona are established, several processes occur in parallel to create our custom indexing prompts. This way, the few-shot prompts are generated based on the actual domain data and from the persona’s perspective.
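The flow just described (domain first, then persona, then the indexing prompts in parallel) might look roughly like the sketch below. It is an assumption-laden outline, not the actual auto-tuner: the `auto_tune` function, the task wording in `PROMPT_TASKS`, and the `llm` callable (prompt string in, text out) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical task descriptions for the three main indexing prompts.
PROMPT_TASKS = {
    "entity_extraction": "Write an entity and relationship extraction prompt with few-shot examples.",
    "summarization": "Write an entity and relationship summarization prompt.",
    "community_report": "Write a community report generation prompt.",
}


def auto_tune(sample_text: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Return one generated prompt per indexing task."""
    # Steps 1 and 2 are sequential: the persona depends on the domain.
    domain = llm(f"Identify the domain of the following text.\n\n{sample_text}")
    persona = llm(f"Write an expert persona for analyzing text in this domain: {domain}")

    def build(task: str) -> str:
        # Each prompt is generated from the persona's perspective,
        # grounded in the actual sample of domain data.
        return llm(f"{persona}\n\n{task}\n\nSample text:\n{sample_text}")

    # Step 3: the prompts are independent of each other, so build them in parallel.
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(build, task) for name, task in PROMPT_TASKS.items()}
        return {name: future.result() for name, future in futures.items()}
```

Threads are a reasonable fit here because the work is dominated by waiting on LLM calls rather than by local computation.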

To illustrate how this works in practice for entity and relationship extraction, let’s shift to a new domain, the Behind the Tech podcast. 

Auto-tuning the Behind the Tech podcast 

Kevin Scott, CTO of Microsoft, hosts a podcast series called Behind the Tech where he interviews a wide variety of tech innovators. Given its focus on society and technology, this dataset would benefit from its own set of indexing prompts distinct from general news. While the default prompt works with podcast transcripts, we can achieve much higher precision with customized podcast-tuned prompts.

To demonstrate this, we use Code Sample 2, which contains a sample raw text input chunk from the podcast. 

Code Sample 2: Podcast data sample


The first step in adapting GraphRAG to the target domain is to generate a persona for the LLM to assume when generating examples for each prompt. As it adapts to the domain from the podcast text sample input, the LLM produces the following: 

“You are an expert in social network analysis with a focus on technology and innovation communities. You are skilled at mapping and interpreting complex networks, identifying key influencers, and understanding the dynamics of community interactions. You are adept at helping organizations and researchers identify the relations and structure within specific domains, particularly in rapidly evolving fields like technology and innovation.” 

Using the persona as part of the prompt, along with the text sample input, we allow the LLM to generate the entity and relationship-extraction prompt, including custom examples. Our indexing prompt is now automatically tuned to our new domain, as shown in Code Sample 3. 


Here, the automatically generated prompt using the sample content from Code Sample 2 identifies fourteen entity examples across six different entity types (person, location, group, concept, field, and geography) and eight relationship examples.

To assess how this impacts the extraction of the entire dataset, we used both the default and the auto-tuned prompt to generate the entity and relationship outputs. Before we explain the results, let’s review the default prompt’s outputs, which produced seven entities and six relationships, as shown in Code Sample 4. 

Code Sample 4: Default extraction output


Using the auto-tuned prompt, we achieved a deeper extraction, producing nine entities and eight relationships, as shown below in Code Sample 5.

Code Sample 5: Auto-tuned extraction output


Compared with the default prompt, the auto-tuned prompt is an improvement, with more entities and more relationships, providing a more comprehensive view of our data. One key difference between this output and the output from the default prompt is the expansion in entity types being extracted. The default prompt is limited to three example types: organization, geography, and person. However, the auto-tuned prompt expands to more example types derived from the sample input text: organization, person, location, and music genre. 
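One way to make this kind of comparison repeatable is to count the entity types in each extraction output. The sketch below assumes a delimiter-separated record format, with `<|>` between fields, `##` between records, and a `<|COMPLETE|>` end marker. These delimiters, the function names, and the toy records are assumptions for illustration only and are not taken from the podcast outputs.

```python
from collections import Counter

# Assumed delimiters; adjust them to match the actual extraction output.
TUPLE_DELIM = "<|>"
RECORD_DELIM = "##"
COMPLETION = "<|COMPLETE|>"


def parse_records(raw):
    """Split raw LLM output into lists of fields, one list per record."""
    records = []
    for part in raw.replace(COMPLETION, "").split(RECORD_DELIM):
        part = part.strip()
        if not (part.startswith("(") and part.endswith(")")):
            continue  # skip anything that is not a parenthesized record
        fields = [f.strip().strip('"') for f in part[1:-1].split(TUPLE_DELIM)]
        records.append(fields)
    return records


def entity_type_counts(raw):
    """Count extracted entities per entity type."""
    return Counter(
        r[2].upper() for r in parse_records(raw) if r[0] == "entity" and len(r) >= 3
    )


# Toy, made-up records used only to exercise the parser.
toy_output = (
    '("entity"<|>ALICE<|>PERSON<|>A podcast guest)##'
    '("entity"<|>JAZZ<|>MUSIC GENRE<|>A style of music)##'
    '("relationship"<|>ALICE<|>JAZZ<|>Alice plays jazz<|>7)<|COMPLETE|>'
)
print(entity_type_counts(toy_output))
```

Running the same counter over the default and auto-tuned outputs gives a quick view of which entity types each prompt surfaces.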

Putting it all together 

We can observe a clear difference in the final outputs after using these auto-tuned prompts for indexing the podcast source data. To measure this difference, we compared the size of the knowledge graphs built with the default and the auto-tuned prompts. The following results were achieved while keeping all parameters constant between both runs and using GPT-4 Turbo:

                     Entities   Relationships   Communities
Default prompt       1796       2851            352
Auto-tuned prompt    4896       8210            1027
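To put these counts in relative terms, the short calculation below divides each auto-tuned count by its default counterpart. The numbers come from the results above; the variable and function names are only illustrative.

```python
# Graph sizes reported for the two indexing runs.
graph_sizes = {
    "default": {"entities": 1796, "relationships": 2851, "communities": 352},
    "auto_tuned": {"entities": 4896, "relationships": 8210, "communities": 1027},
}


def growth_ratios(baseline, tuned):
    """Return tuned / baseline for every metric in the baseline."""
    return {metric: tuned[metric] / baseline[metric] for metric in baseline}


ratios = growth_ratios(graph_sizes["default"], graph_sizes["auto_tuned"])
for metric, ratio in ratios.items():
    print(f"{metric}: {ratio:.2f}x")
# entities: 2.73x
# relationships: 2.88x
# communities: 2.92x
```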

As shown, the use of auto-tuning yields a significantly larger knowledge graph, with roughly 2.7 times as many entities and about 2.9 times as many relationships and communities. This is expected: a prompt that looks for molecules, for example, will extract much more from a chemistry dataset than one that looks for people and places. More communities in the knowledge graph means that it can better serve global search queries. While the size of the knowledge graph can be a proxy for its utility, the true measure of utility is observed in end-to-end queries. To demonstrate this, we provide the following side-by-side comparison of a question using the default and auto-tuned prompts.


We asked GraphRAG the following question: “What’s the relationship between Ashley Llorens and Chicago?” 

First, we got the following results using the knowledge graph built with the default prompt. We then used the knowledge graph built with our auto-tuned prompt:

Default query results 

I am sorry but I am unable to answer this question given the provided data.

Auto-tuned query results 

**Origin and Influence**: Ashley Llorens has a significant connection to Chicago, which has deeply influenced his personal and professional life. Growing up in a city known for its vibrant music scene, particularly hip-hop, has shaped his musical pursuits and artistic expression.

**Professional Impact**: His background and experiences in Chicago have carried over into his professional environment, notably at the Applied Physics Laboratory, where he has integrated elements of hip-hop culture [Data: Reports (940)].

With auto-tuned indexing prompts, our knowledge graph became more representative of the dataset’s entities and relationships, enabling it to yield a valid response to the query. 

Advancing auto-tuning and expanding GraphRAG’s capabilities

Looking ahead, we’re exploring other methods to build on this auto-tuning work. We’re excited to investigate new approaches for creating the core GraphRAG knowledge graph and are also studying ways to measure and evaluate the quality of these graph structures. Additionally, we’re researching methods to better assess performance so that we can identify the types of queries where GraphRAG provides unique value. This includes evaluating human-generated versus auto-tuned prompts, as well as exploring potential improvements to the auto-tuner. 

Overall, these new auto-tuner developments make GraphRAG much more accessible and turnkey. We hope this auto-tuning work removes many of the challenges involved when working with new datasets. We invite you to try out these capabilities yourself using GraphRAG’s core library and our Azure-based solution accelerator, available on GitHub.
