Overview
Baseline RAG struggles with queries that require aggregating information across the entire dataset. A query such as “What are the top 5 themes in the data?” performs poorly because baseline RAG relies on vector search over semantically similar text, and nothing in the query points it at the right content.

GraphRAG’s global search addresses this by leveraging the structure of the LLM-generated knowledge graph, which reveals the dataset’s overall structure and themes: the dataset is organized into meaningful semantic clusters (communities) that are pre-summarized.
How it works
Global search operates in two stages: map and reduce.

Map stage

Given a user query and optional conversation history, global search uses LLM-generated community reports from a specified level of the graph’s community hierarchy as context data:

- Segmentation: Community reports are divided into text chunks of a pre-defined size
- Shuffling: Reports are randomly shuffled and distributed across batches
- Parallel processing: Each batch generates an intermediate response
- Rating: Each point in the intermediate responses receives a numerical importance rating
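The map-stage mechanics above can be sketched in plain Python. This is an illustration only, not GraphRAG code; the function names, the word-count token estimate, and the `rate_fn` stand-in for the LLM call are all assumptions:

```python
import random

def map_stage(reports, batch_token_limit, rate_fn, seed=0):
    """Illustrative map stage: shuffle reports, pack them into
    token-budgeted batches, and produce one intermediate response
    per batch via rate_fn (a stand-in for a parallel LLM call)."""
    random.Random(seed).shuffle(reports)  # shuffling reduces positional bias

    # Pack shuffled reports into batches that fit the context window.
    batches, current, used = [], [], 0
    for report in reports:
        tokens = len(report.split())  # crude token estimate
        if current and used + tokens > batch_token_limit:
            batches.append(current)
            current, used = [], 0
        current.append(report)
        used += tokens
    if current:
        batches.append(current)

    # In GraphRAG each batch is processed concurrently by the LLM,
    # which returns rated (point, importance-score) pairs.
    return [rate_fn(batch) for batch in batches]
```

Every report lands in exactly one batch, and no batch exceeds the token limit (unless a single report is itself larger than the limit).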
Reduce stage
The reduce stage aggregates and refines the intermediate responses:

- Filtering: Points are filtered by importance score (score > 0)
- Ranking: Remaining points are sorted by descending importance
- Selection: Top-ranked points are selected within the token budget
- Aggregation: Selected points are combined to generate the final response
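The four reduce steps can be illustrated with a minimal sketch. The names and the word-count token estimate are hypothetical, not GraphRAG internals, and the final join stands in for the LLM call that writes the answer:

```python
def reduce_stage(points, max_data_tokens):
    """Illustrative reduce stage over (text, importance_score) pairs
    produced by the map stage."""
    # Filtering: drop points the map stage rated unimportant.
    kept = [(text, score) for text, score in points if score > 0]

    # Ranking: most important points first.
    kept.sort(key=lambda p: p[1], reverse=True)

    # Selection: greedily take top-ranked points until the budget is spent.
    selected, used = [], 0
    for text, score in kept:
        tokens = len(text.split())  # crude token estimate
        if used + tokens > max_data_tokens:
            break
        selected.append(text)
        used += tokens

    # Aggregation: in GraphRAG this is a final LLM call; here we just join.
    return " ".join(selected)

points = [("theme A dominates", 9), ("minor detail", 0), ("theme B recurs", 7)]
print(reduce_stage(points, max_data_tokens=6))
# → theme A dominates theme B recurs
```

Note how the zero-scored point never reaches selection, and how shrinking `max_data_tokens` silently drops lower-ranked points.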
Configuration
The GlobalSearch class accepts the following key parameters:

| Parameter | Description |
| --- | --- |
| llm | Language model chat completion object for response generation |
| context_builder | Context builder object for preparing context data from community reports |
| map_system_prompt | Prompt template for the map stage. Default: MAP_SYSTEM_PROMPT |
| reduce_system_prompt | Prompt template for the reduce stage. Default: REDUCE_SYSTEM_PROMPT |
| response_type | Free-form text describing the desired response format (e.g., “Multiple Paragraphs”, “Multi-Page Report”) |
| allow_general_knowledge | If true, prompts the LLM to incorporate relevant real-world knowledge from outside the dataset. May increase hallucinations but is useful for certain scenarios |
| general_knowledge_inclusion_prompt | Instruction added to the reduce prompt when allow_general_knowledge is enabled. Default: GENERAL_KNOWLEDGE_INSTRUCTION |
| max_data_tokens | Token budget for context data |
| map_llm_params | Additional parameters (e.g., temperature, max_tokens) for the LLM call at the map stage |
| reduce_llm_params | Additional parameters (e.g., temperature, max_tokens) for the LLM call at the reduce stage |
| context_builder_params | Additional parameters passed to the context builder when building the context window for the map stage |
| concurrent_coroutines | Controls the degree of parallelism in the map stage |
| callbacks | Optional callback functions for custom event handling during LLM completion streaming |
API usage
Basic usage
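A minimal usage sketch, assuming GraphRAG’s Python query API. The import path, constructor arguments, and `asearch` method follow the library’s example notebooks but may differ across versions; `llm` and `context_builder` must already be constructed, and the `await` call should run inside an async context (e.g., a notebook):

```python
from graphrag.query.structured_search.global_search.search import GlobalSearch

# llm and context_builder are assumed to be constructed elsewhere,
# e.g. a chat model client and a community-report context builder
# (see the GraphRAG example notebooks for your version).
search_engine = GlobalSearch(
    llm=llm,
    context_builder=context_builder,
    max_data_tokens=12_000,
    response_type="Multiple Paragraphs",
)

result = await search_engine.asearch("What are the top 5 themes in the data?")
print(result.response)
```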
Streaming usage
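A streaming sketch under the same assumptions as the basic example (a constructed `search_engine`, run inside an async context). The streaming method name varies across GraphRAG versions, so check your version’s API reference:

```python
# Stream tokens as the reduce-stage answer is generated, assuming an
# async-generator streaming method on the search engine.
async for token in search_engine.astream_search(
    "What are the top 5 themes in the data?"
):
    print(token, end="", flush=True)
```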
Dynamic community selection
Performance considerations
Community hierarchy level
The community_level parameter significantly impacts search performance:
- Lower levels (closer to leaf nodes): More detailed reports, higher quality responses, but increased time and LLM resource usage
- Higher levels (closer to root): Broader summaries, faster responses, but potentially less detailed
Token budget optimization
Adjust max_data_tokens to balance quality and cost:
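A hypothetical tuning fragment (the values are illustrative, not recommendations, and `llm` and `context_builder` are assumed to exist):

```python
# Smaller budget: cheaper and faster, but more points are dropped
# during reduce-stage selection.
fast_search = GlobalSearch(
    llm=llm, context_builder=context_builder, max_data_tokens=4_000
)

# Larger budget: more points survive selection, at higher cost and
# latency; keep it within the model's context window.
thorough_search = GlobalSearch(
    llm=llm, context_builder=context_builder, max_data_tokens=16_000
)
```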
Parallelism tuning
Control parallel processing with concurrent_coroutines:
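The underlying mechanism is ordinary asyncio fan-out bounded by a semaphore. A self-contained sketch (not GraphRAG code; names are hypothetical):

```python
import asyncio

async def run_map_batches(batches, concurrent_coroutines):
    """Run one coroutine per batch, at most concurrent_coroutines at a time."""
    semaphore = asyncio.Semaphore(concurrent_coroutines)

    async def process(batch):
        async with semaphore:
            await asyncio.sleep(0)  # stand-in for the map-stage LLM call
            return f"summary of {batch}"

    # gather preserves input order regardless of completion order.
    return await asyncio.gather(*(process(b) for b in batches))

results = asyncio.run(run_map_batches(["b1", "b2", "b3"], concurrent_coroutines=2))
print(results)  # → ['summary of b1', 'summary of b2', 'summary of b3']
```

Raising the limit speeds up the map stage but increases instantaneous load on the LLM endpoint, so rate limits usually set the practical ceiling.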
Best practices
Choose the right community level
Start with level 2 and adjust based on your dataset size and query complexity
Use dynamic community selection for complex queries
Enable dynamic_community_selection=True for queries requiring variable depth

Customize response types
Specify clear response formats: “Single Paragraph”, “Multiple Paragraphs”, “Multi-Page Report”, “List of 5-10 Items”
Examples
Thematic analysis
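A plausible thematic-analysis query, assuming a constructed GlobalSearch instance named `search_engine` (hypothetical setup) and an async context:

```python
# Aggregation queries like this are the case global search is built for.
result = await search_engine.asearch(
    "What are the top 5 themes in the data, and which communities support each?"
)
print(result.response)
```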
Comprehensive report generation
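For long-form output, a sketch that widens the response type; `llm` and `context_builder` are assumed to be constructed elsewhere, and the string is free-form as described in the configuration section:

```python
# Ask for a long-form deliverable by changing response_type.
report_search = GlobalSearch(
    llm=llm,
    context_builder=context_builder,
    response_type="Multi-Page Report",
)
result = await report_search.asearch(
    "Produce a full report of the major topics in this dataset."
)
print(result.response)
```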
Next steps
Local search
Learn about entity-based search
DRIFT search
Explore hybrid search methods
Example notebooks
See global search in action
Configuration
Configure global search settings