Indexing methods

GraphRAG is a platform for research into RAG indexing methods that produce optimal context window content for language models. This page documents the available indexing methods.

Standard GraphRAG

This is the method described in the original blog post. Standard uses a language model for all reasoning tasks.

How it works

Entity extraction

LLM is prompted to extract named entities from each text unit and provide a description for each.

Relationship extraction

LLM is prompted to describe the relationship between each pair of entities in each text unit.

Entity summarization

LLM is prompted to combine the descriptions for every instance of an entity found across text units into a single summary.

Relationship summarization

LLM is prompted to combine the descriptions for every instance of a relationship found across text units into a single summary.

Claim extraction (optional)

LLM is prompted to extract and describe claims from each text unit.

Community report generation

Entity and relationship descriptions (and optionally claims) for each community are collected and used to prompt the LLM to generate a summary report.
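The summarization steps above can be sketched roughly as follows. This is a minimal illustration, not GraphRAG's actual implementation: the function names and the `(text_unit_id, entity, description)` triple layout are assumptions made for the example, and `join_descriptions` is a placeholder standing in for the language model call.

```python
from collections import defaultdict
from typing import Callable, Iterable


def summarize_entities(
    extractions: Iterable[tuple[str, str, str]],
    summarize: Callable[[str, list[str]], str],
) -> dict[str, str]:
    """Merge per-text-unit descriptions into one summary per entity.

    extractions: (text_unit_id, entity_name, description) triples (assumed layout).
    summarize: hypothetical stand-in for the LLM summarization prompt.
    """
    descriptions: dict[str, list[str]] = defaultdict(list)
    for _unit_id, entity, description in extractions:
        descriptions[entity].append(description)
    # One summarization call per entity, covering every instance found.
    return {entity: summarize(entity, descs) for entity, descs in descriptions.items()}


def join_descriptions(entity: str, descs: list[str]) -> str:
    # Placeholder "summarizer": a real pipeline would prompt a language model here.
    return " ".join(descs)


extractions = [
    ("t1", "Ada Lovelace", "A mathematician."),
    ("t2", "Ada Lovelace", "Wrote notes on the Analytical Engine."),
    ("t2", "Charles Babbage", "Designed the Analytical Engine."),
]
summaries = summarize_entities(extractions, join_descriptions)
```

Relationship summarization would follow the same pattern, keyed on an entity pair instead of a single entity name.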

Usage

graphrag index --method standard

Since Standard is the default method, the --method parameter can be omitted on the command line.

FastGraphRAG

FastGraphRAG is a hybrid technique that replaces some of the language model reasoning with traditional natural language processing (NLP) methods. This makes indexing faster and cheaper than the Standard method.

How it works

Entity extraction

Entities are noun phrases extracted using NLP libraries such as NLTK and spaCy. No description is generated; the source text unit is used in its place.

Relationship extraction

Relationships are defined as text unit co-occurrence between entity pairs. There is no description.
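As a rough sketch of what co-occurrence means here, the snippet below counts how many text units contain both members of each entity pair. The function name, the input layout, and the use of the count as an edge weight are assumptions for illustration, not a description of GraphRAG's internals.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(unit_entities: dict[str, set[str]]) -> Counter:
    """Count, for each unordered entity pair, how many text units contain both."""
    edges: Counter = Counter()
    for entities in unit_entities.values():
        # sorted() gives each pair a canonical order so (a, b) and (b, a) merge.
        for pair in combinations(sorted(entities), 2):
            edges[pair] += 1
    return edges


# Hypothetical noun phrases found in three text units.
unit_entities = {
    "t1": {"graph", "index", "language model"},
    "t2": {"graph", "index"},
    "t3": {"language model"},
}
edges = cooccurrence_edges(unit_entities)
```

A text unit with a single entity, such as `t3`, contributes no edges, which is one reason chunk size affects the resulting graph.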

Entity summarization

Not necessary (skipped).

Relationship summarization

Not necessary (skipped).

Claim extraction

Unused (always skipped).

Community report generation

The direct text unit content containing each entity noun phrase is collected and used to prompt the LLM to generate a summary report.
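One way to picture this collection step is shown below. It is a simplified sketch under assumed data structures (`unit_text` and `unit_entities` are hypothetical names), and it leaves out the prompt construction and the LLM call.

```python
def community_context(
    community_entities: list[str],
    unit_text: dict[str, str],
    unit_entities: dict[str, set[str]],
) -> list[str]:
    """Collect the text of every unit that mentions an entity in the community."""
    members = set(community_entities)
    # Keep input order and include each text unit once,
    # even if it mentions several community members.
    return [
        text
        for unit_id, text in unit_text.items()
        if unit_entities.get(unit_id, set()) & members
    ]


unit_text = {
    "t1": "The graph index is built from text units.",
    "t2": "A language model writes the community report.",
    "t3": "Chunk size controls co-occurrence.",
}
unit_entities = {
    "t1": {"graph index", "text units"},
    "t2": {"language model", "community report"},
    "t3": {"chunk size"},
}
context = community_context(["graph index", "community report"], unit_text, unit_entities)
```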

Usage

graphrag index --method fast

NLP configuration options

FastGraphRAG has several NLP options built in. By default, NLTK + regular expressions are used for noun phrase extraction, which is very fast but primarily suitable for English.

NLTK (default)

Fast and suitable for English text. Uses regular expressions for noun phrase extraction.

settings.yaml

extract_graph_nlp:
  enabled: true
  # NLTK is used by default

spaCy semantic

Uses spaCy’s semantic parsing for more accurate extraction across multiple languages.

settings.yaml

extract_graph_nlp:
  enabled: true
  parser: spacy_semantic
  spacy_model: en_core_web_md  # or any supported model

spaCy CFG

Uses spaCy’s context-free grammar parser for structured extraction.

settings.yaml

extract_graph_nlp:
  enabled: true
  parser: spacy_cfg
  spacy_model: en_core_web_md

spaCy models requirement

This package requires spaCy models to function correctly. If the required model is not installed, the package will automatically download and install it the first time it is used. You can also install it manually by running:

python -m spacy download <model_name>
# Example:
python -m spacy download en_core_web_md

Recommended chunk size

For FastGraphRAG, it’s recommended to configure text chunking to produce much smaller chunks (50-100 tokens). This results in a better co-occurrence graph.

settings.yaml

chunks:
  size: 75  # Smaller chunks for FastGraphRAG
  overlap: 10
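To make the `size` and `overlap` values concrete, here is a small sliding-window sketch. It assumes that `overlap` means each chunk repeats the last 10 tokens of the previous one, and it uses pre-split placeholder tokens rather than a real tokenizer; GraphRAG's own chunker may differ in both respects.

```python
def chunk_tokens(tokens: list[str], size: int = 75, overlap: int = 10) -> list[list[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens with the previous window."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks


# 200 placeholder tokens stand in for a tokenized document.
tokens = [f"tok{i}" for i in range(200)]
chunks = chunk_tokens(tokens)
```

With these settings each window advances by 65 tokens, so a 200-token input yields three chunks, the last one shorter than the rest.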

Choosing a method

Use the comparison below to decide which method is right for your use case:

Standard GraphRAG

Best for:

High-fidelity entity descriptions
Graph exploration and analysis
Rich semantic relationships
Production-quality knowledge graphs

Trade-offs:

Higher LLM costs (graph extraction accounts for roughly 75% of total indexing cost)
Slower processing time
Requires more tokens

FastGraphRAG

Best for:

Summary questions using global search
Cost-sensitive applications
Large-scale datasets
Rapid prototyping

Trade-offs:

Graph is less directly relevant for use outside of GraphRAG
Noisier graph structure
Entity descriptions are source text

Performance comparison

Graph extraction constitutes roughly 75% of indexing cost. FastGraphRAG is therefore much cheaper, but the tradeoff is that the extracted graph is less directly relevant for use outside of GraphRAG.

Cost estimation

Method	LLM API Calls	Relative Cost	Processing Time
Standard	High	100%	Slower
Fast	Low	~25%	Faster
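The ~25% figure follows from the 75% extraction share stated above. The arithmetic can be written out as a back-of-the-envelope estimate; the function below is hypothetical and assumes that the NLP-based extraction adds no LLM cost and that every other indexing step costs the same under both methods.

```python
def estimate_fast_cost(standard_cost: float, extraction_share: float = 0.75) -> float:
    """Estimate FastGraphRAG cost by removing the LLM graph extraction share."""
    if not 0.0 <= extraction_share <= 1.0:
        raise ValueError("extraction_share must be between 0 and 1")
    # Whatever is not graph extraction is assumed to cost the same in both methods.
    return standard_cost * (1.0 - extraction_share)


# A Standard run costing 100 units leaves roughly 25 units under this assumption.
estimate = estimate_fast_cost(100.0)
```

Actual savings depend on the dataset, the models used, and the chunking configuration, so treat the result as an order-of-magnitude guide only.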

Quality comparison

Aspect	Standard	Fast
Entity descriptions	Rich, LLM-generated	Source text only
Relationship descriptions	Rich, LLM-generated	Co-occurrence based
Graph quality	High fidelity	Noisier
Community reports	Description-based	Text-based
Graph exploration	Excellent	Good
Global search quality	Excellent	Excellent

Recommendations

When to use Standard GraphRAG

Choose Standard GraphRAG if:

You need high-fidelity entities and relationships
Graph exploration is important to your use case
You want to use the graph outside of GraphRAG queries
You’re building a production knowledge base
Cost is not the primary concern

When to use FastGraphRAG

Choose FastGraphRAG if:

Your primary use case is summary questions using global search
You’re working with large-scale datasets
You need to minimize LLM costs
You’re prototyping or experimenting
Processing speed is critical

Next steps

Data flow

Learn how data flows through each method

Configuration

Configure your chosen indexing method

Outputs

Understand the output schemas
