GraphRAG follows semantic versioning and provides migration paths for upgrading between versions. This guide helps you navigate breaking changes and upgrade your projects smoothly.

Versioning approach

GraphRAG follows semantic versioning with some specific considerations:

CLI

Conforms to standard semver

API

Conforms to standard semver

settings.yaml

Changes result in minor version bump

Data model

Conforms to standard semver

Internals

May change without semver compliance
Always run graphrag init --root [path] --force after minor and major version upgrades to ensure you have the latest config format. Back up your customizations first.

General upgrade process

1. Back up your project

cp -r ./my-project ./my-project-backup
Especially important:
  • settings.yaml
  • prompts/ directory
  • .env file
2. Upgrade GraphRAG

pip install --upgrade graphrag
3. Check version

pip show graphrag
4. Update configuration

For minor/major version bumps:
graphrag init --root ./my-project --force
Then restore your customizations from backup.
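When restoring customizations, it helps to diff the regenerated settings.yaml against your backup so you can see exactly which settings to carry over. A minimal sketch using Python's difflib (the file paths are illustrative assumptions):

```python
import difflib
from pathlib import Path

def diff_settings(old_path: str, new_path: str) -> str:
    """Return a unified diff between a backed-up and a regenerated settings.yaml."""
    old = Path(old_path).read_text().splitlines(keepends=True)
    new = Path(new_path).read_text().splitlines(keepends=True)
    return "".join(difflib.unified_diff(old, new, fromfile=old_path, tofile=new_path))

# Usage (paths are illustrative):
#   print(diff_settings("./my-project-backup/settings.yaml", "./my-project/settings.yaml"))
# Lines prefixed with "-" are customizations missing from the regenerated file.
```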
5. Run migration (major versions only)

For major version upgrades, run the migration notebook (see version-specific sections below).

Migration to v3

GraphRAG v3 streamlined the core library by removing rarely-used features and simplifying configuration.

Overview

Migration notebook: docs/examples_notebooks/index_migration_to_v3.ipynb
Main goals:
  • Slim down maintenance overhead
  • Remove out-of-scope features
  • Simplify configuration model

Data model changes

The primary breaking change affects the text_units table.
Before v3:
# text_units had document_ids (plural) - a list
text_unit = {
    "id": "unit1",
    "text": "...",
    "document_ids": ["doc1", "doc2"]  # List
}
After v3:
# text_units has document_id (singular)
text_unit = {
    "id": "unit1",
    "text": "...",
    "document_id": "doc1"  # Single value
}
The migration notebook handles this transformation automatically, so you don't need to re-index.
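One plausible way to picture the transformation the notebook performs: each text unit carrying a list of document IDs becomes one row per document. This sketch uses plain dicts standing in for parquet rows (the actual notebook operates on the stored tables):

```python
def split_text_units(text_units):
    """Expand v2-style rows (document_ids list) into v3-style rows (single document_id)."""
    v3_rows = []
    for unit in text_units:
        for doc_id in unit["document_ids"]:
            # Copy every field except the plural list, then attach one document_id
            row = {k: v for k, v in unit.items() if k != "document_ids"}
            row["document_id"] = doc_id
            v3_rows.append(row)
    return v3_rows

v2 = [{"id": "unit1", "text": "...", "document_ids": ["doc1", "doc2"]}]
print(split_text_units(v2))  # one row for unit1/doc1 and one for unit1/doc2
```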

API changes

The multi-search API variants have been removed and are no longer available:
# These are gone in v3
from graphrag.api import multi_global_search  # ❌
from graphrag.api import multi_local_search   # ❌
Use instead:
# Single search methods remain
from graphrag.api import global_search  # ✓
from graphrag.api import local_search   # ✓

Configuration changes

Before v3 (fnllm-based):
llm:
  type: openai_chat  # ❌ No longer valid

embedding:
  type: azure_openai_embedding  # ❌ No longer valid
After v3 (LiteLLM-based):
llm:
  type: chat  # ✓ Generic type
  model_provider: openai  # Specify provider

embedding:
  type: embedding  # ✓ Generic type
  model_provider: azure  # Specify provider
Before v3:
llm:
  rate_limiting: auto  # ❌ No longer supported
After v3:
llm:
  requests_per_minute: 60  # ✓ Explicit limits
  tokens_per_minute: 80000
  # Or set a limit to null to disable it:
  # requests_per_minute: null
Before v3:
# Nested dict for multi-search support
vector_store:
  entity_description:
    type: lancedb
    db_uri: ./lancedb
  community_full_content:
    type: lancedb
    db_uri: ./lancedb

outputs:
  # Multi-search output configuration
  entity_description:
    type: parquet
After v3:
# Simplified single root-level object
vector_store:
  type: lancedb
  db_uri: ./lancedb
  
  # Optional custom schema
  index_schema:
    entity_description:
      index_name: entities
    community_full_content:
      index_name: communities

# No outputs block needed
The following configuration blocks have been removed:
# ❌ All removed in v3

umap:  # Removed - use Gephi for visualization
  enabled: false

embed_graph:  # Removed - no longer generates x/y positions
  enabled: false

workflows:
  entity_extraction:
    strategy:  # Removed - unused complexity
      type: nltk

input:
  file_filter:  # Removed - essentially unused
    include: ["*.txt"]

chunking:
  group_by_columns:  # Removed - unused grouping feature
    - document_type

Migration steps

1. Run migration notebook

Navigate to the migration notebook and execute all cells:
jupyter notebook docs/examples_notebooks/index_migration_to_v3.ipynb
This transforms your existing tables to the v3 format.
2. Update configuration

graphrag init --root ./my-project --force
3. Restore customizations

Manually copy over your custom settings:
  • API keys from .env
  • Model names
  • Custom prompts
  • Rate limits based on your quota
  • Provider-specific settings
4. Update API calls (if using the Python API)

Remove any multi_*_search calls and replace with single search methods.
5. Test the migration

Run a query to verify everything works:
graphrag query "test query" --root ./my-project --method global

Migration to v2

GraphRAG v2 renamed index tables for clarity.

Overview

Migration notebook: docs/examples_notebooks/index_migration_to_v2.ipynb

Table renames

All tables were renamed to simply describe their contents:
Old Name (v1)                      New Name (v2)
create_final_entities              entities
create_final_nodes                 nodes
create_final_communities           communities
create_final_community_reports     community_reports
create_final_text_units            text_units
create_final_relationships         relationships
create_final_documents             documents
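The v2 migration notebook performs these renames for you; as an illustration, the mapping can be applied to an output directory with a few lines of Python (the directory path is an assumption):

```python
from pathlib import Path

# v1 -> v2 table name mapping
V1_TO_V2 = {
    "create_final_entities": "entities",
    "create_final_nodes": "nodes",
    "create_final_communities": "communities",
    "create_final_community_reports": "community_reports",
    "create_final_text_units": "text_units",
    "create_final_relationships": "relationships",
    "create_final_documents": "documents",
}

def rename_tables(output_dir: str) -> list[str]:
    """Rename v1 parquet files in place; return the new table names that were applied."""
    renamed = []
    for old, new in V1_TO_V2.items():
        src = Path(output_dir) / f"{old}.parquet"
        if src.exists():
            src.rename(src.with_name(f"{new}.parquet"))
            renamed.append(new)
    return renamed

# rename_tables("./my-project/output")
```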

Migration steps

1. Run migration notebook

jupyter notebook docs/examples_notebooks/index_migration_to_v2.ipynb
2. Update configuration

graphrag init --root ./my-project --force
3. Verify table names

Check your output directory - tables should have new names:
ls ./my-project/output/*.parquet

Migration to v1

GraphRAG v1 introduced vector stores and streamlined the data model.

Overview

Migration notebook: docs/examples_notebooks/index_migration_to_v1.ipynb

Major changes

v1 requires a vector store for embeddings. New configuration:
vector_store:
  type: lancedb
  db_uri: ./lancedb
Default uses local LanceDB. For production, consider Azure AI Search.
ID fields:
  • Consistent use of id and human_readable_id
  • Integer IDs stored as ints (not strings)
Field renames:
  • document.raw_content → document.text
  • entity.name → entity.title
  • relationship.rank → relationship.combined_degree
Removed fields:
  • relationship.source_degree
  • relationship.target_degree
  • All embedding columns (now in vector store)
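The migration notebook applies these renames and removals to the stored tables; conceptually, per record, the change looks like this sketch (plain dicts standing in for parquet rows, and the rename map is a simplified illustration since each rename applies to a different table):

```python
# Simplified v1 rename map (each key belongs to a different table in practice)
RENAMES = {
    "raw_content": "text",        # documents
    "name": "title",              # entities
    "rank": "combined_degree",    # relationships
}
REMOVED = {"source_degree", "target_degree"}

def migrate_record(record: dict) -> dict:
    """Apply v1 field renames and drop removed fields from one row."""
    return {RENAMES.get(k, k): v for k, v in record.items() if k not in REMOVED}

rel = {"rank": 5, "source_degree": 2, "target_degree": 3, "weight": 1.0}
print(migrate_record(rel))  # {'combined_degree': 5, 'weight': 1.0}
```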
Community IDs:
  • id now uses proper UUID
  • community and human_readable_id retain short IDs
v1 added embeddings for DRIFT search and base RAG:
  • entity_description embeddings
  • community_full_content embeddings
  • text_unit_text embeddings
Before v1:
storage:
  base_dir: "output/${timestamp}/artifacts"  # ❌

reporting:
  base_dir: "output/${timestamp}/reports"  # ❌
After v1:
storage:
  base_dir: "output"  # ✓ Static path

reporting:
  base_dir: "output"  # ✓ Static path

Migration steps

1. Update configuration

graphrag init --root ./my-project --force
Note the new vector_store configuration block.
2. Remove timestamp paths

Edit settings.yaml or environment variables:
# In .env
GRAPHRAG_STORAGE_BASE_DIR=output
GRAPHRAG_REPORTING_BASE_DIR=output
3. Run migration notebook

jupyter notebook docs/examples_notebooks/index_migration_to_v1.ipynb
4. Re-index with vector store

Run indexing to populate the vector store:
graphrag index --root ./my-project
This leverages your existing cache for LLM calls.

Best practices

Always backup before upgrading

# Complete project backup
tar -czf my-project-backup-$(date +%Y%m%d).tar.gz ./my-project

# Or selective backup
mkdir -p backups
cp settings.yaml backups/settings.yaml.$(date +%Y%m%d)
cp -r prompts backups/prompts.$(date +%Y%m%d)
cp .env backups/.env.$(date +%Y%m%d)

Test on a copy first

cp -r ./my-project ./my-project-test
cd ./my-project-test
# Upgrade and test here first

Use cache to avoid re-indexing costs

GraphRAG’s cache prevents redundant LLM calls:
settings.yaml
cache:
  type: file
  base_dir: ./cache
After migration, re-indexing will use cached LLM responses, saving time and money.

Track your version

Add version info to your project:
settings.yaml
name: "my-project"
# Add version metadata
metadata:
  graphrag_version: "3.0.0"
  last_updated: "2024-03-15"
  migration_notes: "Migrated from v2 to v3"

Read release notes

Before upgrading, review the release notes and the breaking changes document in the GraphRAG repository.

Troubleshooting

Migration notebook fails

Possible causes:
  • Corrupted parquet files
  • Missing columns
  • Incompatible data types
Solutions:
  • Check notebook output for specific error
  • Verify parquet files can be read: pd.read_parquet("output/entities.parquet")
  • Re-index from scratch if data is corrupted
Configuration errors after upgrade

Solution:
  • Run dry-run to identify issues:
graphrag index --root ./my-project --dry-run --verbose
  • Compare your config to the latest template
  • Check for removed or renamed settings
Queries fail after migration

Common issues:
  • Old parquet file names
  • Missing vector store setup
  • Incompatible data schema
Solutions:
  • Run the appropriate migration notebook
  • Verify vector store configuration
  • Re-index if needed
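A quick post-migration check is to confirm that the core renamed tables are present before debugging queries further. A small sketch (the table list covers a few core tables, and the path is an assumption):

```python
from pathlib import Path

# A few of the core post-v2 table names
EXPECTED_TABLES = [
    "entities", "communities", "community_reports",
    "text_units", "relationships", "documents",
]

def missing_tables(output_dir: str) -> list[str]:
    """Return expected table files that are absent from the output directory."""
    out = Path(output_dir)
    return [t for t in EXPECTED_TABLES if not (out / f"{t}.parquet").exists()]

# missing = missing_tables("./my-project/output")
# An empty list means all expected tables are present.
```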
Import errors

v3 specific:
# This will fail in v3
from graphrag.api import multi_global_search  # ❌

# Use this instead
from graphrag.api import global_search  # ✓
Update all import statements to use single search methods.

Version compatibility matrix

GraphRAG Version   Python Version   Key Features                       Data Model Version
3.x                ≥3.10            LiteLLM, simplified config         v3
2.x                ≥3.10            Renamed tables                     v2
1.x                ≥3.10            Vector stores, streamlined model   v1
<1.0               ≥3.10            Pre-release                        v0

Getting help

If you encounter issues during migration:
  1. Check the breaking changes document
  2. Review GitHub Issues
  3. Ask in GitHub Discussions
  4. Consult version-specific migration notebooks

Next steps

Configuration

Learn about all config options

Best practices

Optimize your implementation

CLI usage

Master the command-line interface

Python API

Use GraphRAG programmatically
