The update command incrementally updates an existing knowledge graph index with new documents, preserving existing entities and relationships while adding new ones.
Usage
Options
- Project root (alias: -r): The project root directory containing the settings.yaml configuration file and existing index.
- Indexing method (alias: -m): The indexing method to use for the update. Available methods:
  - standard: Traditional GraphRAG indexing with LLM-based extraction
  - fast: Fast indexing using NLP-based extraction
- Verbose (alias: -v): Run the update pipeline with verbose logging.
- Cache: Use LLM response caching to avoid redundant API calls. Use --no-cache to disable caching.
- Skip validation: Skip any preflight validation checks. Useful when running without LLM steps.
Examples
Basic update
Update the index with new documents.
Specify project directory
Use fast update method
Verbose logging
Disable caching
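As a hedged sketch, the five cases in this section can be written as argument lists for Python's subprocess module. It uses only the short aliases and the --no-cache flag named under Options. The `graphrag update` command name is inferred from the `graphrag index` command mentioned later on this page, and the helper name `update_command` and the path `./my_project` are illustrative assumptions.

```python
import shlex


def update_command(root=None, method=None, verbose=False, cache=True):
    """Build an argv list for the update command (hypothetical helper)."""
    argv = ["graphrag", "update"]
    if root is not None:
        argv += ["-r", str(root)]      # project root directory
    if method is not None:
        if method not in ("standard", "fast"):
            raise ValueError(f"unknown method: {method!r}")
        argv += ["-m", method]         # indexing method
    if verbose:
        argv.append("-v")              # verbose logging
    if not cache:
        argv.append("--no-cache")      # disable LLM response caching
    return argv


# "./my_project" is a placeholder path, not a required location.
examples = {
    "Basic update": update_command(),
    "Specify project directory": update_command(root="./my_project"),
    "Use fast update method": update_command(method="fast"),
    "Verbose logging": update_command(verbose=True),
    "Disable caching": update_command(cache=False),
}

for label, argv in examples.items():
    # shlex.join renders the list as a copy-pasteable shell command line.
    print(f"{label}: {shlex.join(argv)}")
```

Each list can be passed to `subprocess.run` as-is. Building a list rather than a single string avoids shell quoting problems when the project path contains spaces.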
Output
By default, update outputs are saved to the update_output/ directory to avoid overwriting your existing index. This allows you to:
- Review the updated index before replacing the original
- Keep multiple versions of the index
- Safely roll back if needed
The output directory contains the following files:
- entities.parquet
- relationships.parquet
- communities.parquet
- community_reports.parquet
- text_units.parquet
- covariates.parquet (if enabled)
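Before reviewing an update, it can help to confirm that the files named above were actually written. The following is a minimal sketch using only the standard library. The function name `missing_outputs` is an assumption for illustration, and the check looks only at file presence, not at file contents.

```python
from pathlib import Path

# File names taken from the list above; covariates.parquet is optional.
EXPECTED_FILES = [
    "entities.parquet",
    "relationships.parquet",
    "communities.parquet",
    "community_reports.parquet",
    "text_units.parquet",
]
OPTIONAL_FILES = ["covariates.parquet"]  # only present if enabled


def missing_outputs(directory):
    """Return the expected file names that are absent from `directory`."""
    base = Path(directory)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]
```

Calling `missing_outputs("update_output")` returns an empty list when every required file is present.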
How update works
The update process:
- Load existing index: Reads the current index from the output/ directory
- Process new documents: Identifies and processes new documents in the input/ directory
- Merge entities: Integrates new entities with existing ones, resolving duplicates
- Update relationships: Adds new relationships and updates existing ones
- Recompute communities: Recalculates community structure with the updated graph
- Generate reports: Creates or updates community reports
- Save to update_output: Writes the updated index to the update_output/ directory
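To make the merge step more concrete, here is a simplified illustration of integrating new entities with existing ones while resolving duplicates. It is not GraphRAG's actual merge logic. The dictionary keys `title` and `description` and the case-insensitive matching rule are assumptions chosen for the example.

```python
def merge_entities(existing, new):
    """Merge two lists of entity dicts, matching titles case-insensitively.

    Illustrative only: the real pipeline may use different keys and rules.
    """
    merged = {}
    for entity in list(existing) + list(new):
        key = entity["title"].strip().casefold()
        if key not in merged:
            merged[key] = dict(entity)  # copy so the inputs are not mutated
            continue
        kept = merged[key]
        # Append the description only if it adds something new.
        if entity["description"] not in kept["description"]:
            kept["description"] = f'{kept["description"]} {entity["description"]}'
    return list(merged.values())


existing = [{"title": "Acme Corp", "description": "A manufacturer."}]
new = [
    {"title": "ACME CORP", "description": "Based in Ohio."},
    {"title": "Jane Doe", "description": "An engineer."},
]
print(merge_entities(existing, new))
```

In this sketch the existing entity keeps its original title and gains the new description, while the unmatched entity is appended unchanged.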
Incremental vs. full reindex
Use update when:
- You have new documents to add to an existing index
- You want to preserve existing extracted entities and relationships
- You need faster processing for incremental changes
Use index when:
- Building a new index from scratch
- You’ve made significant changes to prompts or configuration
- You want to completely rebuild the knowledge graph
Configuration
You can customize the update output directory in settings.yaml.
Best practices
- Back up your index: Before running update, back up your output/ directory
- Review updates: Check the update_output/ directory before replacing your production index
- Consistent configuration: Use the same settings for update as you used for the initial index
- Monitor duplicates: Review merged entities to ensure proper deduplication
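The backup and replace steps could be scripted roughly as follows. This is a sketch under stated assumptions: it assumes update_output/ holds a complete index as described in the Output section, and the function names and the `output_backup_<timestamp>` directory name are hypothetical rather than anything GraphRAG creates itself.

```python
import shutil
import time
from pathlib import Path


def backup_index(project_root, stamp=None):
    """Copy output/ to a timestamped sibling directory and return its path."""
    root = Path(project_root)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    target = root / f"output_backup_{stamp}"  # hypothetical naming scheme
    # copytree raises FileExistsError if the target already exists,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(root / "output", target)
    return target


def promote_update(project_root):
    """Back up output/, then replace it with the contents of update_output/."""
    root = Path(project_root)
    backup = backup_index(root)
    shutil.rmtree(root / "output")
    shutil.copytree(root / "update_output", root / "output")
    return backup
```

`promote_update` copies rather than moves, so update_output/ stays in place for comparison until you delete it yourself.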
Merging updates into production
After reviewing the updated index, replace your production index in output/ with the contents of update_output/.
Performance considerations
- Update is generally faster than a full reindex because it only processes new documents
- Community detection is recalculated on the full graph, which can take time for large indexes
- Caching helps avoid re-processing similar content
Error handling
The update command will exit with status code 1 if errors occur. Common issues:
- Missing existing index: Ensure you’ve run graphrag index first
- Incompatible schema: Use the same GraphRAG version for index and update
- Configuration mismatch: Maintain consistent settings between index and update operations
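Because failures are reported through the exit status, a wrapper script can check the return code before doing anything else with the update output. The helper below is a minimal sketch. The name `run_update` is an assumption, and it works for any command passed as an argument list.

```python
import subprocess


def run_update(argv):
    """Run a command and return (succeeded, captured_text)."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True)
    except FileNotFoundError:
        # The executable is not installed or not on PATH.
        return False, f"command not found: {argv[0]}"
    if result.returncode != 0:
        # A non-zero exit status (1 for the update command) signals an error.
        return False, result.stderr
    return True, result.stdout
```

For example, `run_update(["graphrag", "update"])` returns `(False, ...)` when the command exits with status 1, so a script can stop before replacing the production index.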