The graphrag init command is the easiest way to get started with GraphRAG. It creates a complete project structure with configuration files, environment variables, and default prompts.

Usage

graphrag init [--root PATH] [--force, --no-force]

Options

  • --root (string, default: current directory) - The project root directory in which to initialize GraphRAG
  • --force (boolean, default: false) - Overwrite existing configuration and prompt files if they exist

Example

Initialize a new GraphRAG project in a specific directory:
graphrag init --root ./ragtest
This creates a complete project structure:
ragtest/
├── settings.yaml      # Main configuration file
├── .env              # Environment variables
└── prompts/          # LLM prompt templates
    ├── extract_graph.txt
    ├── summarize_descriptions.txt
    ├── community_report_graph.txt
    └── ...

Generated files

settings.yaml

The main configuration file containing all GraphRAG settings:
settings.yaml
### LLM settings ###
completion_models:
  default_completion_model:
    model_provider: openai
    model: gpt-4.1
    auth_method: api_key
    api_key: ${GRAPHRAG_API_KEY}
    retry:
      type: exponential_backoff

embedding_models:
  default_embedding_model:
    model_provider: openai
    model: text-embedding-3-large
    auth_method: api_key
    api_key: ${GRAPHRAG_API_KEY}
    retry:
      type: exponential_backoff

### Document processing settings ###
input:
  type: text  # [csv, text, json, jsonl]

chunking:
  type: tokens
  size: 1200
  overlap: 100
  encoding_model: o200k_base

### Storage settings ###
input_storage:
  type: file  # [file, blob, cosmosdb]
  base_dir: "input"

output_storage:
  type: file  # [file, blob, cosmosdb]
  base_dir: "output"

cache:
  type: json  # [json, memory, none]
  storage:
    type: file  # [file, blob, cosmosdb]
    base_dir: "cache"

vector_store:
  type: lancedb
  db_uri: output/lancedb

### Workflow settings ###
extract_graph:
  completion_model_id: default_completion_model
  prompt: "prompts/extract_graph.txt"
  entity_types: [organization, person, geo, event]
  max_gleanings: 1
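The exponential_backoff retry type configured for both models follows a standard pattern: on each failed call, wait roughly twice as long as before (up to a cap, with random jitter) before retrying. A minimal sketch of that general technique, not GraphRAG's internal retry implementation; the function and parameter names here are hypothetical:

```python
import random
import time

def call_with_backoff(call, max_retries=5, base_delay=1.0, max_delay=30.0):
    """Retry `call` on failure, doubling the delay each attempt (with jitter)."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay * random.random())  # "full jitter" variant
```

Jitter spreads retries from concurrent workers over time, which avoids hammering a rate-limited API in lockstep.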
See the settings reference for complete documentation of all available options.
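The token-based chunking settings (size 1200, overlap 100) slide a fixed-size window over the tokenized document, with consecutive windows sharing `overlap` tokens so that context is not cut off at chunk boundaries. A minimal sketch of that scheme over an already-tokenized sequence (`chunk_tokens` is a hypothetical name, not a GraphRAG API):

```python
def chunk_tokens(tokens, size=1200, overlap=100):
    """Split a token sequence into chunks of up to `size` tokens,
    where consecutive chunks share `overlap` tokens."""
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window reached the end of the sequence
    return chunks

# A 3000-token document yields 3 chunks at the default settings,
# each adjacent pair overlapping by 100 tokens.
```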

.env

Environment variables file for storing sensitive credentials:
.env
GRAPHRAG_API_KEY=<API_KEY>
Never commit your .env file to version control. Add it to .gitignore to keep your API keys secure.
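When settings.yaml is loaded, placeholders like ${GRAPHRAG_API_KEY} are resolved from the environment (populated from .env). A minimal sketch of that substitution pattern, assuming simple ${VAR} syntax; the helper name is hypothetical, not part of GraphRAG's API:

```python
import os
import re

def expand_env_vars(text):
    """Replace ${VAR} placeholders with values from the environment."""
    return re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), ""), text)

os.environ["GRAPHRAG_API_KEY"] = "sk-demo"
expand_env_vars("api_key: ${GRAPHRAG_API_KEY}")  # -> "api_key: sk-demo"
```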

prompts/

The prompts directory contains default LLM prompt templates that you can customize:

  • extract_graph.txt - Prompt for extracting entities and relationships from text
  • summarize_descriptions.txt - Prompt for summarizing entity descriptions
  • community_report_graph.txt - Prompt for generating graph-based community reports
  • community_report_text.txt - Prompt for generating text-based community reports
Additional prompt files include:
  • extract_claims.txt - Claim extraction (disabled by default)
  • local_search_system_prompt.txt - Local search queries
  • global_search_map_system_prompt.txt - Global search mapping
  • global_search_reduce_system_prompt.txt - Global search reduction
  • drift_search_system_prompt.txt - DRIFT search queries
  • basic_search_system_prompt.txt - Basic search queries

Customizing prompts

You can modify the generated prompts to better suit your domain:
1. Edit prompt files - Open and modify the prompt files in the prompts/ directory
2. Test with your data - Run indexing to see how the modified prompts perform
3. Iterate and refine - Adjust prompts based on extraction quality and results
For data-adapted prompts, run the Auto Prompt Tuning command after initialization.

Using the --force flag

By default, init will not overwrite existing files. Use --force to regenerate:
graphrag init --root ./ragtest --force
The --force flag will overwrite:
  • settings.yaml
  • .env (your API keys will be lost)
  • All files in prompts/
Back up any customizations before using --force.
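A backup before a forced re-init can be a simple copy. A sketch using the ./ragtest example from above; the project files are created inline here only so the snippet is self-contained, and stand in for an already-initialized project:

```shell
set -e
# stand-in for an existing initialized project (illustration only)
mkdir -p ragtest/prompts
echo "my customized prompt" > ragtest/prompts/extract_graph.txt
echo "# my settings" > ragtest/settings.yaml

# back up customizations; afterwards it is safe to run:
#   graphrag init --root ./ragtest --force
cp -r ragtest/prompts ragtest/prompts.bak
cp ragtest/settings.yaml ragtest/settings.yaml.bak
```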

Next steps

After initialization, you’re ready to:

  • Configure API keys - Edit .env to add your language model API credentials
  • Add source documents - Place your documents in the input/ directory
  • Tune prompts - Run auto prompt tuning to adapt prompts to your data
  • Start indexing - Run graphrag index to process your documents
