Local Files Connector
The local files connector ingests files from your local filesystem using glob patterns, with automatic gitignore support.
Import
Basic Usage
Configuration
Pattern
Glob pattern to match files:
Current Working Directory
Base directory for the glob pattern. Defaults to process.cwd().
Ingestion Strategy
Control when to ingest:
- contentChanged - Always ingest, skip unchanged files
- never - Only ingest if source doesn’t exist
- expired - Only ingest if expired
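The three strategies can be read as a small decision function. The sketch below is only an illustration: the strategy names come from this page, while the SourceState type, its fields, and the shouldIngest helper are hypothetical and not part of the connector's API. It also assumes that a source which has never been ingested counts as due under expired.

```typescript
// Strategy names as listed above; everything else here is a hypothetical sketch.
type IngestWhen = "contentChanged" | "never" | "expired";

interface SourceState {
  exists: boolean;     // has this source been ingested before?
  expiresAt?: number;  // epoch milliseconds; undefined means no expiry recorded
}

function shouldIngest(
  strategy: IngestWhen,
  state: SourceState,
  now: number = Date.now(),
): boolean {
  switch (strategy) {
    case "contentChanged":
      // Always run; unchanged files are skipped later, per file.
      return true;
    case "never":
      // Only ingest when the source does not exist yet.
      return !state.exists;
    case "expired":
      // Assumption: a missing source is treated as due for ingestion.
      return (
        !state.exists ||
        (state.expiresAt !== undefined && state.expiresAt <= now)
      );
  }
}
```

With this reading, never suits content that is ingested once, while expired re-runs only after the recorded expiry time has passed.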
Expiry
Set content expiration:
Glob Patterns
All Files of Type
Specific Directory
Multiple Extensions
Use brace expansion:
Specific Files
Excluding Patterns
Use negation (handled by gitignore):
Gitignore Support
The connector automatically respects .gitignore files:
How It Works
- Collect Patterns - Read all .gitignore files from root to target
- Filter Files - Exclude files matching gitignore patterns
- Cache Patterns - Cache for performance
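The collect and cache steps above can be sketched as follows. This is a rough illustration rather than the connector's implementation: gitignoreCandidates and cachedPatterns are hypothetical helpers, paths are assumed to be POSIX-style, and the file reader is passed in so the example stays self-contained. Matching files against the collected patterns (the filter step) is left out, since gitignore matching has many rules of its own.

```typescript
// Lists the .gitignore files to read, ordered from the root down to the target directory.
function gitignoreCandidates(root: string, targetDir: string): string[] {
  const base = root.replace(/\/+$/, "");
  if (targetDir !== base && !targetDir.startsWith(base + "/")) {
    throw new Error(`${targetDir} is not inside ${root}`);
  }
  const segments = targetDir.slice(base.length).split("/").filter(Boolean);
  const paths = [`${base}/.gitignore`];
  let current = base;
  for (const segment of segments) {
    current = `${current}/${segment}`;
    paths.push(`${current}/.gitignore`);
  }
  return paths;
}

// Parsed patterns are cached per file so each .gitignore is read at most once.
const patternCache = new Map<string, string[]>();

function cachedPatterns(
  file: string,
  read: (file: string) => string,
): string[] {
  let patterns = patternCache.get(file);
  if (patterns === undefined) {
    patterns = read(file)
      .split("\n")
      .map((line) => line.trim())
      // Drop blank lines and comments.
      .filter((line) => line !== "" && !line.startsWith("#"));
    patternCache.set(file, patterns);
  }
  return patterns;
}
```

Reading from the root downward mirrors how nested .gitignore files layer on top of the rules from their parent directories.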
Example
Given .gitignore:
node_modules/**
dist/**
*.log
Additional Exclusions
These are always excluded:
- **/node_modules/**
- **/.git/**
- **/.DS_Store
- **/Thumbs.db
- **/*.tmp
- **/*.temp
- **/coverage/**
- **/dist/**
- **/build/**
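To make the effect of these built-in exclusions concrete, here is a minimal sketch that checks a relative POSIX-style path against the same names. It is an approximation written for illustration: isAlwaysExcluded is a hypothetical helper, and it compares path segments directly instead of running a real glob matcher, so edge cases may differ from the connector's behavior.

```typescript
// Directory names, file names, and extensions taken from the exclusion list above.
const EXCLUDED_DIRS = new Set(["node_modules", ".git", "coverage", "dist", "build"]);
const EXCLUDED_FILES = new Set([".DS_Store", "Thumbs.db"]);
const EXCLUDED_EXTENSIONS = [".tmp", ".temp"];

function isAlwaysExcluded(relativePath: string): boolean {
  const segments = relativePath.split("/").filter(Boolean);
  const fileName = segments.pop() ?? "";
  // Any excluded directory anywhere above the file excludes it.
  if (segments.some((dir) => EXCLUDED_DIRS.has(dir))) return true;
  if (EXCLUDED_FILES.has(fileName)) return true;
  return EXCLUDED_EXTENSIONS.some((ext) => fileName.endsWith(ext));
}
```

For example, packages/a/node_modules/x/index.js and notes.tmp would both be skipped, while src/builder.ts would not, because only whole directory names are compared.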
Source ID
glob:{pattern}
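As a small illustration of this format, the source ID for a pattern could be built as below. The sourceIdForPattern helper is hypothetical; only the glob:{pattern} shape comes from this page.

```typescript
// Builds a source ID in the documented glob:{pattern} format.
function sourceIdForPattern(pattern: string): string {
  return `glob:${pattern}`;
}
```

Because the ID is derived from the pattern text, two different patterns yield two distinct source IDs even if they happen to match the same files.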
Document IDs
Document IDs are absolute file paths:
Examples
Ingest Documentation
Ingest Source Code
Multiple Patterns
Ingest from multiple patterns:
Search Documentation
One-Time Ingestion
Time-Based Re-ingestion
Performance
File Filtering
Files are filtered efficiently:
- Fast-glob - Fast file matching
- Gitignore Cache - Cached pattern matching
- Directory Grouping - Optimize gitignore reads
Large Directories
For large codebases, use specific patterns:
Error Handling
File Read Errors
Empty string fallback for read errors:
No Files Found
Pattern Errors
Working Directory
The cwd option sets the base directory:
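One way to picture the role of the base directory is how a relative match turns into an absolute path, which is the form document IDs take. The sketch below uses Node's path module; toDocumentId is a hypothetical helper, not a function exported by the connector, and it assumes matches are reported relative to cwd.

```typescript
import { isAbsolute, resolve } from "node:path";

// Resolves a path matched relative to the base directory into an absolute path.
// When no base directory is given, process.cwd() is used.
function toDocumentId(matchedPath: string, cwd: string = process.cwd()): string {
  return isAbsolute(matchedPath) ? matchedPath : resolve(cwd, matchedPath);
}
```

Setting cwd to a subdirectory lets the pattern stay short (for example guide/*.md instead of a longer path from the project root) while the resulting IDs remain absolute.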
Symbolic Links
Symbolic links are not followed:
Hidden Files
Dot files are excluded by default:
Change Detection
Files are automatically compared using content hashing:
Best Practices
Use Specific Patterns
Be specific to reduce file scanning:
Use .gitignore to exclude files:
Use cwd for cleaner patterns:
Choose an ingestion strategy that matches the content:
- Static content: ingestWhen: 'never'
- Dynamic content: ingestWhen: 'contentChanged'
- Time-sensitive: ingestWhen: 'expired'
Next Steps
PDF Connector
Ingest PDF documents
GitHub Connector
Ingest from GitHub
Search
Search ingested files