GitHub Connector
The GitHub connector provides access to GitHub content, including individual files, release notes, and entire repositories.
Import
GitHub File
Ingest a single file from a GitHub repository:
Example
File Path Format
The path should have the form: owner/repo/path/to/file
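As an illustration of the path format, here is a minimal sketch that splits such a string into its parts. The function name `parseGitHubFilePath` and the `GitHubFileRef` type are hypothetical and not part of the connector; the sketch only assumes the owner/repo/path/to/file layout described above.

```typescript
// Hypothetical helper: not the connector's API, just an illustration
// of the owner/repo/path/to/file format.
interface GitHubFileRef {
  owner: string;
  repo: string;
  filePath: string;
}

function parseGitHubFilePath(input: string): GitHubFileRef {
  // Drop empty segments so leading or doubled slashes do not produce blanks.
  const parts = input.split("/").filter((p) => p.length > 0);
  if (parts.length < 3) {
    throw new Error(`Expected owner/repo/path/to/file, got "${input}"`);
  }
  const [owner, repo, ...rest] = parts;
  // Everything after the second segment is the path inside the repository.
  return { owner, repo, filePath: rest.join("/") };
}
```

A string with fewer than three segments is rejected, since it cannot name a file inside a repository.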
Source ID
GitHub Releases
Ingest release notes from a GitHub repository:
Basic Usage
Options
Stop at Specific Release
Include Drafts and Prereleases
Release Document Format
Each release is ingested as a document with:
Example: Search Releases
Source ID
GitHub Repository
Ingest an entire repository using gitingest:
Basic Usage
Options
Include Patterns
Specify files to include:
Exclude Patterns
Custom exclusions (the defaults exclude common directories):
Default Excludes
By default, these are excluded:
- **/node_modules/**
- **/dist/**
- **/coverage/**
- **/*.test.ts and **/*.test.tsx
- **/.git/**
- **/.github/**
- **/.vscode/**
- **/build/**
- **/__tests__/**
- **/*.d.ts
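To show how the default exclude patterns apply to file paths, here is a minimal sketch with a simplified glob matcher. It is not the matcher that gitingest or the connector uses; it handles only `**` and `*`, and the names `globToRegExp` and `isExcluded` are hypothetical.

```typescript
// The default exclude patterns listed in this section.
const DEFAULT_EXCLUDES: string[] = [
  "**/node_modules/**", "**/dist/**", "**/coverage/**",
  "**/*.test.ts", "**/*.test.tsx",
  "**/.git/**", "**/.github/**", "**/.vscode/**",
  "**/build/**", "**/__tests__/**", "**/*.d.ts",
];

// Simplified glob-to-regex conversion (assumption: only "**" and "*" are special).
function globToRegExp(glob: string): RegExp {
  let re = "";
  let i = 0;
  while (i < glob.length) {
    if (glob.startsWith("**/", i)) {
      re += "(?:.*/)?"; // zero or more leading directories
      i += 3;
    } else if (glob.startsWith("**", i)) {
      re += ".*"; // anything, including slashes
      i += 2;
    } else if (glob[i] === "*") {
      re += "[^/]*"; // anything within one path segment
      i += 1;
    } else {
      // Escape regex metacharacters in literal text.
      re += glob[i].replace(/[.+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp(`^${re}$`);
}

// True when the repository-relative path matches any exclude pattern.
function isExcluded(path: string, patterns: string[] = DEFAULT_EXCLUDES): boolean {
  return patterns.some((p) => globToRegExp(p).test(path));
}
```

Under this sketch, `packages/a/node_modules/x/index.js` and `src/button.test.tsx` are excluded, while `src/index.ts` is kept.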
Specify Branch
Private Repositories
Use a GitHub token for private repos:
Repository URL Formats
Supported URL formats:
Ingestion Strategy
Source ID
How It Works
The repository connector:
- Uses gitingest via uvx to generate a markdown digest
- Applies include/exclude patterns
- Respects gitignore files (unless includeGitignored: true)
- Creates a single document containing the repository content
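The steps above can be sketched as a function that builds the argument list for a `uvx gitingest` invocation. This is an illustration under stated assumptions, not the connector's actual implementation: the option names other than `includeGitignored` are hypothetical, and the CLI flags (`--output`, `--include-pattern`, `--exclude-pattern`, `--branch`, `--include-gitignored`) reflect the gitingest command line as best understood here and should be checked against `gitingest --help`.

```typescript
// Hypothetical option shape; only includeGitignored is named in this document.
interface RepoDigestOptions {
  include?: string[];
  exclude?: string[];
  branch?: string;
  includeGitignored?: boolean;
}

// Builds the arguments to pass to `uvx`. Flag names are assumptions to
// verify against the installed gitingest version.
function buildGitingestArgs(
  repoUrl: string,
  outFile: string,
  opts: RepoDigestOptions = {},
): string[] {
  const args = ["gitingest", repoUrl, "--output", outFile];
  for (const p of opts.include ?? []) args.push("--include-pattern", p);
  for (const p of opts.exclude ?? []) args.push("--exclude-pattern", p);
  if (opts.branch) args.push("--branch", opts.branch);
  if (opts.includeGitignored) args.push("--include-gitignored");
  return args;
}
```

The resulting array could be handed to `execFile("uvx", args)` from `node:child_process`. Passing arguments as an array avoids shell quoting problems with glob patterns.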
Implementation Details
File Fetching
Files are fetched via the GitHub API:
Release Pagination
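Release pagination, with the page size and page cap this section states (100 per page, at most 10 pages), can be sketched as follows. This is not the connector's source: `listReleases`, `githubPageFetcher`, and the `PageFetcher` type are hypothetical names. The sketch assumes a runtime with a global `fetch` (for example Node 18+) and uses GitHub's REST "list releases" endpoint.

```typescript
// Fields used here from GitHub's "list releases" response.
interface Release {
  tag_name: string;
  name: string | null;
  body: string | null;
  draft: boolean;
  prerelease: boolean;
}

type PageFetcher = (page: number, perPage: number) => Promise<Release[]>;

const PER_PAGE = 100;
const MAX_PAGES = 10;

// Collects releases page by page, stopping at a short page or the page cap.
async function listReleases(fetchPage: PageFetcher): Promise<Release[]> {
  const all: Release[] = [];
  for (let page = 1; page <= MAX_PAGES; page++) {
    const batch = await fetchPage(page, PER_PAGE);
    all.push(...batch);
    if (batch.length < PER_PAGE) break; // a short page means there are no more
  }
  return all;
}

// Page fetcher backed by the GitHub REST API; the token is optional.
function githubPageFetcher(owner: string, repo: string, token?: string): PageFetcher {
  return async (page, perPage) => {
    const url =
      `https://api.github.com/repos/${owner}/${repo}/releases` +
      `?per_page=${perPage}&page=${page}`;
    const headers: Record<string, string> = { Accept: "application/vnd.github+json" };
    if (token) headers.Authorization = `Bearer ${token}`;
    const res = await fetch(url, { headers });
    if (!res.ok) throw new Error(`GitHub API returned ${res.status}`);
    return (await res.json()) as Release[];
  };
}
```

Injecting the page fetcher keeps the loop testable without network access. With these limits, at most 1,000 releases are collected per repository.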
Releases are paginated (100 per page, max 10 pages):
Repository Processing
Repositories use gitingest:
Examples
Ingest Multiple Files
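The connector's own call for this example is not reproduced here. As a stand-in, the following sketch fetches several files directly from the GitHub REST contents API, using the owner/repo/path/to/file format. The names `contentsApiUrl` and `fetchFiles` are hypothetical, and a runtime with a global `fetch` (for example Node 18+) is assumed.

```typescript
// Builds the contents API URL for an "owner/repo/path/to/file" string.
function contentsApiUrl(spec: string): string {
  const [owner, repo, ...rest] = spec.split("/");
  if (!owner || !repo || rest.length === 0) {
    throw new Error(`Expected owner/repo/path/to/file, got "${spec}"`);
  }
  // Encode each path segment separately so slashes stay as separators.
  const path = rest.map((segment) => encodeURIComponent(segment)).join("/");
  return `https://api.github.com/repos/${owner}/${repo}/contents/${path}`;
}

// Fetches all files concurrently and returns a map from spec to file text.
async function fetchFiles(specs: string[], token?: string): Promise<Map<string, string>> {
  // The raw media type asks GitHub for the file body instead of JSON metadata.
  const headers: Record<string, string> = { Accept: "application/vnd.github.raw+json" };
  if (token) headers.Authorization = `Bearer ${token}`;
  const entries = await Promise.all(
    specs.map(async (spec) => {
      const res = await fetch(contentsApiUrl(spec), { headers });
      if (!res.ok) throw new Error(`${spec}: GitHub API returned ${res.status}`);
      return [spec, await res.text()] as const;
    }),
  );
  return new Map(entries);
}
```

Each file costs one API request, so a large batch without a token can quickly exhaust the unauthenticated rate limit.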
Search Across Releases
Ingest Documentation
Rate Limiting
The GitHub API has rate limits:
- Unauthenticated: 60 requests/hour
- Authenticated: 5,000 requests/hour
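One way to stay within these limits is to read the rate-limit headers GitHub returns with each REST response (`x-ratelimit-limit`, `x-ratelimit-remaining`, and `x-ratelimit-reset`, the last given in UTC epoch seconds). The sketch below is an illustration only; `parseRateLimit` and `msUntilRetry` are hypothetical helpers, not part of the connector.

```typescript
interface RateLimitInfo {
  limit: number;
  remaining: number;
  resetAt: Date;
}

// Accepts anything with a Headers-like get(), such as a fetch Response's headers.
function parseRateLimit(headers: { get(name: string): string | null }): RateLimitInfo | null {
  const limit = headers.get("x-ratelimit-limit");
  const remaining = headers.get("x-ratelimit-remaining");
  const reset = headers.get("x-ratelimit-reset");
  if (limit === null || remaining === null || reset === null) return null;
  return {
    limit: Number(limit),
    remaining: Number(remaining),
    resetAt: new Date(Number(reset) * 1000), // header is in epoch seconds
  };
}

// Milliseconds to wait before the next request; 0 while quota remains.
function msUntilRetry(info: RateLimitInfo, now: Date = new Date()): number {
  if (info.remaining > 0) return 0;
  return Math.max(0, info.resetAt.getTime() - now.getTime());
}
```

A caller could pass `response.headers` to `parseRateLimit` after each request and sleep for `msUntilRetry(...)` milliseconds when the quota is exhausted.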
Error Handling
Best Practices
Use Specific Include Patterns
For repositories, be specific about what to include:
Use ingestWhen: 'never' to avoid re-ingesting large repos:
Use untilTag to limit release ingestion:
Next Steps
- RSS Connector: Ingest RSS feeds
- Local Files: Work with local files
- Ingestion: Learn about ingestion