Overview
Routa’s GitHub Virtual Workspace feature allows you to import GitHub repositories for browsing, code review, and analysis without cloning them locally. This is particularly useful for:
- Serverless deployments (Vercel) - No local file system required
- Code review workflows - Inspect PRs and issues without checkout
- Repository exploration - Browse unfamiliar codebases quickly
- Security analysis - Examine code without executing it
How It Works
Extract to temporary directory
The archive is extracted to /tmp/routa-gh/{owner}--{repo} (or in-memory for serverless).
Build file index
A searchable file tree is created, excluding common directories like node_modules and .git.
Architecture
Implementation: src/core/github/github-workspace.ts
Key Features
- Caching: Workspaces are cached in-memory for 1 hour (configurable TTL)
- Serverless compatible: Works on Vercel with zipball download
- Security: Path traversal protection (files must be within workspace)
- Performance: File index built once, reused for searches
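The "built once, reused for searches" behavior can be sketched as lazy memoization. This is a minimal illustration, not Routa's implementation: `FileIndex`, `listFiles`, and `buildCount` are hypothetical names introduced here, and the search is a plain substring match.

```typescript
// Hypothetical sketch: FileIndex and its members are illustrative names, not Routa's API.
class FileIndex {
  private paths: string[] | null = null;
  private readonly listFiles: () => string[];
  // Exposed only so the example can show the index is built a single time.
  public buildCount = 0;

  constructor(listFiles: () => string[]) {
    this.listFiles = listFiles;
  }

  // Build the index on first use, then return the cached copy.
  private ensureBuilt(): string[] {
    if (this.paths === null) {
      this.paths = this.listFiles().slice().sort();
      this.buildCount += 1;
    }
    return this.paths;
  }

  // Case-insensitive substring search over the cached paths.
  search(query: string): string[] {
    const needle = query.toLowerCase();
    return this.ensureBuilt().filter((p) => p.toLowerCase().includes(needle));
  }
}

const index = new FileIndex(() => ["src/index.ts", "README.md", "src/util.ts"]);
const srcHits = index.search("src");
const readmeHits = index.search("readme");
```

Both searches share one index build, so the cost of walking the extracted tree is paid only on the first query.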
Usage
Via REST API
Import a Repository
List Active Workspaces
Get File Tree
Read File
Search Files
Delete Workspace
Via TypeScript SDK
Configuration
Environment Variables
.env
Size Limits
Private Repositories
Provide a GitHub personal access token with repo scope.
Without a token:
- Public repos work fine (subject to rate limits)
- Private repos return 404
Ignored Patterns
Common directories such as node_modules and .git are automatically excluded from the file tree. Dotfiles (e.g., .gitignore, .env) are also excluded from the tree but can still be read via readFile().
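A filter along these lines could decide which paths appear in the tree. It is a sketch under stated assumptions: only node_modules and .git are named on this page, so the `IGNORED_DIRS` set is deliberately incomplete, and `isIndexed` is a hypothetical helper name.

```typescript
// Assumption: only node_modules and .git are named in this document;
// the full ignore list is not shown here.
const IGNORED_DIRS = new Set(["node_modules", ".git"]);

// Returns true when a POSIX-style relative path should appear in the file tree.
function isIndexed(relPath: string): boolean {
  const segments = relPath.split("/").filter((s) => s.length > 0);
  if (segments.length === 0) return false;
  // Skip anything inside (or named as) an ignored directory.
  if (segments.some((s) => IGNORED_DIRS.has(s))) return false;
  // Skip dotfiles such as .gitignore or .env: hidden from the tree, still readable.
  const fileName = segments[segments.length - 1];
  return !fileName.startsWith(".");
}

const tree = [
  "src/index.ts",
  "node_modules/x/index.js",
  ".git/HEAD",
  ".env",
  "docs/.gitignore",
].filter(isIndexed);
```

Filtering only affects the index. A direct read of an excluded dotfile would bypass this check, which matches the behavior described above.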
Use Cases
Code Review Workflow
Review a pull request without checking out the branch:
Issue Triage
Enrich GitHub issues with code context:
Dependency Analysis
Analyze dependencies without cloning:
Security Scanning
Scan for security issues without executing code:
Documentation Generation
Generate docs from source code:
Serverless Deployment (Vercel)
GitHub Virtual Workspace is designed for serverless environments:
- /tmp has limited space (~512MB on Lambda, ~500MB on Vercel)
- Workspaces are not shared across Lambda instances
- TTL-based cleanup prevents disk exhaustion
Caching and Cleanup
In-Memory Cache
Workspaces are cached in a global registry:
TTL Configuration
Default TTL: 1 hour (3600000ms)
Change via environment variable:
Manual Cleanup
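A TTL-based registry with a manual sweep might look like the sketch below. `WorkspaceRegistry`, `sweep`, and the `now` parameters are hypothetical names added for illustration. The TTL is passed to the constructor because the environment variable name is not shown on this page.

```typescript
// Hypothetical sketch: WorkspaceRegistry and its methods are illustrative names.
interface Entry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

class WorkspaceRegistry<T> {
  private readonly entries = new Map<string, Entry<T>>();
  private readonly ttlMs: number;

  // Default TTL of 1 hour (3600000 ms), as stated above.
  constructor(ttlMs: number = 3600000) {
    this.ttlMs = ttlMs;
  }

  set(key: string, value: T, now: number = Date.now()): void {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string, now: number = Date.now()): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (now >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  // Manual cleanup: drop every expired entry and report how many were removed.
  sweep(now: number = Date.now()): number {
    let removed = 0;
    this.entries.forEach((entry, key) => {
      if (now >= entry.expiresAt) {
        this.entries.delete(key);
        removed += 1;
      }
    });
    return removed;
  }
}

const registry = new WorkspaceRegistry<string>(1000);
registry.set("acme/demo", "/tmp/routa-gh/acme--demo", 0);
const fresh = registry.get("acme/demo", 500);
const removedCount = registry.sweep(2000);
```

Passing `now` explicitly keeps the expiry logic testable without waiting for real time to pass. A disk-backed workspace would also need its extracted directory deleted when the entry is dropped.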
Error Handling
Security Considerations
Path Traversal Protection
All file reads are validated to prevent path traversal:
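The check could be sketched as follows. This is a minimal illustration, not the code in github-workspace.ts: `resolveInsideWorkspace` is a hypothetical helper, and it works purely on path strings, so it does not account for symlinks inside an extracted archive.

```typescript
// Hypothetical helper: resolveInsideWorkspace is an illustrative name, not Routa's API.
// Resolves a user-supplied relative path against the workspace root and
// throws if the result would escape it.
function resolveInsideWorkspace(root: string, requested: string): string {
  // Reject absolute paths (POSIX, UNC, or Windows drive-letter style).
  if (/^[\\/]/.test(requested) || /^[A-Za-z]:/.test(requested)) {
    throw new Error(`Absolute paths are not allowed: ${requested}`);
  }
  const resolved: string[] = [];
  for (const segment of requested.split(/[\\/]+/)) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      // Popping past the root means the path escapes the workspace.
      if (resolved.length === 0) {
        throw new Error(`Path escapes workspace: ${requested}`);
      }
      resolved.pop();
      continue;
    }
    resolved.push(segment);
  }
  const base = root.replace(/\/+$/, "");
  return resolved.length === 0 ? base : `${base}/${resolved.join("/")}`;
}

const workspaceRoot = "/tmp/routa-gh/acme--demo";
const safePath = resolveInsideWorkspace(workspaceRoot, "src/../README.md");
```

Normalizing segment by segment rejects inputs such as `../../etc/passwd` and `src/../../x` before any file system call is made.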
Size Limits
Prevent denial-of-service via large repositories:
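One way to enforce such a limit is to count bytes as the archive is downloaded or extracted and abort once the budget is exceeded. The sketch below assumes that approach. `SizeGuard` is a hypothetical name, and `maxSizeMB` mirrors the option named in the troubleshooting section.

```typescript
// Hypothetical sketch: SizeGuard is an illustrative name, not Routa's API.
class SizeGuard {
  private readonly maxBytes: number;
  private seenBytes = 0;

  constructor(maxSizeMB: number) {
    this.maxBytes = maxSizeMB * 1024 * 1024;
  }

  // Call once per downloaded or extracted chunk; throws as soon as the limit is exceeded.
  add(chunkBytes: number): void {
    this.seenBytes += chunkBytes;
    if (this.seenBytes > this.maxBytes) {
      throw new Error(`Repository exceeds size limit of ${this.maxBytes} bytes`);
    }
  }

  // Total bytes counted so far.
  total(): number {
    return this.seenBytes;
  }
}

const guard = new SizeGuard(1); // 1 MB budget
guard.add(512 * 1024);
guard.add(512 * 1024); // exactly at the limit, still allowed
```

Checking incrementally, instead of after extraction completes, stops an oversized archive before it fills /tmp.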
Token Handling
- Never log or expose GitHub tokens
- Use environment variables, not hardcoded values
- Grant minimal scopes (only repo for private repos)
TTL-Based Cleanup
Workspaces expire after 1 hour to:
- Free up disk space
- Prevent stale data
- Reduce attack surface
Performance Optimization
File Index Caching
The file tree is built once and reused:
Fuzzy Search Optimization
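This page does not specify the matching algorithm, so the following is only a sketch of one common approach: case-insensitive subsequence matching with a bonus for consecutive characters. `fuzzyScore` and `fuzzySearch` are hypothetical names, and the scoring weights are arbitrary.

```typescript
// Returns a score when every character of query appears in order in target,
// otherwise null. Higher is better; consecutive matches earn a bonus.
function fuzzyScore(query: string, target: string): number | null {
  const q = query.toLowerCase();
  const t = target.toLowerCase();
  let score = 0;
  let qi = 0;
  let previousMatch = -2;
  for (let ti = 0; ti < t.length && qi < q.length; ti++) {
    if (t[ti] === q[qi]) {
      score += ti === previousMatch + 1 ? 2 : 1;
      previousMatch = ti;
      qi++;
    }
  }
  return qi === q.length ? score : null;
}

// Ranks paths by score and returns at most `limit` matches.
function fuzzySearch(query: string, paths: string[], limit: number = 10): string[] {
  const scored: { path: string; score: number }[] = [];
  for (const path of paths) {
    const score = fuzzyScore(query, path);
    if (score !== null) scored.push({ path, score });
  }
  // Best score first; shorter paths win ties.
  scored.sort((a, b) => b.score - a.score || a.path.length - b.path.length);
  return scored.slice(0, limit).map((s) => s.path);
}

const matches = fuzzySearch("idx", ["src/index.ts", "README.md"]);
```

Capping results with `limit` keeps response sizes bounded on large repositories, since the scan itself is linear in the number of indexed paths.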
Ignored Patterns
Excluding node_modules, .git, etc. reduces:
- Index size (fewer files to search)
- Extraction time (fewer files to write)
- Memory usage (smaller tree)
Troubleshooting
404: Repository not found
- Verify owner/repo are correct
- Check if repo is private (requires GITHUB_TOKEN)
- Ensure token has repo scope
403: Forbidden or rate limited
- Add GITHUB_TOKEN to increase rate limits
- Use a token with sufficient scopes
- Wait for rate limit reset
Repository too large
- Increase maxSizeMB limit
- Import a specific branch/tag instead of default
- Exclude large directories (e.g., docs, examples)
Extraction fails on serverless
- Check /tmp space usage
- Reduce maxSizeMB limit
- Clean up expired workspaces manually
File not found after import
- Check if file is in ignored patterns
- Verify file exists in the specified ref
- Use workspace.exists() before readFile()
Next Steps
- Custom Specialists: Create code review specialists
- Workflows: Automate GitHub workspace operations
- Web Deployment: Deploy on Vercel with GitHub integration
- GitHub API: GitHub REST API documentation