The Four Strategies
1. Filtering
Remove noise (comments, whitespace, boilerplate) while preserving structure
2. Grouping
Aggregate similar items (files by directory, errors by type)
3. Truncation
Keep relevant context, cut redundancy (first/last lines, signatures only)
4. Deduplication
Collapse repeated patterns with counts (“[ERROR] … (×5)”)
Strategy Matrix
Different commands use different strategies:

| Strategy | Used By | Technique | Reduction |
|---|---|---|---|
| Stats Extraction | git status, git log, pnpm | Count/aggregate, drop details | 90-99% |
| Error Only | runner (err mode) | stderr only, drop stdout | 60-80% |
| Grouping by Pattern | lint, tsc, grep | Group by rule/file/error code | 80-90% |
| Deduplication | log_cmd | Unique + count | 70-85% |
| Structure Only | json_cmd | Keys + types, strip values | 80-95% |
| Code Filtering | read, smart | Filter by level (none/minimal/aggressive) | 0-90% |
| Failure Focus | vitest, playwright, runner | Failures only, hide passing | 94-99% |
| Tree Compression | ls | Hierarchy with counts | 50-70% |
| Progress Filtering | wget, pnpm install | Strip ANSI, final result only | 85-95% |
| JSON/Text Dual | ruff, pip | JSON when available, text fallback | 80%+ |
| State Machine | pytest | Track test state, extract failures | 90%+ |
| NDJSON Streaming | go test | Line-by-line JSON parse | 90%+ |
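The State Machine row is probably the least obvious entry in the matrix, so here is a minimal sketch of the idea in Python. It is an illustration only, not RTK's Rust implementation: the names `State`, `compress_pytest`, and `SAMPLE` are hypothetical, and the sketch assumes pytest's default text layout (`=` banners for sections, `_` banners for failing test names, `E` lines for assertion details).

```python
import re
from enum import Enum, auto


class State(Enum):
    HEADER = auto()    # session header and progress dots
    FAILURES = auto()  # inside the "FAILURES" section
    SUMMARY = auto()   # inside "short test summary info"


BANNER = re.compile(r"^=+ (.+?) =+$")     # e.g. "===== FAILURES ====="
TEST_NAME = re.compile(r"^_+ (.+?) _+$")  # e.g. "_____ test_two _____"
TALLY = re.compile(r"\d+ (passed|failed|error)")


def compress_pytest(output: str) -> list[str]:
    """Keep failing test names, their 'E' lines, and the final tally."""
    state = State.HEADER
    kept = []
    for line in output.splitlines():
        banner = BANNER.match(line)
        if banner:
            title = banner.group(1)
            if title == "FAILURES":
                state = State.FAILURES
            elif title == "short test summary info":
                state = State.SUMMARY  # repeats the failures, so stop capturing
            elif TALLY.search(title):
                kept.append(title)     # e.g. "1 failed, 2 passed in 0.03s"
            continue
        if state is State.FAILURES:
            name = TEST_NAME.match(line)
            if name:
                kept.append(f"FAIL {name.group(1)}")
            elif line.startswith("E "):
                kept.append("  " + line[1:].strip())
    return kept


SAMPLE = """\
===== test session starts =====
collected 3 items

test_a.py .F.                [100%]

===== FAILURES =====
_____ test_two _____

    def test_two():
>       assert 1 == 2
E       assert 1 == 2

test_a.py:5: AssertionError
===== short test summary info =====
FAILED test_a.py::test_two - assert 1 == 2
===== 1 failed, 2 passed in 0.03s =====
"""

print("\n".join(compress_pytest(SAMPLE)))
```

The current state decides how each line is interpreted: an `E` line only counts while the parser is inside the FAILURES section.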
Language-Aware Filtering
RTK’s filter.rs module provides language-aware code filtering with three levels:
Filter Levels
None (0% reduction)
Keep everything—raw file content.
Minimal (20-40% reduction)
Strip comments and normalize whitespace. Keep structure and code.
Aggressive (60-90% reduction)
Strip comments and function bodies. Keep only signatures.
Language Support
RTK detects languages by file extension:

| Language | Extensions | Comment Syntax |
|---|---|---|
| Rust | .rs | //, /* */, /// |
| Python | .py, .pyw | #, """ |
| JavaScript | .js, .mjs, .cjs | //, /* */ |
| TypeScript | .ts, .tsx | //, /* */ |
| Go | .go | //, /* */ |
| C/C++ | .c, .cpp, .h, .hpp | //, /* */ |
| Java | .java | //, /* */ |
| Ruby | .rb | #, =begin/=end |
| Shell | .sh, .bash, .zsh | # |
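As a rough illustration of the minimal level, the Python sketch below picks a comment marker from the file extension and drops whole-line comments and blank lines. It is not RTK's filter.rs: `LINE_COMMENT` and `minimal_filter` are hypothetical names, only a subset of the table is covered, and block comments (`/* */`, `"""`, `=begin/=end`) and trailing comments are deliberately left out to keep the example short.

```python
from pathlib import Path

# Hypothetical subset of the table above: extension -> line-comment marker.
LINE_COMMENT = {
    ".rs": "//", ".js": "//", ".ts": "//", ".go": "//", ".c": "//", ".java": "//",
    ".py": "#", ".rb": "#", ".sh": "#",
}


def minimal_filter(filename: str, source: str) -> str:
    """Drop whole-line comments and blank lines; keep code and indentation."""
    marker = LINE_COMMENT.get(Path(filename).suffix.lower())
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # normalize whitespace: no blank lines
        is_comment = marker is not None and stripped.startswith(marker)
        if is_comment and not stripped.startswith("#!"):
            continue  # shebang lines are kept
        kept.append(line.rstrip())
    return "\n".join(kept)


rust_source = "// doc\nfn main() {\n\n    // inner\n    run();\n}\n"
print(minimal_filter("main.rs", rust_source))
```

Files with an unknown extension pass through with only blank lines removed, which keeps the filter safe for content it does not understand.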
Usage Examples
When to use aggressive? When LLMs need to understand code structure but not implementation details. Perfect for “what functions exist?” queries.
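To make "signatures only" concrete, here is a small sketch for Python sources using the standard `ast` module. It only illustrates the aggressive level; RTK's own filter is written in Rust and may work differently, and `signatures_only` is a hypothetical name.

```python
import ast


def signatures_only(source: str) -> list[str]:
    """Return class and function signatures, dropping bodies and comments."""
    nodes = [
        node for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    lines = []
    for node in sorted(nodes, key=lambda n: n.lineno):  # restore source order
        indent = " " * node.col_offset
        if isinstance(node, ast.ClassDef):
            lines.append(f"{indent}class {node.name}:")
        else:
            keyword = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            lines.append(f"{indent}{keyword} {node.name}({ast.unparse(node.args)}): ...")
    return lines


SRC = '''
class Auth:
    def login(self, user, password):
        # check credentials
        return True

def helper(x):
    return x * 2
'''

print("\n".join(signatures_only(SRC)))
```

The result answers "what functions exist?" without spending tokens on any implementation.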
Command-Specific Strategies
Git Operations
git status (Stats Extraction)
Raw output: 50 lines, ~800 tokens. RTK output: 1 line, ~20 tokens.
Strategy:
- Count modified files: 3
- Count untracked files: 1
- Aggregate: “3 modified, 1 untracked”
- Token savings: 97%
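The counting step can be sketched as follows. This assumes the input is the text printed by `git status --porcelain` (two status characters, a space, then the path); whether RTK reads that format is an assumption, and `classify` and `summarize_status` are hypothetical names.

```python
from collections import Counter

ORDER = ("modified", "added", "deleted", "renamed", "untracked", "other")


def classify(code: str) -> str:
    """Map a two-character porcelain status code to a coarse label."""
    if code == "??":
        return "untracked"
    for letter, label in (("M", "modified"), ("A", "added"),
                          ("D", "deleted"), ("R", "renamed")):
        if letter in code:
            return label
    return "other"


def summarize_status(porcelain: str) -> str:
    """Collapse per-file status lines into one aggregate line."""
    counts = Counter(
        classify(line[:2]) for line in porcelain.splitlines() if len(line) > 3
    )
    if not counts:
        return "clean"
    return ", ".join(f"{counts[label]} {label}" for label in ORDER if counts[label])


sample = " M src/a.rs\n M src/b.rs\nM  src/c.rs\n?? notes.txt\n"
print(summarize_status(sample))
```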
git diff (Stats + Compact)
Raw output: 200+ lines, ~3000 tokens. RTK output: ~30 lines, ~500 tokens.
Strategy:
- Extract stats: +142/-89
- Group by file
- Show summary per file
- Token savings: 83%
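A minimal sketch of the stats-plus-per-file summary, assuming the input comes from `git diff --numstat` (tab-separated added count, deleted count, and path, with `-` for binary files). How RTK obtains its numbers is not stated here, and `summarize_numstat` is a hypothetical name.

```python
def summarize_numstat(numstat: str) -> list[str]:
    """Return a totals line followed by one short line per changed file."""
    total_added = total_deleted = 0
    per_file = []
    for row in numstat.splitlines():
        parts = row.split("\t", 2)
        if len(parts) != 3:
            continue  # ignore anything that is not a numstat row
        added, deleted, path = parts
        if added == "-":  # git prints "-" for binary files
            per_file.append(f"{path}: binary")
            continue
        total_added += int(added)
        total_deleted += int(deleted)
        per_file.append(f"{path}: +{added}/-{deleted}")
    header = f"{len(per_file)} files changed, +{total_added}/-{total_deleted}"
    return [header] + per_file


sample = "100\t50\tsrc/auth.rs\n42\t39\tsrc/db.rs\n-\t-\tlogo.png\n"
print("\n".join(summarize_numstat(sample)))
```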
git log (One-line summaries)
Raw output: 50 lines, ~1000 tokens. RTK output: 5 lines, ~100 tokens.
Strategy:
- Count commits: 5
- Extract total stats: +142/-89
- Show first line of each commit message
- Token savings: 90%
Testing
vitest/playwright (Failure Focus)
Raw output: 200+ lines, ~4000 tokens. RTK output: ~10 lines, ~200 tokens.
Strategy:
- Hide passing tests (17 passed → omitted)
- Show only failures (2 failed)
- Include error messages for debugging
- Token savings: 95%
cargo test / go test (Failure Focus + NDJSON)
Raw output: 100+ lines, ~2000 tokens. RTK output: ~5 lines, ~100 tokens.
Strategy (Rust):
- Hide passing tests (14 passed → omitted)
- Extract failure details (panic message, file:line)
- Token savings: 95%
Strategy (Go):
- Parse line-by-line JSON events
- Track test state per package
- Aggregate failures only
- Token savings: 90%
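The line-by-line approach for Go can be sketched like this. The event fields (`Action`, `Package`, `Test`, `Output`) are the ones emitted by `go test -json`; the function name `go_test_failures` and the report layout are hypothetical, and this is a Python illustration rather than RTK's actual parser.

```python
import json
from collections import defaultdict


def go_test_failures(stream) -> list[str]:
    """Consume NDJSON test events and report only the failing tests."""
    output = defaultdict(list)  # (package, test) -> buffered output lines
    passed = 0
    failures = []
    for raw in stream:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # non-JSON lines are skipped in this sketch
        test = event.get("Test")
        if test is None:
            continue  # package-level event
        key = (event.get("Package", ""), test)
        action = event.get("Action")
        if action == "output":
            output[key].append(event.get("Output", ""))
        elif action == "pass":
            passed += 1
            output.pop(key, None)  # passing output is discarded
        elif action == "fail":
            failures.append((key, output.pop(key, [])))
    report = [f"{len(failures)} failed, {passed} passed"]
    for (package, test), lines in failures:
        report.append(f"FAIL {package} {test}")
        for line in lines:
            text = line.strip()
            if text and not text.startswith(("=== ", "--- ")):
                report.append(f"  {text}")
    return report


events = [
    {"Action": "run", "Package": "pkg/a", "Test": "TestOk"},
    {"Action": "run", "Package": "pkg/b", "Test": "TestBad"},
    {"Action": "output", "Package": "pkg/a", "Test": "TestOk", "Output": "=== RUN   TestOk\n"},
    {"Action": "output", "Package": "pkg/b", "Test": "TestBad", "Output": "    b_test.go:10: expected 1, got 2\n"},
    {"Action": "pass", "Package": "pkg/a", "Test": "TestOk"},
    {"Action": "output", "Package": "pkg/b", "Test": "TestBad", "Output": "--- FAIL: TestBad (0.00s)\n"},
    {"Action": "fail", "Package": "pkg/b", "Test": "TestBad"},
    {"Action": "fail", "Package": "pkg/b"},
]
print("\n".join(go_test_failures(json.dumps(e) for e in events)))
```

Keying the buffer by package and test name is what lets events from different packages arrive interleaved without mixing their output.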
Linting
ESLint/TSC/Ruff (Grouping by Rule)
Raw output: 150 lines, ~2500 tokens. RTK output: ~15 lines, ~300 tokens.
Strategy:
- Parse error lines (regex: file:line:col rule)
- Group by rule (no-unused-vars: 23, semi: 45, …)
- Group by file (auth.ts: 8, db.ts: 15, …)
- Token savings: 88%
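The grouping steps above might look like the following sketch. The input format is an assumption: each problem is taken to be a line of the form `file:line:col rule`, which real linters only approximate, and `LINT_LINE` and `group_lint` are hypothetical names.

```python
import re
from collections import Counter

# Assumed line shape: "src/auth.ts:12:5 no-unused-vars"
LINT_LINE = re.compile(r"^(?P<file>[^:\s]+):(?P<line>\d+):(?P<col>\d+)\s+(?P<rule>\S+)")


def group_lint(output: str) -> list[str]:
    """Replace one line per problem with counts per rule and per file."""
    by_rule, by_file = Counter(), Counter()
    for line in output.splitlines():
        match = LINT_LINE.match(line.strip())
        if not match:
            continue
        by_rule[match["rule"]] += 1
        by_file[match["file"]] += 1

    def render(counter: Counter) -> str:
        return ", ".join(f"{name}: {count}" for name, count in counter.most_common())

    total = sum(by_rule.values())
    return [f"{total} problems", f"by rule: {render(by_rule)}", f"by file: {render(by_file)}"]


sample = "src/auth.ts:1:1 semi\nsrc/auth.ts:2:5 no-unused-vars\nsrc/db.ts:9:3 semi\n"
print("\n".join(group_lint(sample)))
```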
Logs & Data
Logs (Deduplication)
Raw output: 1000+ lines, ~20000 tokens. RTK output: ~5 lines, ~100 tokens.
Strategy:
- Identify repeated lines (exact match)
- Collapse with counts: “(×127)”
- Keep first occurrence + count
- Token savings: 99%
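Exact-match deduplication is short enough to show in full. The sketch below is a Python illustration with a hypothetical function name, `dedupe`; it relies on dictionaries preserving insertion order so that each line appears at the position of its first occurrence.

```python
def dedupe(lines: list[str]) -> list[str]:
    """Collapse exact repeats into first occurrence plus a count."""
    counts: dict[str, int] = {}
    for line in lines:
        counts[line] = counts.get(line, 0) + 1
    return [
        line if count == 1 else f"{line} (×{count})"
        for line, count in counts.items()
    ]


log = [
    "[ERROR] db timeout",
    "[INFO] started",
    "[ERROR] db timeout",
    "[ERROR] db timeout",
]
print("\n".join(dedupe(log)))
```

Because matching is exact, lines that differ only by a timestamp will not collapse; stripping or masking timestamps first would be a separate step.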
JSON (Structure Extraction)
Raw output: 500 lines, ~10000 tokens. RTK output: ~10 lines, ~200 tokens.
Strategy:
- Parse JSON structure
- Extract keys + types
- Count array lengths
- Strip values (especially large strings/base64)
- Token savings: 98%
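A minimal sketch of structure extraction: walk the parsed document, keep keys, and replace every value with its type name. The function name `shape` and the output layout are hypothetical, and the sketch assumes arrays are homogeneous, so it describes only the first element alongside the array length.

```python
import json


def shape(value):
    """Replace values with type names; keep keys and array lengths."""
    if isinstance(value, dict):
        return {key: shape(inner) for key, inner in value.items()}
    if isinstance(value, list):
        return [shape(value[0]), f"(×{len(value)})"] if value else []
    if isinstance(value, bool):  # checked before numbers: bool is an int subclass
        return "bool"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return "null"


doc = json.loads(
    '{"id": 7, "name": "x", "tags": ["a", "b", "c"],'
    ' "owner": {"active": true, "avatar": null}}'
)
print(json.dumps(shape(doc), indent=2, ensure_ascii=False))
```

A multi-kilobyte base64 string and a one-character string both reduce to `"string"`, which is where most of the savings come from.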
Advanced Patterns
State Machine Parsing (pytest)
Pytest has no built-in JSON output mode, so RTK uses a state machine to parse the text output.
NDJSON Streaming (go test)
With the -json flag, Go’s test runner outputs newline-delimited JSON, with events from different packages interleaved.
Package Manager Detection (JS/TS)
Modern JS/TS commands auto-detect package managers:
- CWD preservation: pnpm/yarn exec preserve working directory
- Monorepo support: Works in nested package.json structures
- No global installs: Uses project-local dependencies only
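How the detection works is not described here. One common approach, shown below purely as an assumption, is to look for a lockfile in the current directory and its parents, which also covers nested packages in a monorepo. The lockfile names are the standard ones for each tool; `detect_package_manager` and the `npm` fallback are hypothetical.

```python
import tempfile
from pathlib import Path

# Standard lockfile names, checked in this order within each directory.
LOCKFILES = (
    ("pnpm-lock.yaml", "pnpm"),
    ("yarn.lock", "yarn"),
    ("package-lock.json", "npm"),
)


def detect_package_manager(start, default: str = "npm") -> str:
    """Walk upward from `start` and return the manager of the nearest lockfile."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        for lockfile, manager in LOCKFILES:
            if (directory / lockfile).is_file():
                return manager
    return default


# Demo: a monorepo root with a pnpm lockfile and a nested package.
with tempfile.TemporaryDirectory() as root:
    nested = Path(root) / "packages" / "app"
    nested.mkdir(parents=True)
    (Path(root) / "pnpm-lock.yaml").write_text("")
    detected = detect_package_manager(nested)

print(detected)
```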
Choosing the Right Strategy
Determine information density
High density (code) → filtering. Low density (test results) → failure focus.
Best Practices
Prioritize structure
Keep structure (function signatures, file paths) and drop details (implementations, values)
Focus on failures
LLMs need to see errors, not successes. Hide passing tests, show only failures.
Use JSON when available
Structured formats (JSON, NDJSON) are easier to parse and compress than text.
Preserve exit codes
Always propagate exit codes for CI/CD reliability. Filter output, not behavior.
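"Filter output, not behavior" can be sketched as a wrapper that rewrites what the child process printed but hands back its exit code untouched. This is a Python illustration with a hypothetical helper, `run_filtered`; the child command here is just a stand-in that prints two lines and exits with status 3.

```python
import subprocess
import sys


def run_filtered(cmd: list[str], keep) -> tuple[int, str]:
    """Run `cmd`, keep only lines accepted by `keep`, and return the exit code."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    lines = (proc.stdout + proc.stderr).splitlines()
    filtered = "\n".join(line for line in lines if keep(line))
    return proc.returncode, filtered


# Stand-in child process: prints two lines, then fails with exit code 3.
child = [
    sys.executable, "-c",
    "import sys; print('ok'); print('ERROR boom'); sys.exit(3)",
]
code, text = run_filtered(child, keep=lambda line: "ERROR" in line)
print(text)
# A real wrapper would end with sys.exit(code) so CI sees the original status.
```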
