Overview
Basic filtering operations allow you to clean up your repository by removing unwanted files, extracting specific paths, or restructuring your project layout. These are the most common use cases for git-filter-repo.
Path-Based Filtering
Keep Specific Files or Directories
To keep only specific paths in your repository history:
Single File
git filter-repo --path README.md
All file paths are relative to the repository root.
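The same option can be repeated to keep several paths in one run, and a trailing slash restricts a path to a directory. The sketch below is an illustration only: it builds a throwaway repository whose file names (README.md, docs/guide.md, src/main.c) are hypothetical, and it passes --force solely because a freshly initialized scratch repository is not a fresh clone.

```shell
#!/bin/sh
# Throwaway repository with hypothetical files, for illustration only.
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo" || exit 1
mkdir -p docs src
echo "readme" > README.md
echo "guide"  > docs/guide.md
echo "main"   > src/main.c
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

if git filter-repo -h >/dev/null 2>&1; then
  # Multiple paths: keep README.md plus everything under docs/.
  # The trailing slash makes docs/ match the directory only.
  git filter-repo --force --path README.md --path docs/
  # List what survived the rewrite.
  git ls-files
else
  echo "git-filter-repo not installed; skipping the filtering step"
fi
```

On a real project, run the same command in a fresh clone and leave out --force.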
Remove Specific Files
Use --invert-paths to keep everything except the specified paths:
git filter-repo --path secrets.txt --path config/passwords.yml --invert-paths
This removes secrets.txt and config/passwords.yml from all history.
Combining Include and Exclude
Run filter-repo multiple times to combine inclusion and exclusion filters:
# First, keep only src/main/
git filter-repo --path src/main/
# Then, exclude test data files
git filter-repo --path-glob 'src/*/test-data' --invert-paths
Pattern Matching
Glob Patterns
Use glob patterns for flexible matching:
Extension Match
# Keep all Python files
git filter-repo --path-glob '*.py'
Quote glob patterns to prevent shell expansion.
Regular Expressions
For more complex matching, use regex patterns:
# Keep only files with YYYY-MM-DD.txt format at least 2 directories deep
git filter-repo --path-regex '^.*/.*/[0-9]{4}-[0-9]{2}-[0-9]{2}\.txt$'
Patterns follow Python's regular expression syntax; see the Python re module documentation for details.
Base Name Matching
Match on filename only (not full path):
# Keep all README.md and Makefile files anywhere in the repository
git filter-repo --use-base-name --path README.md --path Makefile
--use-base-name is incompatible with --path-rename and directory matching.
Subdirectory Operations
Make a subdirectory the new repository root:
project/
  module/
    foo.c
    bar.c
  other/
    data.txt
  README.md
git filter-repo --subdirectory-filter module/
This is equivalent to:
git filter-repo --path module/ --path-rename module/:
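To see the effect on a concrete layout, the following sketch builds a scratch repository with a module/ directory, an other/ directory, and a top-level README.md, then applies the filter. The scratch location, file contents, and commit identity are assumptions made for illustration; --force appears only because a freshly initialized repository is not a fresh clone.

```shell
#!/bin/sh
# Build an example layout in a scratch repository (illustration only).
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1
mkdir -p module other
echo "int foo;" > module/foo.c
echo "int bar;" > module/bar.c
echo "data"     > other/data.txt
echo "readme"   > README.md
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial layout"

if git filter-repo -h >/dev/null 2>&1; then
  # module/ becomes the repository root; other/ and README.md are dropped.
  git filter-repo --force --subdirectory-filter module/
  # foo.c and bar.c should now sit at the top level.
  git ls-files
else
  echo "git-filter-repo not installed; skipping the filtering step"
fi
```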
Move Everything to Subdirectory
Place all repository contents under a new directory:
git filter-repo --to-subdirectory-filter my-module/
my-module/
  src/
    main.c
  README.md
This is useful when preparing to merge repositories.
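One possible merge workflow is sketched below. It is not the only approach: the two scratch repositories (app and my-module), the remote name module, and the commit identity are all hypothetical, and the merge itself uses plain git commands rather than anything specific to filter-repo.

```shell
#!/bin/sh
# Two scratch repositories stand in for real projects (names are hypothetical).
work=$(mktemp -d)
cd "$work" || exit 1
for name in app my-module; do
  git init -q "$name"
  echo "$name" > "$name/README.md"
  git -C "$name" add README.md
  git -C "$name" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "initial commit of $name"
done

if git filter-repo -h >/dev/null 2>&1; then
  # Move my-module's whole history under my-module/ so paths cannot collide.
  git -C my-module filter-repo --force --to-subdirectory-filter my-module/
  branch=$(git -C my-module symbolic-ref --short HEAD)

  # Fetch the rewritten history into app and merge the unrelated histories.
  git -C app remote add module "$work/my-module"
  git -C app fetch -q module
  git -C app -c user.name=demo -c user.email=demo@example.com \
    merge -q --allow-unrelated-histories -m "Merge my-module" "module/$branch"
  git -C app ls-files
else
  echo "git-filter-repo not installed; skipping the merge sketch"
fi
```

Because every file from my-module was moved under its own directory first, the merge cannot produce path conflicts with files already in app.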
Path Shortcuts
Paths from File
For extensive filtering, list all paths in a file:
# Comments and blank lines are ignored
README.md
docs/
src/core/
# Glob pattern
glob:*.py
# Regex pattern
regex:^tests/.*test_.*\.py$
# Path rename
old-name/==>new-name/
Prefix lines with:
literal: (default) for exact matches
glob: for glob patterns
regex: for regular expressions
Use ==> to specify renames
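Putting the format together, the sketch below writes a small paths file and counts its entries as a sanity check. The temporary file and the entries in it are hypothetical; the file is kept outside the repository so that it is not affected by the rewrite.

```shell
#!/bin/sh
# Write a hypothetical paths file outside the repository being filtered.
paths_file=$(mktemp)
cat > "$paths_file" <<'EOF'
# Literal paths (the default)
README.md
docs/
# Glob pattern
glob:*.py
# Regex pattern
regex:^tests/.*test_.*\.py$
# Rename old-name/ to new-name/
old-name/==>new-name/
EOF

# Count the entries that are neither comments nor blank lines.
grep -cv -e '^#' -e '^$' "$paths_file"

# Inside a fresh clone, apply the file with:
#   git filter-repo --paths-from-file "$paths_file"
```

The quoted heredoc delimiter ('EOF') keeps the shell from expanding the $ at the end of the regex line.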
Generate Paths from Current Files
Keep only the files that are tracked in the current checkout, removing files that exist only in earlier history:
git ls-files > ../keep-files.txt
git filter-repo --paths-from-file ../keep-files.txt
Analysis Before Filtering
Analyze your repository to help decide what to filter:
git filter-repo --analyze
This creates reports in .git/filter-repo/analysis/:
blob-shas-and-paths.txt - Files by size
path-deleted-sizes.txt - Previously deleted large files
extensions-all-sizes.txt - Size by file extension
directories-all-sizes.txt - Size by directory
renames.txt - File rename history
Run --analyze both before and after filtering to verify results.
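A small helper can make the reports quicker to skim. The function below is a hypothetical convenience wrapper, not part of git-filter-repo: it assumes it is called from the top level of the repository after --analyze has run, and its default of 20 lines is an arbitrary choice.

```shell
#!/bin/sh
# show_report NAME [LINES]: print the top of one analysis report.
# Hypothetical helper; assumes the current directory is the repository root.
show_report() {
  report=".git/filter-repo/analysis/$1"
  if [ -f "$report" ]; then
    head -n "${2:-20}" "$report"
  else
    echo "missing: $report (run 'git filter-repo --analyze' first)" >&2
    return 1
  fi
}

# Typical use:
#   git filter-repo --analyze
#   show_report path-all-sizes.txt
#   show_report directories-all-sizes.txt 10
```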
Common Examples
Remove Large Files
# Analyze to find large files
git filter-repo --analyze
# Review .git/filter-repo/analysis/path-all-sizes.txt
# Then remove specific large files
git filter-repo --invert-paths \
--path large-dataset.zip \
--path videos/presentation.mp4
# Extract just the API module
git filter-repo --path api/ --path docs/api/
Clean Up .DS_Store Files
Option 1: Base Name
git filter-repo --invert-paths \
--path '.DS_Store' \
--use-base-name
Option 2: Glob
git filter-repo --invert-paths \
--path-glob '*/.DS_Store' \
--path '.DS_Store'
Important Notes
Fresh Clone Required
Always work in a fresh clone:
git clone --no-local /path/to/repo repo-to-filter
cd repo-to-filter
git filter-repo ...
Use --force to override the fresh clone check only if you’re certain.
Renames Are Not Followed
If a file was renamed, specify both old and new paths:
git filter-repo --path oldname/ --path newname/
Sequential Filtering
You can run filter-repo multiple times in sequence for complex filtering:
git filter-repo --path src/
git filter-repo --path-glob 'tests/*' --invert-paths
git filter-repo --path-rename src/:lib/