Overview

Basic filtering operations allow you to clean up your repository by removing unwanted files, extracting specific paths, or restructuring your project layout. These are the most common use cases for git-filter-repo.

Path-Based Filtering

Keep Specific Files or Directories

To keep only specific paths in your repository history:
git filter-repo --path README.md
All file paths are relative to the repository root.

Remove Specific Files

Use --invert-paths to keep everything except the specified paths:
git filter-repo --path secrets.txt --path config/passwords.yml --invert-paths
This removes secrets.txt and config/passwords.yml from all history.
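After a removal like this, it can be worth confirming that nothing was left behind. The following is a minimal sketch using only plain git commands; the function name paths_still_in_history is hypothetical and is not part of git-filter-repo.

```shell
#!/bin/sh
# Hypothetical helper: print each given path that still appears in any commit
# reachable from any ref. No output means none of the paths remain in history.
paths_still_in_history() {
  for p in "$@"; do
    # grep -q . succeeds only if git log listed at least one file name.
    if git log --all --format= --name-only -- "$p" | grep -q .; then
      printf '%s\n' "$p"
    fi
  done
}
```

Run inside the filtered repository, `paths_still_in_history secrets.txt config/passwords.yml` should print nothing once both paths are gone.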

Combining Include and Exclude

Run filter-repo multiple times to combine inclusion and exclusion filters:
# First, keep only src/main/
git filter-repo --path src/main/

# Then, exclude test data files
git filter-repo --path-glob 'src/*/test-data' --invert-paths

Pattern Matching

Glob Patterns

Use glob patterns for flexible matching:
# Keep all Python files
git filter-repo --path-glob '*.py'
Quote glob patterns to prevent shell expansion.
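Before running a glob filter, it may help to preview which historical paths the pattern would touch. Here is a rough sketch using plain git and a shell case pattern. The helper names are hypothetical, and shell pattern matching only approximates filter-repo's own matching, so treat the output as a sanity check rather than a guarantee.

```shell
#!/bin/sh
# Hypothetical helper: list every path that appears in the history of any ref.
# Paths introduced only by merge commits are not listed by this simple form.
all_history_paths() {
  git log --all --format= --name-only | grep . | sort -u
}

# Hypothetical helper: print the history paths that match a shell glob.
# In a case pattern, * also matches /, so '*.py' matches at any depth.
preview_glob() {
  all_history_paths | while IFS= read -r path; do
    case "$path" in
      $1) printf '%s\n' "$path" ;;
    esac
  done
}
```

For example, `preview_glob '*.py'` lists the Python files that have ever existed in the repository. The pattern is quoted for the same reason as in the filter-repo command.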

Regular Expressions

For more complex matching, use regex patterns:
# Keep only files with YYYY-MM-DD.txt format at least 2 directories deep
git filter-repo --path-regex '^.*/.*/[0-9]{4}-[0-9]{2}-[0-9]{2}\.txt$'
Patterns use Python's regular expression syntax (the re module).
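To see what "at least 2 directories deep" means in practice, the pattern can be tried against a few candidate paths before any history is rewritten. This sketch uses grep -E as a stand-in for Python's regex engine, which is an assumption that holds for this simple pattern but not for every Python regex. The function name matches_regex is hypothetical.

```shell
#!/bin/sh
# The pattern from the example above, unchanged.
regex='^.*/.*/[0-9]{4}-[0-9]{2}-[0-9]{2}\.txt$'

# Hypothetical helper: print whichever of the given candidate paths match.
# grep -E uses POSIX extended regex rather than Python syntax; this particular
# pattern has the same meaning in both.
matches_regex() {
  printf '%s\n' "$@" | grep -E -- "$regex"
}
```

For instance, `matches_regex a/b/2024-01-31.txt a/2024-01-31.txt` prints only the first path, because the second sits just one directory deep.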

Base Name Matching

Match on filename only (not full path):
# Keep all README.md and Makefile files anywhere in the repository
git filter-repo --use-base-name --path README.md --path Makefile
--use-base-name is incompatible with --path-rename and directory matching.

Subdirectory Operations

Extract Subdirectory as Root

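Make a subdirectory the new repository root:
git filter-repo --subdirectory-filter module/
Given this layout:
project/
  module/
    foo.c
    bar.c
  other/
    data.txt
  README.md
only the contents of module/ are kept, and foo.c and bar.c end up at the repository root.
This is equivalent to: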
git filter-repo --path module/ --path-rename module/:

Move Everything to Subdirectory

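Place all repository contents under a new directory with --to-subdirectory-filter (my-module/ below is a placeholder name):
git filter-repo --to-subdirectory-filter my-module/
Starting from this layout:
src/
  main.c
README.md
every path gains the my-module/ prefix, for example my-module/src/main.c.
This is useful when preparing to merge repositories.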

Path Shortcuts

Paths from File

For extensive filtering, list all paths in a file:
# Comments and blank lines are ignored
README.md
docs/
src/core/

# Glob pattern
glob:*.py

# Regex pattern
regex:^tests/.*test_.*\.py$

# Path rename
old-name/==>new-name/
Prefix lines with:
  • literal: (default) for exact matches
  • glob: for glob patterns
  • regex: for regular expressions
  • Use ==> to specify renames

Generate Paths from Current Files

Keep only the files that are tracked in the current checkout, dropping every path that exists only in history:
git ls-files >../keep-files.txt
git filter-repo --paths-from-file ../keep-files.txt

Analysis Before Filtering

Analyze your repository to help decide what to filter:
git filter-repo --analyze
This creates reports in .git/filter-repo/analysis/:
  • blob-shas-and-paths.txt - Files by size
  • path-all-sizes.txt - Size by path
  • path-deleted-sizes.txt - Previously deleted large files
  • extensions-all-sizes.txt - Size by file extension
  • directories-all-sizes.txt - Size by directory
  • renames.txt - File rename history
Run --analyze both before and after filtering to verify results.

Common Examples

Remove Large Files

# Analyze to find large files
git filter-repo --analyze

# Review .git/filter-repo/analysis/path-all-sizes.txt
# Then remove specific large files
git filter-repo --invert-paths \
  --path large-dataset.zip \
  --path videos/presentation.mp4
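
As a cross-check on the analysis report, large files can also be listed with plain git. This is a different technique from what --analyze does, sketched here under the assumption that uncompressed blob size is a good enough proxy. The function name largest_blobs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the N largest blobs reachable from any ref as
# "<size-in-bytes> <path>", largest first. N defaults to 10.
largest_blobs() {
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob" { sub(/^blob /, ""); print }' |
    sort -n -r -k1,1 |
    head -n "${1:-10}"
}
```

Running `largest_blobs 20` in the repository gives a quick shortlist of candidates to pass to --path together with --invert-paths.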

Extract Specific Modules

# Extract just the API module
git filter-repo --path api/ --path docs/api/

Clean Up .DS_Store Files

git filter-repo --invert-paths \
  --path '.DS_Store' \
  --use-base-name

Important Notes

Fresh Clone Required
Always work in a fresh clone:
git clone --no-local /path/to/repo repo-to-filter
cd repo-to-filter
git filter-repo ...
Use --force to override the fresh clone check only if you’re certain.
Renames Are Not Followed
If a file was renamed, specify both old and new paths:
git filter-repo --path oldname/ --path newname/
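If the earlier names of a file are not known, plain git can usually recover them. The sketch below relies on git's rename detection via git log --follow, which is heuristic and handles one file at a time. The function name names_over_time is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print every name a single file has had, according to
# git's rename detection. --follow accepts exactly one path.
names_over_time() {
  git log --follow --format= --name-only -- "$1" | grep . | sort -u
}
```

Each name printed by `names_over_time newname/file.c` is a candidate for its own --path argument, so that history from before the rename is kept as well.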
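Sequential Filtering
You can run filter-repo multiple times in sequence for complex filtering. Each run sees only what the previous run kept, so later patterns must use the paths as they exist at that point:
git filter-repo --path src/
git filter-repo --path-glob 'src/tests/*' --invert-paths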
git filter-repo --path-rename src/:lib/
