The contrib/filter-repo-demos/ directory contains several example scripts that demonstrate how to use git-filter-repo as a Python library. These scripts showcase the flexibility of filter-repo for creating custom history rewriting tools.
All scripts require a symlink to git-filter-repo named git_filter_repo.py in your PYTHONPATH to run. Each script has a --help flag for detailed usage information.
barebones-example
A minimal starting point showing the basic imports and structure needed to use git-filter-repo as a library.
Purpose
Demonstrates the simplest possible filter-repo program with no modifications to default behavior. Use this as a template for building your own custom tools.
Usage
Key Code
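A minimal sketch of such a program is shown here. It assumes git_filter_repo.py is importable as described above and relies on the library's FilteringOptions.parse_args and RepoFilter entry points. The main wrapper and the fallback for a failed import are illustrative choices, not necessarily how the upstream script is laid out.

```python
#!/usr/bin/env python3
"""Sketch of a filter-repo program that registers no callbacks."""
import sys

try:
    import git_filter_repo as fr
except ImportError:
    fr = None  # symlink missing, or its directory is not on PYTHONPATH


def main(argv):
    if fr is None:
        raise SystemExit("Could not import git_filter_repo; is the "
                         "git_filter_repo.py symlink on your PYTHONPATH?")
    # Accept the same command-line options as git-filter-repo itself.
    args = fr.FilteringOptions.parse_args(argv)
    # No callbacks are passed, so only the command-line options
    # influence the rewrite.
    fr.RepoFilter(args).run()


# To use this as a script, call: main(sys.argv[1:])
```

Custom tools would typically start from this shape and pass callbacks to RepoFilter.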
insert-beginning
Adds a file (like LICENSE or COPYING) to the very first commit(s) in repository history.
Purpose
Useful when you need to retroactively add a license file, .gitignore, or other configuration file to the beginning of your project’s history.
Usage
Features
- Adds the specified file to all root commits
- Preserves original file permissions (executable bit)
- Automatically rewrites commit hashes in commit messages
Example
How It Works
The script uses a commit callback that checks whether a commit has no parents (i.e., it’s a root commit) and, if so, adds a FileChange entry for the specified file.
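As a rough sketch of that callback: the function below relies only on the parents and file_changes attributes of the commit object handed to a commit callback. In a real run the change would presumably be built with the library's FileChange class and the callback passed to RepoFilter; here SimpleNamespace objects stand in for both, and make_root_file_adder is a hypothetical helper name.

```python
from types import SimpleNamespace


def make_root_file_adder(file_change):
    """Return a commit callback that adds file_change to every root commit."""
    def add_to_root(commit, metadata=None):
        # A root commit is one with no parents.
        if not commit.parents:
            commit.file_changes.append(file_change)
    return add_to_root


# Stand-ins for library objects, for illustration only.
license_change = SimpleNamespace(type=b"M", filename=b"LICENSE", mode=b"100644")
root = SimpleNamespace(parents=[], file_changes=[])
child = SimpleNamespace(parents=[b":1"], file_changes=[])

callback = make_root_file_adder(license_change)
callback(root)   # root commit: the LICENSE change is appended
callback(child)  # has a parent: left untouched
```

Because a repository can have more than one root commit, the check is applied to every commit rather than only the first one seen.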
signed-off-by
Adds Signed-off-by: tags to a range of commits.
Purpose
Automatically adds sign-off trailers to commits, useful for projects requiring Developer Certificate of Origin (DCO) sign-offs.
Usage
Features
- Uses git config user.name and user.email for the sign-off
- Places the trailer adjacent to existing trailers, or separates it from the message body with a blank line otherwise
- Won’t add a duplicate if the Signed-off-by: trailer already exists
Note
git rebase --signoff provides similar functionality and is generally recommended for new commits. This script is primarily a demonstration of what can be done with filter-repo.
How It Works
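The trailer placement described under Features can be sketched as a pure function on the commit message. This is an approximation under stated assumptions: add_signoff and TRAILER_LINE are hypothetical names, the name and email are passed in rather than read from git config, and the duplicate check looks for this exact identity's trailer. In a real tool the function would be applied to commit.message inside a commit callback.

```python
import re

# A line that looks like a trailer, e.g. b"Acked-by: Someone <s@example.com>".
TRAILER_LINE = re.compile(rb"[A-Za-z0-9-]+: ")


def add_signoff(message, name, email):
    """Return message (bytes) with a Signed-off-by trailer appended if missing."""
    trailer = b"Signed-off-by: " + name + b" <" + email + b">"
    if trailer in message:
        return message  # already signed off by this identity
    if not message.endswith(b"\n"):
        message += b"\n"
    lines = message.splitlines()
    last_line = lines[-1] if lines else b""
    # Keep trailers on adjacent lines; otherwise insert a blank line
    # between the preceding paragraph and the new trailer.
    if not TRAILER_LINE.match(last_line):
        message += b"\n"
    return message + trailer + b"\n"


plain = add_signoff(b"Fix bug\n", b"A U Thor", b"author@example.com")
stacked = add_signoff(b"Fix bug\n\nAcked-by: B <b@example.com>\n",
                      b"A U Thor", b"author@example.com")
```

Note that a last line such as "Fix: typo" would also be treated as a trailer by this simple pattern, which is one reason to treat the sketch as illustrative only.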
lint-history
Runs a linting or formatting program on all files in history.
Purpose
Apply code formatters, linters, or any file transformation tool across your entire repository history. This makes it possible to enforce a code style retroactively.
Usage
Options
- Python code to determine whether to apply the linter to a given filename. Implies --filenames-important. Example: 'return filename.endswith(b".txt")'
- Write files with their original basename (needed if the linter requires the correct file extension).
- Limit to specified refs (implies --partial).
Features
- Only processes files that match your criteria
- Preserves filenames when needed for extension-aware tools
- Automatically rewrites commit hashes in messages
- Uses a temporary directory for file operations (configurable via $TMPDIR)
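The filename predicate from the Options section is supplied as the body of a Python function that receives the filename as bytes. One way such a body could be turned into a callable is sketched here; compile_relevance is a hypothetical helper, and the actual script may build its predicate differently.

```python
import textwrap


def compile_relevance(body):
    """Wrap a function body (str) as is_relevant(filename) and return it."""
    source = "def is_relevant(filename):\n" + textwrap.indent(body, "    ")
    namespace = {}
    exec(source, namespace)  # only do this with code you trust
    return namespace["is_relevant"]


# The example body from the Options section: lint only .txt files.
is_text = compile_relevance('return filename.endswith(b".txt")')
changed = [b"README.txt", b"src/main.c", b"docs/notes.txt"]
to_lint = [name for name in changed if is_text(name)]
```

Since the body is executed as code, it should only ever come from the person running the tool.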
Performance Note
Unlike filter-branch, which would run the linter on every file in every commit, lint-history processes each unique blob only once, making it significantly faster.
Advanced Examples
See GitHub issue #45 for community modifications for Python files, Jupyter notebooks, Java files, and more.
clean-ignore
Deletes files from history that match current .gitignore rules.
Purpose
Remove files that should never have been committed because they match your current gitignore patterns.
Usage
Features
- Uses git check-ignore to determine which files to remove
- Respects negation patterns (lines starting with !)
- Prunes commits that become empty
- Prunes merge commits that become degenerate
- Rewrites commit hashes in commit messages
How It Works
The script maintains two sets (ignored and okay) and queries git check-ignore for each file.
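A sketch of that two-set approach follows, on the assumption that the sets act as a cache so each path is checked at most once. IgnoreCache and git_check_ignore are hypothetical names; the default checker spawns one git check-ignore process per path for simplicity, whereas the real script may talk to git differently. The demo at the end injects a fake checker so it runs without a repository.

```python
import os
import subprocess


def git_check_ignore(path):
    """Ask git whether path (bytes) is ignored; exit status 0 means ignored."""
    result = subprocess.run(
        ["git", "check-ignore", "-q", "--no-index", os.fsdecode(path)])
    return result.returncode == 0


class IgnoreCache:
    """Remember the verdict for every path that has already been checked."""

    def __init__(self, checker=git_check_ignore):
        self.ignored = set()
        self.okay = set()
        self.checker = checker

    def is_ignored(self, path):
        if path in self.ignored:
            return True
        if path in self.okay:
            return False
        verdict = self.checker(path)
        (self.ignored if verdict else self.okay).add(path)
        return verdict

    def keep(self, filenames):
        """Return the filenames that are not ignored, preserving order."""
        return [name for name in filenames if not self.is_ignored(name)]


# Demo with a fake checker that treats object files as ignored.
calls = []
def fake_checker(path):
    calls.append(path)
    return path.endswith(b".o")

cache = IgnoreCache(checker=fake_checker)
first = cache.keep([b"a.c", b"a.o"])
second = cache.keep([b"a.c", b"a.o", b"b.o"])
```

In a commit callback, the same filtering would be applied to the filenames of commit.file_changes, dropping the changes whose paths are ignored.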
Example Workflow
filter-lamely (filter-branch-ish)
A bug-compatible reimplementation of git filter-branch, but faster.
Purpose
Demonstrates that filter-repo can emulate even legacy tools like filter-branch. The git regression test suite passes when using filter-lamely instead of filter-branch.
Why It Exists
- Proof of concept that filter-repo can implement any history rewriting tool
- Provides migration path for existing filter-branch users
- Shows how much faster filter-repo is even when constrained by filter-branch’s API
Intentional Differences
- --tree-filter and --index-filter only operate on changed files (performance optimization)
- Simplified map() function (ignores writing mapping file)
- No --parent-filter (obsoleted by git replace --graft)
Usage
Replace git filter-branch with ./filter-lamely in your existing commands.
bfg-ish
A reimplementation of BFG Repo Cleaner with bug fixes and new features.
Purpose
Provides a BFG-compatible interface with improvements.
New Features:
- Automatic repacking (more robust than BFG)
- Prunes commits that become empty
- Creates replace refs for old commit hashes
- Respects grafts and replace refs
- Auto-updates commit encoding to UTF-8
- Works with loose objects (not just packfiles)
- Works with loose refs (not just packed-refs)
- Works with replace refs
- Updates index and working tree at end
- --no-blob-protection made safe and default
Usage
Replace java -jar bfg.jar with ./bfg-ish.
Note from Author
Even with all these improvements, I think filter-repo is the better tool, and thus I suggest folks use it. I have no plans to improve bfg-ish further. However, bfg-ish serves as a nice demonstration of the ability to use filter-repo to write different filtering tools.
convert-svnexternals
Inserts Git submodules according to SVN externals definitions from converted Subversion repositories.
Purpose
When migrating from SVN to Git using SubGit, this script converts svn:externals properties to proper Git submodules throughout history.
Prerequisites
- Repository must be converted using SubGit
- SubGit must be run with the translate.externals=true config option
- SubGit creates a .gitsvnextmodules file during conversion
- Requires an SVN-to-Git mapping file
Usage
Features
- Inserts gitlinks (mode 160000) into trees
- Adds .gitmodules file with relevant sections
- Removes converted sections from .gitsvnextmodules
- Handles repeatedly added/removed externals
- Handles externals replaced by direct files
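To make the first two features concrete, the sketch below shows the standard shape of a .gitmodules section and the tree-entry mode Git uses for a submodule. It only illustrates the file format; gitmodules_section is a hypothetical helper, the name, path and URL are made up, and the script's own generation logic may differ.

```python
# Mode of a tree entry that points at a submodule commit (a gitlink).
GITLINK_MODE = b"160000"


def gitmodules_section(name, path, url):
    """Format one [submodule] section as it appears in .gitmodules."""
    return (
        f'[submodule "{name}"]\n'
        f"\tpath = {path}\n"
        f"\turl = {url}\n"
    )


section = gitmodules_section(
    "libfoo", "vendor/libfoo", "https://example.com/libfoo.git")
```

Each converted external would contribute one such section, paired with a gitlink entry at the same path in the tree.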
Mapping File Format
Caveats
Purpose of These Scripts
These scripts are not meant to be complete production tools, but rather demonstrations of filter-repo’s library capabilities. They show that extremely varied history rewriting tools can be created that automatically inherit:
- Rewriting hashes in commit messages
- Pruning commits that become empty
- Handling filenames with special characters
- Non-standard encodings
- Handling of replace refs
- And much more
Using the Scripts
Setup
- Create a symlink to git-filter-repo named git_filter_repo.py.
- Add the directory containing the symlink to your PYTHONPATH.
- Make the scripts executable.
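These steps are normally done in the shell (ln -s, setting PYTHONPATH, chmod +x). For readers who prefer to script the setup, a Python sketch of the same three steps is given here. It assumes a POSIX system, and link_filter_repo, add_to_import_path and make_executable are hypothetical helper names with paths supplied by the caller.

```python
import os
import stat
import sys


def link_filter_repo(filter_repo_path, lib_dir):
    """Create lib_dir/git_filter_repo.py as a symlink to git-filter-repo."""
    os.makedirs(lib_dir, exist_ok=True)
    link = os.path.join(lib_dir, "git_filter_repo.py")
    if not os.path.lexists(link):
        os.symlink(os.path.abspath(filter_repo_path), link)
    return link


def add_to_import_path(lib_dir):
    """In-process equivalent of adding lib_dir to PYTHONPATH."""
    if lib_dir not in sys.path:
        sys.path.insert(0, lib_dir)


def make_executable(script_path):
    """Set the executable bits, like chmod +x."""
    mode = os.stat(script_path).st_mode
    os.chmod(script_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

add_to_import_path only affects the current Python process; to run the demo scripts directly, PYTHONPATH still needs to be set in the environment that launches them.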
Getting Help
Every script supports --help.
