This page contains examples from real users who have filed issues or questions about git-filter-repo. These scenarios demonstrate practical solutions to common and uncommon repository filtering challenges.

Adding Files to Root Commits

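Add a README.md and a src/.gitignore to the very first commit(s) in history. Each file is first written to the object database with git hash-object -w, and the resulting blob ID is passed to FileChange: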
git filter-repo --commit-callback "if not commit.parents: commit.file_changes += [
    FileChange(b'M', b'README.md', b'$(git hash-object -w '/path/to/existing/README.md')', b'100644'), 
    FileChange(b'M', b'src/.gitignore', b'$(git hash-object -w '/home/myusers/mymodule.gitignore')', b'100644')]"
Alternatively, the insert-beginning script in the contrib/filter-repo-demos/ directory can add a file at the beginning of history.

Purging a Large List of Files

When you have many files to remove, create a text file with one path per line:
Create deletion list
cat > ../DELETED_FILENAMES.txt <<EOF
src/old-feature/
build/artifacts/
config/secrets.yml
EOF
Remove files
git filter-repo --invert-paths --paths-from-file ../DELETED_FILENAMES.txt

Extracting a Library from a Repo

Keep a subdirectory but rename it to a higher-level directory:
git filter-repo \
    --path src/some-folder/some-feature/ \
    --path-rename src/some-folder/some-feature/:src/
This is useful when splitting a monorepo or extracting a component that you want to become standalone.

Replace Words in Commit Messages

Replace “stuff” with “task” in all commit messages:
git filter-repo --message-callback 'return message.replace(b"stuff", b"task")'
For more complex replacements using regex:
git filter-repo --message-callback '
    import re
    return re.sub(b"JIRA-(\\d+)", b"PROJECT-\\1", message)
    '
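A substitution like this can be tried on a sample message in plain Python before rewriting the whole history. The following is a minimal sketch; the function name rename_ticket_prefix and the sample message are illustrative and not part of git-filter-repo. Note that a backreference such as \1 in the replacement only works when the pattern contains a matching capture group.

```python
import re


def rename_ticket_prefix(message: bytes) -> bytes:
    """Rewrite JIRA-<n> ticket references to PROJECT-<n> in a commit message."""
    # The parentheses capture the ticket number so that \1 can reuse it.
    return re.sub(rb"JIRA-(\d+)", rb"PROJECT-\1", message)


sample = b"Fix login bug (JIRA-123, JIRA-7)\n"
print(rename_ticket_prefix(sample))
```

Messages are bytestrings in callbacks, so both the pattern and the replacement are bytes here as well.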

Keep Files from Specific Branches Only

Delete all files except those currently present on two specific branches:
git ls-tree -r --name-only ${BRANCH1} >../my-files
git ls-tree -r --name-only ${BRANCH2} >>../my-files
sort ../my-files | uniq >../my-relevant-files
git filter-repo --paths-from-file ../my-relevant-files

Renormalize Line Endings

Convert all line endings and add a .gitattributes file:
contrib/filter-repo-demos/lint-history dos2unix
# Edit .gitattributes with desired settings
contrib/filter-repo-demos/insert-beginning .gitattributes

Remove Trailing Whitespace

Strip trailing spaces, tabs, and carriage returns from every line, which also converts CRLF line endings to LF:
git filter-repo --replace-text <(echo 'regex:[\r\t ]+(\n|$)==>\n')
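To see what this rule does to a file, the same pattern can be applied to sample bytes in Python. This is a sketch that assumes the regex: rule is applied to blob contents as bytes with Python's re module; the helper name strip_trailing_whitespace is illustrative.

```python
import re

# Same pattern as in the --replace-text rule, written as a raw bytes literal.
TRAILING_WS = re.compile(rb"[\r\t ]+(\n|$)")


def strip_trailing_whitespace(data: bytes) -> bytes:
    """Replace each run of trailing whitespace (and its newline) with a bare LF."""
    return TRAILING_WS.sub(rb"\n", data)


print(strip_trailing_whitespace(b"a  \r\nb\t\nc\n"))
```

One side effect worth knowing: because the pattern also matches at the end of the data, a final line that ends in whitespace but has no newline gains one.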

Complex Include/Exclude Rules

Include all files under src/ except src/README.md:
git filter-repo --filename-callback '
    if filename == b"src/README.md":
        return None
    if filename.startswith(b"src/"):
        return filename
    return None'
This pattern is useful when you need both inclusion and exclusion logic that can’t be expressed with simple --path arguments.
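The callback logic is easy to check in isolation before running it against a repository. The sketch below wraps the same rules in a standalone function (keep_src_except_readme is a hypothetical name) and tries it on a few sample paths; returning None means the file is dropped, and returning the name keeps it.

```python
from typing import Optional


def keep_src_except_readme(filename: bytes) -> Optional[bytes]:
    """Keep everything under src/ except src/README.md; drop all other paths."""
    if filename == b"src/README.md":
        return None
    if filename.startswith(b"src/"):
        return filename
    return None


for path in (b"src/main.c", b"src/README.md", b"docs/index.md"):
    print(path, "->", keep_src_except_readme(path))
```

The exclusion test comes first so that the more specific rule wins over the broader src/ prefix match.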

Removing Paths by Extension

git filter-repo --invert-paths --path-glob '*.xsa'

Removing a Specific Directory

git filter-repo --path node_modules/electron/dist/ --invert-paths

Converting NFD Filenames to NFC

macOS commonly stores filenames in NFD (decomposed) Unicode form, which can cause problems when the repository is used on other platforms. Convert the names to NFC (composed):
git filter-repo --filename-callback '
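    try:
        return subprocess.check_output("iconv -f utf-8-mac -t utf-8".split(),
                                       input=filename)
    except (OSError, subprocess.CalledProcessError):
        # Leave the name unchanged if iconv is missing or rejects the input.
        return filename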
'
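Where iconv or its utf-8-mac encoding is unavailable, Python's standard unicodedata module offers a similar conversion. This is a swapped-in technique rather than the approach shown above, and since utf-8-mac is Apple's variant of NFD, results could differ for some characters. The helper name to_nfc is illustrative.

```python
import unicodedata


def to_nfc(filename: bytes) -> bytes:
    """Return the filename with its UTF-8 text normalized to NFC."""
    try:
        text = filename.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: leave the name untouched.
        return filename
    return unicodedata.normalize("NFC", text).encode("utf-8")


# "e" followed by a combining acute accent (NFD) becomes a single "é" (NFC).
nfd_name = "cafe\u0301.txt".encode("utf-8")
print(nfd_name, "->", to_nfc(nfd_name))
```

The body of to_nfc could be used in a --filename-callback, with the import placed inside the callback.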

Set Committer for Recent Commits

Change the committer of the last 5 commits:
git filter-repo --refs main~5..main --commit-callback '
    commit.committer_name = b"My Wonderful Self"
    commit.committer_email = b"my.wonderful.self@example.com"  # placeholder address
'

Handling Special Characters in Names

When dealing with names containing accents, umlauts, or other multi-byte characters:
git filter-repo --refs main~5..main --commit-callback '
    if commit.author_email == b"old.address@example.com":  # placeholder address
        commit.author_name = "Raphaël González".encode()
        commit.author_email = b"new.address@example.com"  # placeholder address
'
Python bytes literals may contain only ASCII characters, so write the name as an ordinary (Unicode) string and call .encode(), which produces UTF-8 by default.
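The difference is easy to see in a Python session. The sketch below encodes the name and then shows that the equivalent bytes literal is rejected; the compile call is only there to demonstrate the error without crashing the example.

```python
# Encoding a str yields the UTF-8 bytes that git stores.
name = "Raphaël González".encode()  # UTF-8 is the default encoding
print(name)

# A bytes literal containing non-ASCII characters fails at compile time.
try:
    compile('b"Raphaël"', "<example>", "eval")
    literal_ok = True
except SyntaxError:
    literal_ok = False

print("bytes literal accepted:", literal_ok)
```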

Handling Repository Corruption

Corrupt Commit Objects

If git fsck reports corrupt commits:
Check for corruption
git fsck --full
Fix corrupt commit
# Extract the corrupt commit
git cat-file -p 166f57b3fbe31257100361ecaf735f305b533b21 >tmp

# Edit tmp to fix the error (e.g., add missing space)
# Then create a replacement:
git replace -f 166f57b3fbe31257100361ecaf735f305b533b21 \
    $(git hash-object -t commit -w tmp)

rm tmp
git filter-repo --proceed

Corrupt Tree Objects

For corrupt trees with duplicate entries:
Fix corrupt tree
# Extract the corrupt tree
git cat-file -p c15680eae81cc8539af7e7de766a8a7c13bd27df >tmp

# Edit tmp to remove duplicate entry
# Create replacement tree:
git mktree <tmp
# Output: ace04f50a5d13b43e94c12802d3d8a6c66a35b1d

git replace -f c15680eae81cc8539af7e7de766a8a7c13bd27df \
    ace04f50a5d13b43e94c12802d3d8a6c66a35b1d

rm tmp
git filter-repo --proceed
Create replacements for all corrupt objects before running git filter-repo.

Removing Files with Backslashes

Remove any file with a backslash in its path (common issue from Windows):
git filter-repo --filename-callback 'return None if b"\\" in filename else filename'

Replace a Binary Blob in History

Replace a sensitive image file throughout history:
git filter-repo --blob-callback '
    if blob.original_id == b"f4ede2e944868b9a08401dafeb2b944c7166fd0a":
        blob.data = open("../alternative-file.jpg", "rb").read()
'
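The callback above matches on the blob's object ID, which git hash-object prints for any file. As a sketch of where that ID comes from, the function below reproduces the computation for repositories that use SHA-1 object names (it does not apply to SHA-256 repositories); git_blob_id is a hypothetical helper name.

```python
import hashlib


def git_blob_id(data: bytes) -> bytes:
    """Compute the SHA-1 object ID git assigns to a blob with this content."""
    # Git hashes a "blob <size>\0" header followed by the raw content.
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest().encode("ascii")


# The empty blob has a well-known ID.
print(git_blob_id(b""))
```

The result is returned as a bytestring so it can be compared directly with blob.original_id.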

Remove Old History (Commits Older Than N Days)

This changes every commit hash and permanently discards history. Only use if you’re certain this is what you want.
# Identify the old commit you want to become the new root
git replace --graft ${OLD_COMMIT}
git filter-repo --proceed
Running git replace --graft with no parent arguments turns ${OLD_COMMIT} into a root commit, so every commit before it is cut out of history when git filter-repo makes the replacement permanent.

Replacing PNGs with Compressed Versions

If you committed large PNGs and later compressed them, you can retroactively use the compressed versions:
Identify blob IDs
git log -1 --raw --no-abbrev ${COMMIT_WHERE_YOU_COMPRESSED_PNGS}
This shows output like:
:100755 100755 edf570fde099c0705432a389b96cb86489beda09 9cce52ae0806d695956dcf662cd74b497eaa7b12 M      resources/foo.png
:100755 100755 644f7c55e1a88a29779dc86b9ff92f512bf9bc11 88b02e9e45c0a62db2f1751b6c065b0c2e538820 M      resources/bar.png
Replace old with new
git filter-repo --file-info-callback '
    if filename == b"resources/foo.png" and blob_id == b"edf570fde099c0705432a389b96cb86489beda09":
        blob_id = b"9cce52ae0806d695956dcf662cd74b497eaa7b12"
    if filename == b"resources/bar.png" and blob_id == b"644f7c55e1a88a29779dc86b9ff92f512bf9bc11":
        blob_id = b"88b02e9e45c0a62db2f1751b6c065b0c2e538820"
    return (filename, mode, blob_id)
'
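With more than a handful of files, a lookup table may be easier to maintain than a chain of if statements. The following sketch shows the idea outside of filter-repo; REPLACEMENTS and remap are hypothetical names, and the function only mirrors the three values the callback returns.

```python
# (path, old blob ID) -> new blob ID, using the IDs from the git log output above.
REPLACEMENTS = {
    (b"resources/foo.png", b"edf570fde099c0705432a389b96cb86489beda09"):
        b"9cce52ae0806d695956dcf662cd74b497eaa7b12",
    (b"resources/bar.png", b"644f7c55e1a88a29779dc86b9ff92f512bf9bc11"):
        b"88b02e9e45c0a62db2f1751b6c065b0c2e538820",
}


def remap(filename: bytes, mode: bytes, blob_id: bytes):
    """Swap in the compressed blob when this path and ID are listed; else pass through."""
    new_id = REPLACEMENTS.get((filename, blob_id), blob_id)
    return (filename, mode, new_id)


print(remap(b"resources/foo.png", b"100755",
            b"edf570fde099c0705432a389b96cb86489beda09"))
```

Keying on both the path and the old ID keeps unrelated files that happen to share a blob from being rewritten.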

Updating Submodule Hashes

If wrong submodule commit hashes were recorded, you can fix them:
git filter-repo --file-info-callback '
    if filename == b"src/my-submodule" and blob_id == b"edf570fde099c0705432a389b96cb86489beda09":
        blob_id = b"9cce52ae0806d695956dcf662cd74b497eaa7b12"
    if filename == b"src/my-submodule" and blob_id == b"644f7c55e1a88a29779dc86b9ff92f512bf9bc11":
        blob_id = b"88b02e9e45c0a62db2f1751b6c065b0c2e538820"
    return (filename, mode, blob_id)
'
blob_id is somewhat of a misnomer here since the file’s hash actually refers to a commit from the sub-project, but that’s the parameter name used by --file-info-callback.

Using Multi-line Strings in Callbacks

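git filter-repo inserts spaces at the front of every line of a callback body, so the continuation lines of a multi-line string pick up unwanted leading spaces. Use textwrap.dedent to strip them: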
Without dedent (incorrect)
git filter-repo --blob-callback '
  blob.data = bytes("""\
This is the new
file that I am
replacing every blob
with.  It is great.\n""", "utf-8")
'
# Results in unwanted leading spaces
With dedent (correct)
git filter-repo --blob-callback '
  import textwrap
  blob.data = bytes(textwrap.dedent("""\
    This is the new
    file that I am
    replacing every blob
    with.  It is great.\n"""), "utf-8")
'
# Results in clean output with no leading spaces
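The effect of textwrap.dedent can be checked on its own. In this sketch the string stands in for what the literal contains once every line carries leading indentation; the variable names are illustrative.

```python
import textwrap

# The text as it looks when each line of the literal is indented.
indented = (
    "    This is the new\n"
    "    file that I am\n"
    "    replacing every blob\n"
    "    with.  It is great.\n"
)

# dedent removes the whitespace prefix that all lines have in common.
cleaned = textwrap.dedent(indented)
data = bytes(cleaned, "utf-8")
print(data)
```

Only the common prefix is removed, so any deliberate extra indentation on individual lines survives.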
