Repositories

What is a Git Repository?

A Git repository is a database that stores the complete history of your project, including all files, directories, commits, branches, and metadata. It’s the foundation of Git’s distributed version control system, enabling you to track changes, collaborate with others, and maintain a full record of your project’s evolution.

A full clone is self-contained: it stores the entire project history locally, which is what makes Git a distributed version control system. Shallow and partial clones are the exception, since they deliberately omit some history or objects.

Repository Types

Git supports two main types of repositories:

Working Repository

A standard (non-bare) repository has a .git directory at the root of its working tree. This is what you typically create when starting a new project:

$ git init
Initialized empty Git repository in /path/to/my-project/.git/

This gives you:

A .git directory containing all Git internals

A working tree where you edit files

An index (staging area) to prepare commits; the index file itself is written the first time you stage something

Bare Repository

A bare repository (typically named <project>.git) contains only the Git data without a working tree. These are commonly used as central repositories for collaboration:

$ git init --bare project.git

Bare repositories are ideal for:

Central servers where developers push and pull

Repositories that only serve as remote endpoints

Situations where no one directly edits files

Repository Structure

Inside the .git directory, Git maintains a well-defined structure:

Core Directory Structure

objects/

The object database stores all content: commits, trees (directories), blobs (files), and tag objects. Each object is identified by a hash of its content: SHA-1 by default, although repositories can also be created with SHA-256.

objects/
├── [0-9a-f][0-9a-f]/  # First 2 chars of SHA-1
│   └── [38 chars]      # Remaining 38 chars
├── pack/               # Compressed object packs
└── info/               # Additional metadata
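
The fan-out layout above can be sketched in a few lines. This is a minimal illustration that assumes a 40-character SHA-1 object ID; the helper name loose_object_path is hypothetical and is not part of Git.

```python
from pathlib import PurePosixPath

def loose_object_path(object_id: str, git_dir: str = ".git") -> PurePosixPath:
    """Return where a loose object with this hex ID would be stored."""
    oid = object_id.lower()
    if len(oid) != 40 or any(c not in "0123456789abcdef" for c in oid):
        raise ValueError(f"not a 40-character hex SHA-1: {object_id!r}")
    # The first two hex characters name the subdirectory,
    # the remaining 38 name the file inside it.
    return PurePosixPath(git_dir) / "objects" / oid[:2] / oid[2:]

print(loose_object_path("1b61de420a21a2f1aaef93e38ecd0e45e8bc9f0a"))
# .git/objects/1b/61de420a21a2f1aaef93e38ecd0e45e8bc9f0a
```

Splitting on the first two characters keeps any single directory from accumulating too many entries.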

refs/

Stores references (pointers to commits):

refs/heads/ - Local branches
refs/tags/ - Tags
refs/remotes/ - Remote-tracking branches

HEAD

A symbolic reference pointing to your current branch:

ref: refs/heads/main

In detached HEAD state, it contains a commit SHA directly.
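
The two forms HEAD can take are easy to tell apart by inspecting the file's text. The sketch below is illustrative only; parse_head is a hypothetical helper, and real tooling would normally ask Git itself (for example with git symbolic-ref or git rev-parse) rather than read the file.

```python
def parse_head(content: str):
    """Classify the text of a HEAD file.

    Returns ("symbolic", ref_name) when HEAD points at a ref,
    or ("detached", commit_id) when it holds a commit ID directly.
    """
    text = content.strip()
    if text.startswith("ref:"):
        return ("symbolic", text[len("ref:"):].strip())
    return ("detached", text)

print(parse_head("ref: refs/heads/main\n"))
# ('symbolic', 'refs/heads/main')
print(parse_head("1b61de420a21a2f1aaef93e38ecd0e45e8bc9f0a\n"))
# ('detached', '1b61de420a21a2f1aaef93e38ecd0e45e8bc9f0a')
```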

index

The staging area (covered in detail in the Staging Area concept): a binary file that records what will go into your next commit.

config

Repository-specific configuration settings, including:

Remote repository URLs
Branch tracking information
User preferences for this repository

hooks/

Customization scripts that run at specific points in Git’s execution (e.g., pre-commit, post-merge).

The Object Database

Git’s object database is implemented in object-file.c and uses a content-addressable storage system. Every object has:

An ID - A 40-character SHA-1 hash computed over a short header (the object's type and size) followed by its contents
A type - One of: commit, tree, blob, or tag
Contents - The actual data

Because objects are identified by their content hash, identical files share the same blob object, saving disk space across your entire repository history.
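
To make content addressing concrete, the sketch below computes a blob ID the way a SHA-1 repository does: it hashes the header "blob <size>" plus a NUL byte, followed by the file's bytes. The function name blob_id is chosen for illustration.

```python
import hashlib

def blob_id(data: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to a blob with this content."""
    # Header: object type, a space, the decimal size, then a NUL byte.
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()

# The empty blob has a well-known ID.
print(blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# Identical content always maps to the same ID, whatever the file is called.
print(blob_id(b"hello\n") == blob_id(b"hello\n"))  # True
```

Because the file name is not part of the hash, two files with the same bytes resolve to one blob; the names live in tree objects instead.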

Object Storage Formats

Loose objects: Newly created objects are stored individually:

.git/objects/1b/61de420a21a2f1aaef93e38ecd0e45e8bc9f0a

Packed objects: Git periodically compresses multiple objects into pack files for efficiency:

.git/objects/pack/pack-<hash>.pack
.git/objects/pack/pack-<hash>.idx  # Index for fast lookup
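
For the loose format, each file holds the zlib-compressed header and contents of one object. The following sketch round-trips that encoding in memory; encode_loose and decode_loose are hypothetical names, and pack files use a different, more involved format that is not shown here.

```python
import zlib

def encode_loose(obj_type: str, data: bytes) -> bytes:
    """Build the bytes of a loose object file: zlib(header + contents)."""
    header = f"{obj_type} {len(data)}".encode() + b"\x00"
    return zlib.compress(header + data)

def decode_loose(raw: bytes):
    """Inverse of encode_loose: return (type, contents)."""
    body = zlib.decompress(raw)
    # The header ends at the first NUL byte; everything after it is content.
    header, _, data = body.partition(b"\x00")
    obj_type, size = header.decode().split(" ")
    if int(size) != len(data):
        raise ValueError("size in header does not match content length")
    return obj_type, data

raw = encode_loose("blob", b"hello\n")
print(decode_loose(raw))  # ('blob', b'hello\n')
```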

Repository Layout Example

Here’s what a typical repository structure looks like:

my-project/
├── .git/
│   ├── HEAD                    # Current branch pointer
│   ├── config                  # Repository configuration
│   ├── description            # Repository description
│   ├── hooks/                 # Git hooks
│   ├── index                  # Staging area
│   ├── objects/               # Object database
│   │   ├── 1b/               # Object subdirectory
│   │   │   └── 61de420...    # Actual object file
│   │   ├── pack/             # Packed objects
│   │   └── info/             # Object metadata
│   ├── refs/                 # References
│   │   ├── heads/            # Local branches
│   │   │   └── main
│   │   ├── remotes/          # Remote branches
│   │   │   └── origin/
│   │   │       └── main
│   │   └── tags/             # Tags
│   └── logs/                 # Reflogs
│       ├── HEAD
│       └── refs/
├── src/                       # Working tree
├── README.md
└── .gitignore

Repository Operations

Creating a Repository

Initialize a new repository

$ git init

Clone an existing repository

$ git clone https://github.com/user/repo.git

This creates a complete copy including all history.

Repository Discovery

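Unless the GIT_DIR environment variable names a repository explicitly, Git searches for one by looking for a .git entry in the current directory and then in each parent directory. This is implemented in setup.c; a simplified call looks like this:

// Git walks up the directory tree looking for .git (simplified call)
struct strbuf commondir = STRBUF_INIT, gitdir = STRBUF_INIT;
int found = discover_git_directory(&commondir, &gitdir) == 0;

If Git can’t find a repository, commands fail with “fatal: not a git repository (or any of the parent directories): .git”.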
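
The upward search can be sketched as follows. This is a simplified model, not Git's algorithm: the real lookup also honors environment variables such as GIT_DIR and GIT_CEILING_DIRECTORIES, checks that what it found is a valid repository, and detects bare repositories. The name find_dot_git is hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional

def find_dot_git(start: Path) -> Optional[Path]:
    """Return the nearest .git entry at or above `start`, or None."""
    current = start.resolve()
    for directory in (current, *current.parents):
        candidate = directory / ".git"
        if candidate.exists():  # a directory, or a gitfile
            return candidate
    return None

# Demo in a throwaway tree: <tmp>/.git and <tmp>/src/lib
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / ".git").mkdir()
    nested = root / "src" / "lib"
    nested.mkdir(parents=True)
    print(find_dot_git(nested) == root / ".git")  # True
```

This is why Git commands work from any subdirectory of a project.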

Gitfiles and Worktrees

Git supports a special mechanism called gitfiles where .git is a plain text file instead of a directory:

gitdir: /path/to/real/repository

This is used by:

Submodules - To allow the parent repository to remove submodule working trees without losing the repository
Worktrees - To enable multiple working directories sharing one repository
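
Following a gitfile amounts to reading one line and resolving the path it names. The sketch below assumes the single-line "gitdir: <path>" form shown above and treats a relative path as relative to the directory that holds the gitfile; resolve_dot_git is a hypothetical helper.

```python
import tempfile
from pathlib import Path

def resolve_dot_git(dot_git: Path) -> Path:
    """Return the repository directory for a .git directory or gitfile."""
    if dot_git.is_dir():
        return dot_git
    text = dot_git.read_text().strip()
    prefix = "gitdir:"
    if not text.startswith(prefix):
        raise ValueError(f"{dot_git} is not a gitfile")
    target = Path(text[len(prefix):].strip())
    if not target.is_absolute():
        # Relative paths are resolved against the gitfile's own directory.
        target = dot_git.parent / target
    return target

# Demo: a checkout whose .git file points at a sibling directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    real_repo = root / "real-repo"
    real_repo.mkdir()
    checkout = root / "checkout"
    checkout.mkdir()
    (checkout / ".git").write_text("gitdir: ../real-repo\n")
    resolved = resolve_dot_git(checkout / ".git")
    print(resolved.resolve() == real_repo.resolve())  # True
```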

Object Reachability and Garbage Collection

Git only keeps objects that are reachable from:

References (branches, tags)
The reflog
The index

Unreachable objects may be deleted by git gc (garbage collection) once they are older than a grace period, which is two weeks by default:

$ git gc
Counting objects: 2857, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (1234/1234), done.

Git automatically runs garbage collection periodically, but you can run it manually to optimize storage.
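
Reachability is a graph traversal: start from the roots and follow every reference. The toy model below uses made-up object names and a plain dictionary in place of the object database, so it illustrates the idea rather than how git gc is implemented.

```python
def reachable(objects: dict, roots: list) -> set:
    """Return every object ID reachable from `roots`.

    `objects` maps an object ID to the IDs it references
    (a commit to its tree and parents, a tree to its entries).
    """
    seen = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in seen or oid not in objects:
            continue
        seen.add(oid)
        stack.extend(objects[oid])
    return seen

# Hypothetical object graph: one branch tip plus an abandoned commit.
objects = {
    "commit2": ["tree2", "commit1"],
    "commit1": ["tree1"],
    "tree2": ["blob_a", "blob_b"],
    "tree1": ["blob_a"],
    "blob_a": [],
    "blob_b": [],
    "orphan_commit": ["orphan_tree"],
    "orphan_tree": ["blob_a", "blob_c"],
    "blob_c": [],
}
live = reachable(objects, roots=["commit2"])
print(sorted(set(objects) - live))
# ['blob_c', 'orphan_commit', 'orphan_tree']
```

Note that blob_a survives even though the abandoned tree also references it, because it is still reachable from the branch tip.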

Repository Configuration

Repositories have three configuration levels:

System (/etc/gitconfig) - Applies to all users
Global (~/.gitconfig) - User-specific settings
Local (.git/config) - Repository-specific settings

Local settings override global settings, which in turn override system settings:

$ git config --local user.email "work@example.com"
$ git config --global user.email "personal@example.com"
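
The precedence rule can be modeled as a layered lookup in which the most specific level is consulted first. This sketch covers single-valued keys only, and all keys and values in it are made up for illustration; Git's own handling of multi-valued keys is richer.

```python
from collections import ChainMap

def effective_config(system: dict, global_: dict, local: dict) -> ChainMap:
    """Combine the three levels so that local beats global beats system."""
    # ChainMap searches its maps left to right, so the most specific goes first.
    return ChainMap(local, global_, system)

# Hypothetical settings at each level.
system_cfg = {"core.editor": "vi", "user.email": "system@example.com"}
global_cfg = {"user.email": "global@example.com"}
local_cfg = {"user.email": "local@example.com"}

config = effective_config(system_cfg, global_cfg, local_cfg)
print(config["user.email"])   # local@example.com
print(config["core.editor"])  # vi
```

A key that is missing at the local level falls through to the global level, and then to the system level.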

Key Implementation Details

From repository.h and repository.c (simplified; the real definitions contain more fields, and names differ between Git versions):

Repository struct - Core data structure managing repository state
Object database - Content-addressable storage with SHA-1 addressing
Reference storage - Multiple backends (files, reftable) for storing refs
Work tree - Association between repository and working directory

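/* Simplified sketch: the real struct has more fields and varies by Git version. */
struct repository {
    struct raw_object_store *objects;  // Object database (struct object_database in newer releases)
    struct ref_store *refs_private;    // Reference storage
    struct index_state *index;         // Staging area
    char *worktree;                    // Working tree path
};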

Best Practices

Keep repositories focused

One repository per project or logical unit. Avoid creating mega-repositories unless using advanced features like sparse checkout.

Don't commit build artifacts

Use .gitignore to exclude generated files, dependencies, and build outputs from the repository.

Use bare repositories for sharing

When setting up a central repository, create it with --bare. A bare repository has no working tree, so incoming pushes cannot conflict with checked-out files.

Regular maintenance

Periodically run git gc and git fsck to optimize storage and verify repository integrity.

Related Concepts

Commits - Snapshots stored in the repository
Branches - Named pointers to commits
Staging Area - Preparing commits
Remote Repositories - Collaborating with others
