Question 1

Why does my Git repository keep growing in size?

Accepted Answer

Git repositories grow over time because Git keeps the complete history of every file as compressed objects in the .git directory. Each commit adds a commit object plus new tree and blob objects for whatever changed. Pack files use delta compression to store similar versions as differences, but those deltas still accumulate over thousands of commits. Binary files such as images, compiled binaries, and archives contribute disproportionately to repository growth because they do not delta-compress well. Large repositories also often contain accidentally committed build artifacts, node_modules directories, or database dumps, which stay in history even after they are removed from the working tree. Running git gc periodically helps by repacking loose objects and pruning unreachable ones, but it cannot shrink anything that is still reachable from a branch or tag; getting rid of such files means rewriting history, for example with git filter-repo.
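To see where the space is going, it can help to look at the numbers reported by git count-objects -v. The following is a minimal sketch in Python, not an official tool: the function names and the SAMPLE text are made up for illustration, and the sizes are assumed to be in KiB, as the command reports them.

```python
import subprocess


def parse_count_objects(text: str) -> dict[str, int]:
    """Parse the 'key: value' lines printed by `git count-objects -v`."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[key.strip()] = int(value)
    return stats


def repo_object_stats(repo_path: str = ".") -> dict[str, int]:
    """Run git in repo_path and return the parsed statistics."""
    result = subprocess.run(
        ["git", "-C", repo_path, "count-objects", "-v"],
        capture_output=True, text=True, check=True,
    )
    return parse_count_objects(result.stdout)


# Hypothetical output, used only to illustrate the parser.
SAMPLE = """count: 12
size: 48
in-pack: 3400
packs: 1
size-pack: 5120
prune-packable: 0
garbage: 0
size-garbage: 0
"""

if __name__ == "__main__":
    stats = parse_count_objects(SAMPLE)
    print(f"loose: {stats['size']} KiB, packed: {stats['size-pack']} KiB")
```

Comparing size-pack before and after git gc gives a rough idea of how much space repacking actually recovers.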

Question 2

What is the difference between repository size and clone size?

Accepted Answer

Repository size refers to the .git directory, which holds all Git objects (commits, trees, blobs, tags), usually packed into pack files, along with refs and configuration. Clone size is the .git directory plus the working tree (the files checked out at the current HEAD). When you clone a repository, Git downloads the objects reachable from the remote's branches and tags, normally as a single pack file, and then checks out the default branch to create the working tree. The working tree size is the sum of all tracked files at the checked-out revision. For repositories with long histories but small current codebases, the .git directory can be many times larger than the working tree. Shallow clones (git clone --depth 1) reduce clone size dramatically by downloading only the most recent commit and the objects it references, not the older history.
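The split between the two sizes can be measured directly on disk. Here is a rough sketch, assuming a normal clone where .git is a directory at the repository root (not a worktree or submodule, where .git is a file); the function name is hypothetical. It counts every file outside .git, so untracked and ignored files are included in the working tree figure.

```python
import os


def clone_size_breakdown(repo_root: str) -> tuple[int, int]:
    """Return (git_dir_bytes, working_tree_bytes) for a clone at repo_root."""
    git_bytes = 0
    worktree_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(repo_root):
        # The first path component tells us whether we are inside .git.
        rel = os.path.relpath(dirpath, repo_root)
        in_git = rel.split(os.sep)[0] == ".git"
        for name in filenames:
            # lstat avoids following symlinks out of the repository.
            size = os.lstat(os.path.join(dirpath, name)).st_size
            if in_git:
                git_bytes += size
            else:
                worktree_bytes += size
    return git_bytes, worktree_bytes


if __name__ == "__main__":
    git_bytes, worktree_bytes = clone_size_breakdown(".")
    print(f".git: {git_bytes} bytes, working tree: {worktree_bytes} bytes")
```

Running it on a full clone and on a shallow clone of the same repository should show the .git figure shrinking while the working tree figure stays the same.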

Question 3

How does Git delta compression reduce repository size?

Accepted Answer

Git uses delta compression in pack files to store objects as differences from similar objects rather than as complete copies. When Git repacks (for example during git gc), it looks for similar blobs and stores one as a base object and the others as delta instructions describing how to reconstruct them from the base. For text files, where changes between versions are typically small, delta compression can often save on the order of 80-95% compared to storing full copies, although the exact figure depends on the content. Git's heuristics tend to keep the most recent version in full and express older versions as deltas against it, since recent versions are read most often. Objects in a pack file, whether stored whole or as deltas, are also compressed with zlib. However, binary files like images and compiled code do not delta-compress well, because small logical changes often result in completely different byte sequences, particularly in formats that are already compressed.
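The effect is easy to reproduce with a toy experiment. The sketch below is only an illustration of the idea, not Git's actual delta format: it uses difflib's unified diff as a stand-in for delta instructions, and the helper names and generated file content are invented for the example.

```python
import difflib
import hashlib
import zlib


def make_version(changed_line=None):
    """Build a 200-line text file; optionally edit one line."""
    lines = [hashlib.sha256(str(i).encode()).hexdigest() + "\n" for i in range(200)]
    if changed_line is not None:
        lines[changed_line] = "this line was edited\n"
    return lines


def storage_cost(base, new):
    """Return (bytes for a zlib-compressed full copy, bytes for a compressed delta)."""
    full = len(zlib.compress("".join(new).encode()))
    # n=0 drops context lines, so the diff holds only what changed.
    delta_text = "".join(difflib.unified_diff(base, new, n=0))
    delta = len(zlib.compress(delta_text.encode()))
    return full, delta


if __name__ == "__main__":
    base = make_version()
    new = make_version(changed_line=100)
    full, delta = storage_cost(base, new)
    print(f"full copy: {full} bytes, delta against base: {delta} bytes")
```

With a one-line edit the delta is a small fraction of the full copy. Replacing the content with something that changes everywhere between versions, such as a recompressed archive, makes the delta about as large as the file itself.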

Question 4

When should I use Git Large File Storage (Git LFS)?

Accepted Answer

Git LFS is worth using when your repository contains large binary files that change frequently, such as design assets (PSD, AI, Sketch files), video and audio files, compiled binaries, large datasets, or 3D models. A common rule of thumb is to consider it for any binary file larger than about 500 KB that will have multiple versions over the project lifetime. Git LFS works by replacing large files in your repository with small pointer files while storing the actual file content on a separate LFS server. This keeps the Git repository small and fast to clone while still versioning the large files. Without LFS, a repository with just ten 50 MB binary files that each have 20 revisions would consume roughly 10 GB of Git storage (10 × 50 MB × 20, assuming the revisions barely compress), while the same files through LFS would add negligible size to the repository itself.
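The arithmetic behind that estimate, and the size of what LFS stores in the repository instead, can be sketched as follows. The function names are hypothetical, the estimate assumes every revision is stored at full size with no compression, and the pointer layout follows the three-line text format described in the Git LFS specification.

```python
import hashlib


def history_size_mb(n_files: int, file_mb: int, revisions: int) -> int:
    """Worst-case storage when every revision of every file is kept in full."""
    return n_files * file_mb * revisions


def lfs_pointer(content: bytes) -> str:
    """Build the small text pointer that stands in for a large file."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


if __name__ == "__main__":
    total = history_size_mb(n_files=10, file_mb=50, revisions=20)
    print(f"without LFS: about {total} MB in Git history")

    pointer = lfs_pointer(b"\0" * (1024 * 1024))  # stand-in for a large asset
    print(f"with LFS: each revision adds a pointer of {len(pointer)} bytes")
```

Each revision of a tracked file adds only one such pointer to the Git history, which is why the repository stays small regardless of how large the underlying assets are.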
