The Archive Base

Editorial Team
Asked: May 15, 2026


I just saw the first Git tutorial at http://blip.tv/play/Aeu2CAI.

How does Git store all the versions of all the files, and how can it still be more economical in space than Subversion which saves only the latest version of the code?

I know this can be done with compression, but that should come at the cost of speed. Yet the tutorial also says that Git is much faster (though its biggest gain comes from the fact that most of its operations are offline).

So, my guess is that

  • Git compresses data extensively
  • It is still faster because decompression + work is still faster than network_fetch + work

Am I correct? Even close?

  1. Editorial Team
    Added an answer on May 15, 2026 at 12:23 am

    I assume you are asking how it is possible for a git clone (full repository + checkout) to be smaller than checked-out sources in Subversion. Or did you mean something else?

    This question is answered in the comments


    Repository size

    First, you should take into account that alongside the checkout (working copy), Subversion stores a pristine copy (the last version) in those .svn subdirectories. The pristine copy is stored uncompressed in Subversion.

    Second, git uses the following techniques to make repository smaller:

    • each version of a file is stored only once; this means that if you have only two different versions of some file in 10 revisions (10 commits), git stores only those two versions, not 10.
    • objects (and deltas, see below) are stored compressed; text files used in programming compress really well (to around 60% of the original size, i.e. a 40% reduction in size from compression)
    • after repacking, objects are stored in deltified form, as a difference from some other version; additionally git tries to order delta chains in such a way that the delta consists mainly of deletions (in the usual case of growing files it is in recency order); IIRC deltas are compressed as well.
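The first two points above can be sketched in a few lines of Python. This is only a toy model: `ToyObjectStore` and its methods are hypothetical names invented for illustration, and it covers loose blob objects only (no deltas, no packfiles). It relies on two facts about Git: a blob's ID is the SHA-1 of a `blob <size>\0` header followed by the content, and loose objects are stored zlib-compressed.

```python
import hashlib
import zlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


class ToyObjectStore:
    """Hypothetical in-memory stand-in for .git/objects (loose blobs only)."""

    def __init__(self):
        self.objects = {}  # object ID -> zlib-compressed "header + content"

    def put(self, content: bytes) -> str:
        oid = blob_id(content)
        # Identical content hashes to the same ID, so it is stored only once.
        if oid not in self.objects:
            header = b"blob %d\0" % len(content)
            self.objects[oid] = zlib.compress(header + content)
        return oid

    def get(self, oid: str) -> bytes:
        raw = zlib.decompress(self.objects[oid])
        _header, _, body = raw.partition(b"\0")
        return body


store = ToyObjectStore()
v1 = b"int main(void) { return 0; }\n" * 50
v2 = v1 + b"/* one more line */\n"
# Ten commits, but the file only ever has two distinct contents.
history = [v1, v1, v1, v2, v2, v1, v2, v2, v2, v2]
ids = [store.put(content) for content in history]

print(len(history), "revisions ->", len(store.objects), "stored objects")
print(len(v1), "bytes raw ->", len(store.objects[ids[0]]), "bytes compressed")
```

The sample content here is artificially repetitive, so it compresses far better than real source code would; the point is only that ten revisions with two distinct contents end up as two stored objects.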

    Performance (speed of operations)

    First, any operation that involves the network will be much slower than a local operation. Therefore, for example, comparing the current state of the working area with some other version, or getting a log (a history), involves a network connection and network transfer in Subversion but is a local operation in Git, so it will of course be much slower in Subversion than in Git. By the way, this is a difference between centralized version control systems (using a client-server workflow) and distributed version control systems (using a peer-to-peer workflow) in general, not only between Subversion and Git.

    Second, if I understand it correctly, nowadays the bottleneck is not CPU but IO (disk access). Therefore it is possible that the gain from having to read less data from disk because of compression (and being able to mmap it in memory) outweighs the loss from having to decompress the data.

    Third, Git was designed with performance in mind (see e.g. GitHistory page on Git Wiki):

    • The index stores stat information for files, and Git uses it to decide without examining files if the files were modified or not (see e.g. core.trustctime config variable).
    • The maximum delta depth is limited to pack.depth, which defaults to 50. Git has a delta cache to speed up access. There is a (generated) packfile index for fast access to objects in a packfile.
    • Git takes care not to touch files it doesn’t have to. For example, when switching branches or rewinding to another version, Git updates only the files that changed. A consequence of this philosophy is that Git supports only very minimal keyword expansion (at least out of the box).
    • Git uses its own version of the LibXDiff library, nowadays also for diff and merge, instead of calling an external diff / external merge tool.
    • Git tries to minimize latency, which means good perceived performance. For example, it outputs the first page of “git log” as fast as possible, and you see it almost immediately, even if generating the full history would take more time; it doesn’t wait for the full history to be generated before displaying it.
    • When fetching new changes, Git checks what objects you have in common with the server, and sends only the (compressed) differences in the form of a thin packfile. Admittedly, Subversion can also send only differences when updating (and perhaps does so by default).
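The stat-based check described in the first bullet of this list can be illustrated with a rough sketch. It assumes a much simpler cache than Git's real index, which records more fields per file; `ToyIndex` and its methods are hypothetical names, and only the modification time and size are compared.

```python
import os
import tempfile


class ToyIndex:
    """Hypothetical, simplified stat cache; Git's real index records more fields."""

    def __init__(self):
        self.entries = {}  # path -> (mtime in nanoseconds, size in bytes)

    def record(self, path):
        st = os.stat(path)
        self.entries[path] = (st.st_mtime_ns, st.st_size)

    def maybe_modified(self, path):
        # Only metadata is consulted; the file content is never read.
        st = os.stat(path)
        return self.entries.get(path) != (st.st_mtime_ns, st.st_size)


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "hello.c")
    with open(path, "w") as f:
        f.write("int main(void) { return 0; }\n")

    index = ToyIndex()
    index.record(path)
    unchanged = index.maybe_modified(path)  # stat data still matches

    with open(path, "a") as f:
        f.write("/* edited */\n")
    changed = index.maybe_modified(path)  # size (and mtime) now differ

    print("before edit:", unchanged, "after edit:", changed)
```

The design point is that a single `stat` call per file is far cheaper than reading and hashing its content, so a status check over a large tree stays fast when few files have changed.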

    I am not a Git hacker, and I have probably missed some techniques and tricks that Git uses for better performance. Note, however, that Git relies heavily on POSIX features (such as memory-mapped files) for that, so the gain might not be as large on MS Windows.
