As a recent question hinted, I’m looking for a way to speed up operations on a Git repository with a very large number of files (~6 million). I’d rather not use submodules. The problem is that operations are pretty slow. Is it possible to have one large repository but instruct Git to focus on only a portion of it? I thought that creating a sparse-checkout might do it, but the read-tree operation seems to delete files not specified in the sparse-checkout file, and it takes a really long time. Is it possible to run read-tree so that it leaves all the files where they are and takes time proportional only to the number of files specified in the sparse-checkout file?
Not currently, no. Git added sparse checkout support only recently (in 1.7), and it’s still fairly bare-bones, mostly because Git wasn’t really designed to work with only part of a repository.
It was designed more as a one-repository-per-project version control system. Submodules were the method chosen to handle “projects” made up of many large subcomponents.
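For reference, here is a minimal sketch of the sparse-checkout setup the question describes, run against a throwaway repository. The directory names (`docs`, `src`), the file names, and the demo committer identity are made up for illustration. This only reproduces the behaviour being asked about (read-tree removing paths that aren't listed); it is not a way around it.

```shell
#!/bin/sh
set -e

# Throwaway demo repository (hypothetical layout: two top-level directories).
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p docs src
echo "readme" > docs/readme.txt
echo "code" > src/main.c
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# Turn on sparse checkout and list the paths to keep in the working tree.
git config core.sparseCheckout true
mkdir -p .git/info
echo "docs/" > .git/info/sparse-checkout

# Re-read HEAD into the index and update the working tree.
# Paths that don't match the patterns are removed from the working tree
# but remain tracked in the index.
git read-tree -mu HEAD

ls
```

After this runs, only `docs/` is left on disk. `src/main.c` is still in the index and in history, but it has been removed from the working tree, which is the deletion the question mentions.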