In the command line, if I type
git tag --contains {commit}
to obtain a list of releases that contain a given commit, it takes around 11 to 20 seconds for each commit. Since the target code base has more than 300,000 commits, retrieving this information for all of them would take far too long.
However, gitk apparently manages to retrieve this data quickly. From what I found, it uses a cache for that purpose.
I have two questions:
- How can I interpret that cache format?
- Is there a way to obtain a dump from the git command line tool to generate that same information?
You can get this almost directly from git rev-list.
latest.awk:
A sample command:
You can also use --topo-order, and you’ll probably have to weed out unwanted refs in the $1!="commit" logic.
Depending on what kind of transitivity you want and how explicit the listing has to be, accumulating the tags might need a dictionary. Here’s one that gets an explicit listing of all refs for all commits:
all.awk:
all.awk took a few minutes to process the 322K commits in the Linux kernel repo, about a thousand a second or so (lots of duplicate strings and redundant processing), so you’d probably want to rewrite it in C++ if you’re really after the complete cross-product … but I don’t think gitk shows that, only the nearest neighbors, right?
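For illustration, here is a minimal Python sketch of the same dictionary-accumulation idea. It is not the awk script above and not gitk's actual algorithm. It assumes the commit lines come from git rev-list --topo-order --parents --all (children listed before their parents) and the tag lines from git show-ref --tags -d. The names parse_tags, walk and nearest_only are made up for this example.

```python
from collections import defaultdict


def parse_tags(show_ref_lines):
    """Map commit sha -> set of tag names, from `git show-ref --tags -d` lines."""
    direct, peeled = {}, {}
    for line in show_ref_lines:
        if not line.strip():
            continue
        sha, ref = line.split()
        name = ref[len("refs/tags/"):]
        if name.endswith("^{}"):
            peeled[name[:-3]] = sha  # annotated tag: the object it points at
        else:
            direct[name] = sha
    tags = defaultdict(set)
    for name, sha in direct.items():
        # Prefer the dereferenced target when the tag is annotated.
        tags[peeled.get(name, sha)].add(name)
    return dict(tags)


def walk(rev_list_lines, tags, nearest_only=False):
    """Return {sha: set of tag names} for every commit in the listing.

    Each line is "<sha> <parent sha>...", children before parents.
    nearest_only=False: every tag that contains the commit.
    nearest_only=True: a tagged commit passes on only its own tags, so each
    commit ends up with the nearest tagged descendants along each path.
    """
    pending = defaultdict(set)  # tags handed down by children seen so far
    result = {}
    for line in rev_list_lines:
        if not line.strip():
            continue
        sha, *parents = line.split()
        inherited = pending.pop(sha, set())
        own = tags.get(sha, set())
        if nearest_only and own:
            found = set(own)
        else:
            found = inherited | own
        result[sha] = found
        for parent in parents:
            pending[parent] |= found
    return result
```

To run it on a real repository, capture the output of the two git commands (for example with subprocess.run(..., capture_output=True, text=True)) and pass stdout.splitlines() to parse_tags and walk. One pass over the listing replaces one git tag --contains call per commit. The full cross-product still builds a set per commit, so memory use grows with the number of tags.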