General question
How can I tell git to also count empty lines in a diff when using git log --stat?
Code example
git clone https://github.com/voldemort/voldemort.git
cd voldemort
git log --numstat -n 1 c21ad76 contrib/hadoop-store-builder/src/java/voldemort/store/readonly/mr/HadoopStoreBuilderReducer.java
git show c21ad76 -- contrib/hadoop-store-builder/src/java/voldemort/store/readonly/mr/HadoopStoreBuilderReducer.java
More details
In the given example, git log --numstat claims that in commit c21ad76 the file HadoopStoreBuilderReducer.java has 25 added and 22 removed lines. If you look more closely at the diff output (git show) for that file, you can see that there are actually 30 added and 25 removed lines, a difference of 5 added and 3 removed lines. On an even closer look, there are 5 empty lines inside the added-lines hunk and 4 empty lines in the removed-lines hunk.
This behavior is the same with git log --shortstat or git log --stat.
It appears to me that empty lines inside a hunk are not counted by git log --numstat.
How can I use git to calculate the number of added and removed lines per commit, including blank lines?
Context
There are several different (valid) patches for the same change. The main difference is the use of context lines. Unified diff usually uses three lines of context before and after each change. Internally, git (sometimes) uses zero lines of context, which can result in different changed lines.
First solution: external tool
As @karl-bielefeldt already described, one can pipe the result of git show into grep -Pc '^\+(?!\+)' or grep -Pc '^-(?!-)'. There is also the tool diffstat, which does exactly that.
Second solution: use different contexts for the patch
The patch output of git show can be configured. With the option -U<n>, the number of context lines n can be specified.
With a context of 0 (-U0), the output matches the internal git log --numstat, because numstat uses a context of 0 lines for calculating the stat. Note that this behaviour is about to change in git version 1.7.7, after which numstat uses 3 lines of context.
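The grep-based counting from the first solution can be combined with the context option from the second. The following is only a minimal sketch: count_changed_lines is a hypothetical helper name (not part of git), and it uses extended-regex equivalents of the lookahead patterns so that grep -P is not required. Like the original patterns, it will miss added or removed lines whose own content starts with + or -.

```shell
# Hypothetical helper: count added and removed lines in a unified diff
# read from stdin. Prints "<added> <removed>".
# Blank added/removed lines (a lone "+" or "-") are counted.
# File headers ("+++ b/..." and "--- a/...") are skipped because their
# second character is again "+" or "-".
count_changed_lines() {
    patch=$(cat)
    # grep -c exits non-zero when the count is 0, hence "|| true"
    added=$(printf '%s\n' "$patch" | grep -cE '^\+([^+]|$)' || true)
    removed=$(printf '%s\n' "$patch" | grep -cE '^-([^-]|$)' || true)
    printf '%s %s\n' "$added" "$removed"
}

# Possible usage inside the cloned repository (path as in the question):
# git show -U0 c21ad76 -- contrib/hadoop-store-builder/src/java/voldemort/store/readonly/mr/HadoopStoreBuilderReducer.java | count_changed_lines
```

Running the same pipeline with -U3 instead of -U0 should show whether the choice of context changes the counts for a given commit.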