I have a shell script
find . -name "*.java" -print0 | xargs -0 grep -Lz 'regular_expression'
which outputs file names not matching the regexp in this way:
file1.java
file2.java
...
The way I understand it, this works as follows: find finds the needed files and joins their names with \0. Then xargs splits the output of find on \0 and feeds the names to grep one by one.
Then I wanted to add one more stage and get only the basename of each file. I modified the script:
find . -name "*.java" -print0 | xargs -0 grep -LzZ 'regular_expression' | xargs -0 basename
but got an error. I started investigating and added some temporary output:
find . -name "*.java" -print0 | xargs -0 grep -LzZ 'regular_expression' | xargs -0 echo basename
and got this:
basename ./file1.java ./file2.java ./subdir/file1.java ./subdir/file2.java
So the filenames were not split on \0. I don’t get why they are split when xargs is used with grep but not when xargs is used with basename.
I found a workaround by using -n1 in the latter xargs. But I still don’t understand why I needed it (given that I didn’t use it in the xargs with grep) and what this parameter does.
Hope you can explain to me what -n1 does and why I needed it in the latter usage and didn’t need it in the former with grep.
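The filenames were split on \0. The difference is in the commands you’re running. xargs normally reads its standard input, breaks it into a list (here, by splitting on NUL), and then passes that list as extra arguments to your command. So when you do this:
find . -name "*.java" -print0 | xargs -0 grep -Lz 'regular_expression'
what actually runs is a single grep with every filename appended, along the lines of:
grep -Lz 'regular_expression' ./file1.java ./file2.java ./subdir/file1.java ./subdir/file2.java
Here, -z has no bearing on the filenames: it changes how grep splits the data it reads into records (NUL-terminated rather than newline-terminated), and the filenames reach grep as arguments, not as data.
So, when you add another xargs that runs basename, you get this:
basename ./file1.java ./file2.java ./subdir/file1.java ./subdir/file2.java
But while grep accepts any number of filename arguments, basename by default expects a single name (optionally followed by a suffix to strip), so it fails when handed the whole list.
That’s where -n 1 comes in: it tells xargs to break its list of arguments into chunks (here, of 1) and run the command once per chunk. So what runs now is:
basename ./file1.java
basename ./file2.java
basename ./subdir/file1.java
basename ./subdir/file2.java
and all the output is concatenated onto stdout.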
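To see the batching behaviour in isolation, here is a minimal sketch that needs no Java files or grep at all. The three paths and the names() helper are made up for illustration; printf simply stands in for the NUL-separated output of grep -LZ, and an xargs that supports -0 is assumed.

```shell
# Hypothetical helper: emits three made-up paths, each terminated by NUL,
# mimicking what `grep -LZ` would write to the pipe.
names() { printf '%s\0' ./file1.java ./subdir/file2.java ./subdir/file3.java; }

# Without -n, xargs appends the whole list to a single command line.
batched=$(names | xargs -0 echo basename)
echo "$batched"
# basename ./file1.java ./subdir/file2.java ./subdir/file3.java

# With -n 1, xargs runs the command once per argument.
per_item=$(names | xargs -0 -n 1 echo basename)
echo "$per_item"
# basename ./file1.java
# basename ./subdir/file2.java
# basename ./subdir/file3.java

# Dropping the echo runs basename for real, one path at a time.
stripped=$(names | xargs -0 -n 1 basename)
echo "$stripped"
# file1.java
# file2.java
# file3.java
```

The first two commands differ only in -n 1, which makes it easy to see that the NUL splitting works in both cases and only the number of invocations changes. If your basename supports the -a option (GNU coreutils does), xargs -0 basename -a should also work, since -a lets one basename call accept several names.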