I have these files in a directory: y y1 y2 y3
Running this command:
ls y* | xargs -i basename {}|xargs -i sed "s/{}//g"
produces this:
1
2
3
Can someone explain why?! I expected it to produce nothing – running sed four times, once for each file, and removing the file name each time. But actually it looks like it’s applying sed with {} set to the first file, on a list of y1 y2 y3
This is on Solaris 10.
When I try this on my Linux box, I get inconsistent results. Sometimes 123, sometimes (most of the time) 23, sometimes 12. This is a subtle buffering race condition between the rightmost xargs and any of the seds it spawns.

Dissecting the command line:
- ls y* will output 4 lines: y, y1, y2 and y3. Buffering is not relevant here.
- xargs -i basename {} will read them and launch, in sequence, basename y, basename y1, basename y2, basename y3. The output, which in our case is the same as the input, is line-buffered, as each line comes from a different process.
- xargs -i sed "s/{}//g", for each line X it reads (more on that later), launches sed "s/X//g".
- sed "s/X//g" filters out each X it sees in the lines it reads.

Where it gets tricky: the last two commands read input from the same stream. That stream is produced by multiple different processes in sequence. Depending on a multitude of factors (system load, scheduling), the output could come out in very different timing patterns.
Let’s suppose they’re all very fast. Then all four lines might be available for the right xargs to read in a single block. In that case, there would be no input left for any of the seds to read, hence no output at all.

On the other hand, if they were very slow, there might be only one line available for the right xargs on its first read attempt. That line would be “y”. xargs would spawn the first sed as sed "s/y//g", which would consume all remaining input (y1, y2, y3), strip the y’s, and output 1, 2, 3.

Here’s the same explanation again, with more explicit sequencing:

1. basename writes “y”.
2. The right xargs reads “y” and spawns sed s/y//g. xargs now waits for sed to complete.
3. basename writes “y1”; sed reads “y1”, writes “1”.
4. basename writes “y2”; sed reads “y2”, writes “2”.
5. basename writes “y3”; sed reads “y3”, writes “3”.
6. The left xargs is done; sed reads EOF and stops.
7. The right xargs tries to continue, reads EOF and stops.

Not sure about my 12 case. Possibly GNU xargs doesn’t wait for its children to complete before it reads subsequent available input, and snatched the “y3” line from the first sed.

In any case, you just set up a pipeline with multiple concurrent readers on the same writer, which yields mostly nondeterministic results. To be avoided.
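The slow case can be mimicked without depending on timing. This is only a rough sketch: it assumes a POSIX shell whose read builtin takes exactly one line from a pipe, and it uses read as a stand-in for the right xargs picking up only the first line. The function name shared_stdin_demo is made up for the illustration. It shows two readers sharing one stream, not what xargs itself does on any particular system.

```shell
# Stand-in for the shared stream: one writer, two readers taking turns.
# `read` plays the right-hand xargs and takes only the first line;
# sed inherits the same stdin and consumes whatever is left.
shared_stdin_demo() {
    printf '%s\n' y y1 y2 y3 | {
        IFS= read -r first      # first reader gets "y"
        sed "s/$first//g"       # second reader sees y1, y2, y3
    }
}

shared_stdin_demo
```

This should print 1, 2 and 3 on separate lines, matching the Solaris result. As far as I know, recent GNU xargs versions run their commands with stdin redirected from /dev/null unless --arg-file is used, so on such a system the original pipeline may well print nothing at all.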
If you wanted to operate on each of the files, the problem could be avoided by giving sed a filename to read, that is, by appending a final {} to the sed command line.

If what you wanted was a cross-product-type result (strip each file name from each file), you’d need to arrange to have the file list produced as many times as there are files. Plus one for xargs, if you still used that.

Hope this helps.
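For reference, here is how the two alternatives mentioned above might look. Treat it as a sketch under assumptions: the scratch directory, the name=... file contents and the function names per_file and cross_product are invented for the demonstration, and it uses -I {}, the POSIX spelling of the -i option from the question. The cross-product variant uses a plain shell loop instead of xargs, which is just one way to regenerate the file list once per name.

```shell
# Scratch directory with the four files from the question (contents are made up).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
for f in y y1 y2 y3; do
    printf 'name=%s\n' "$f" > "$f"
done

# Per-file operation: the final {} makes sed read the named file,
# so only xargs ever reads from the pipe.
per_file() {
    ls y* | xargs -I {} basename {} | xargs -I {} sed "s/{}//g" {}
}

# Cross-product: produce the file list again for every file name,
# and strip that one name from each line of the list.
cross_product() {
    for name in y*; do
        ls y* | sed "s/$name//g"
    done
}

per_file
cross_product
```

With these contents, per_file should print name= four times, since each file has its own name stripped from it. cross_product prints sixteen lines, four per file name, starting with an empty line followed by 1, 2 and 3.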