I have (from the sed website http://sed.sourceforge.net/sed1line.txt) this one-liner:
sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;/BBB/!d;/CCC/!d'
Its purpose is to print every paragraph that contains AAA, BBB and CCC (in any order).
My understanding of the script:
- ‘/./’ matches every line which is not empty
- ‘{}’ all commands within the brackets handle the matched lines
- ‘H’ appends the holdspace with the matched lines
- ‘$!d’ delete from patternspace everything but the last line
- ‘x’ swaps the pattern- and holdspace
- ‘/AAA/!d’ search for AAA paragraph and print it
What is not clear to me:
- The holdspace should contain several separate lines (one for each line of the paragraph), so why am I able to search the whole paragraph? Are the lines in the holdspace merged into one line?
- And how does sed know where one paragraph ends and the next begins in the holdspace?
- Why do I have to append ‘$!d’, why isn't ‘$d’ sufficient? Why can't I omit the ‘-n’ and use ‘$p’ instead of ‘$!d’ in this case?
Thank you very much for every comment!
My test data (match every paragraph with XX in it):
YYaaaa
aaa1
aaa2
aXX3
aaa4
YYbbbb
bbb1
bbb2
YYcccc
ccc1
ccc2
ccc3
cXX4
ccc5
YYdddd
ddd1
dXX2
Following command is used:
sed -ne '/./{H;$!d};x;/XX/p' test2
Versions:
$ sed --version
GNU sed-Version 4.2.1
$ bash --version
GNU bash, Version 4.2.10(1)-release (x86_64-pc-linux-gnu)
It collects a paragraph as individual lines into the hold space (‘H’). When you hit an empty line, ‘/./’ fails and the script falls through to the ‘x’, which swaps the collected paragraph into the pattern space and leaves only the empty line in the hold space, effectively resetting it for the next paragraph.
In order to correctly handle the final paragraph, it needs to cope with a paragraph which is not followed by an empty line; therefore it falls through from the last line as if it were followed by an empty line. This is a common idiom for scripts which collect something up through a particular pattern (or, to put it differently, it’s a common error for such scripts to fail to handle the last collected data at end of file).
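As a rough illustration of why the last line has to fall through, the sketch below runs the same script twice on a made-up two-paragraph sample (the `sample` helper and its contents are invented for this example, not taken from the question's file): once with ‘$!d’ and once with a plain ‘d’.

```shell
# Made-up sample: two paragraphs separated by one blank line,
# with no blank line after the final paragraph.
sample() { printf 'a1\naXX\n\nb1\nbXX\n'; }

# With $!d the last line is not deleted, so it falls through to x
# and the final paragraph gets tested as well.
with_fallthrough=$(sample | sed -n -e '/./{H;$!d;}' -e 'x;/XX/p')

# With a plain d the last line is deleted like every other non-empty line,
# so the final paragraph stays in the hold space and is never examined.
without_fallthrough=$(sample | sed -n -e '/./{H;d;}' -e 'x;/XX/p')

printf '%s\n---\n%s\n' "$with_fallthrough" "$without_fallthrough"
```

The first run should print both paragraphs, the second only the first one. Each printed paragraph starts with an empty line because ‘H’ always adds a newline before the text it appends.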
So in other words, if we are looking at a non-empty line, add it to the hold space, and unless it’s the last line in the file, delete it and start over from the beginning of the script with the next input line. (Perhaps your understanding of ‘d’ was not complete? This is what ‘$!d’ means.)
Otherwise, we have an empty line, or end of file, and the hold space contains zero or more lines of text (one paragraph, possibly empty). Exchange them into the pattern space (the current, empty, line conveniently moves to the hold space) and examine the pattern space. If it fails to match one of our expressions, delete it. Otherwise, the default action is to print the entire pattern space.
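One way to check what actually arrives in the pattern space after the exchange is to replace ‘p’ with the ‘l’ command, which prints the buffer unambiguously. This is only a sketch on invented sample input, but it suggests that the paragraph is a single buffer with embedded newlines rather than several separate lines, which is why one regex can search the whole paragraph.

```shell
# l shows embedded newlines as \n and marks the end of the buffer with $.
# The input is made up: one paragraph containing XX, one without.
dump=$(printf 'a1\naXX\n\nb1\nb2\n' | sed -n -e '/./{H;$!d;}' -e 'x;/XX/l')

printf '%s\n' "$dump"
# prints: \na1\naXX$
```

The paragraph boundary is not stored anywhere in the hold space. The hold space only ever contains the paragraph currently being collected, because every empty line (or the last line) triggers the ‘x’ that empties it again.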