I am new to scripting and was trying to learn how to extract any text that exists between two different patterns. However, I am still not able to figure out how to extract text between two patterns in the following scenario:
If I have my input file reading:
Hi I would like
to print text
between these
patterns
and my expected output is like:
I would like
to print text
between these
i.e. my first search pattern is "Hi". I want to skip the pattern itself but print everything that follows it on the same line. My second search pattern is "patterns", and I would like to avoid printing that line and any lines after it.
I tried the following:
sed -n '/Hi/,/patterns/p' test.txt
[output]
Hi I would like
to print text
between these
patterns
Next, I tried:
`awk ' /'"Hi"'/ {flag=1;next} /'"pattern"'/{flag=0} flag { print }' test.txt`
[output]
to print text
between these
Can someone help me out in identifying how to achieve this?
Thanks in advance
You have the right idea, a mini-state-machine in `awk`, but you need some slight mods as per the following transcript:
Or, in compressed form:
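A minimal sketch of both forms, reconstructed from the explanation that follows rather than copied from the original transcript. The file name `test.txt` comes from the question; the wrapper function names `long_form` and `short_form` are hypothetical and only there for illustration, and `echo` is an awk variable here, not the shell command:

```shell
# Recreate the sample input from the question.
printf '%s\n' 'Hi I would like' 'to print text' 'between these' 'patterns' > test.txt

# Long form (function name is illustrative only).
long_form() {
    awk '
        /patterns/ { echo = 0 }                       # end marker: switch echoing off
        /Hi /      { gsub(/^.*Hi /, ""); echo = 1 }   # start marker: drop text through "Hi ", switch echoing on
        echo       { print }                          # print the (possibly modified) line while echoing is on
    ' "$1"
}

# Compressed form of the same program (function name is illustrative only).
short_form() {
    awk '/patterns/{e=0} /Hi /{gsub(/^.*Hi /,"");e=1} e' "$1"
}

long_form test.txt
short_form test.txt
```

The `/patterns/` rule is placed first so that the end-marker line switches echoing off before the print rule sees it.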
The output of that is:
I would like
to print text
between these
as requested.
The way this works is as follows. The `echo` variable is initially `0`, meaning that no echoing will take place.
Each line is checked in turn. If it contains `patterns`, echoing is disabled.
If it contains `Hi` followed by a space, echoing is turned on and `gsub` is used to modify the line to get rid of everything up to the `Hi`.
Then, regardless, the line (possibly modified) is echoed when the `echo` flag is on.
Now, there's going to be edge cases such as:
Hi; or
patterns.
You haven't specified how they should be handled so I didn't bother, but the basic concept should be the same.
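As one possible way to pin those edge cases down, here is a sketch built on the same state-machine idea. Everything about the policy is an assumption, since the question does not specify it: `Hi` counts as a start marker when it is followed by a non-word character or the end of the line (no check is made on what precedes it), only the first start marker is stripped, and processing stops for good at the first line containing `patterns` once echoing is on. The function name `between_edge` and the file `edge.txt` are hypothetical:

```shell
# Sample inputs: the original file from the question plus a made-up edge-case file.
printf '%s\n' 'Hi I would like' 'to print text' 'between these' 'patterns' > test.txt
printf '%s\n' 'preamble' 'Hi' 'first line' 'Hi; again' 'patterns.' 'after' > edge.txt

between_edge() {
    awk '
        echo && /patterns/ { exit }                  # assumed: stop for good at the first end marker
        !echo && match($0, /Hi([^A-Za-z0-9_]|$)/) {  # assumed: "Hi" followed by a non-word character or end of line
            $0 = substr($0, RSTART + RLENGTH)        # keep only what follows the marker
            sub(/^[ \t]+/, "")                       # drop leading blanks left behind, e.g. after "Hi;"
            echo = 1
            if ($0 == "") next                       # nothing after the marker: print nothing for this line
        }
        echo { print }                               # print while echoing is on
    ' "$1"
}

between_edge test.txt
between_edge edge.txt
```

With a different policy (for example, allowing several start/end ranges in one file), the `exit` would go back to being `echo = 0` as in the basic version.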