I have:
%{ lorem ipsum dolor
sit %{hello
world}%
amet}%
I want:
hello
world
That is, I want to keep only the contents of the innermost %{...}% out of any number of nested %{...}% blocks, which may or may not span multiple lines.
Is there a sed or awk way?
This sed command:

will gather the entirety of the input into the pattern space, then remove ...%{ (taking care to ensure that the ... doesn’t contain }%) and }%... (taking care to ensure that the ... doesn’t contain %{), and then print the result. So it’s suitable for the case where you need just one block. The case with multiple blocks is trickier, but I’ll think about it further, and update this answer if I get that working well.

Note that -r (to support Extended Regular Expressions instead of Basic ones) is a GNU extension to sed, so if you’re using a non-GNU sed that doesn’t support it, let me know.

Edited to add: O.K., here’s a version that supports multiple blocks:
It uses essentially the same approach as the previous one, except that it only removes ...%{ at start-of-input and }%... at end-of-input, and that after it’s done that, it proceeds to remove all instances of }%...%{ that do not contain %{...}%, replacing them with a newline.
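
For illustration, here is a minimal sketch of both steps as described above. It is a reconstruction from the prose, not necessarily the exact commands from this answer, and it assumes GNU sed. The names no_close, no_open, inner_single and inner_multi are made up for this example.

```shell
#!/bin/sh
# Sketch only: reconstructed from the description, assuming GNU sed (-r).
# All variable and function names here are hypothetical.

# ERE fragments:
no_close='([^}]|\}+[^}%])*\}*'   # a run of text containing no "}%"
no_open='([^%]|%+[^%{])*%*'      # a run of text containing no "%{"

# Single block: gather all input into the pattern space, then strip
# "...%{" from the start and "}%..." from the end.
inner_single() {
  sed -r -e ':a' -e '$!N' -e '$!ba' \
      -e "s/^${no_close}%\{//" \
      -e "s/\}%${no_open}\$//"
}

# Multiple blocks: same trimming at both ends, then replace every
# remaining "}%...%{" gap that holds no complete "%{...}%" with a newline.
inner_multi() {
  sed -r -e ':a' -e '$!N' -e '$!ba' \
      -e "s/^${no_close}%\{//" \
      -e "s/\}%${no_open}\$//" \
      -e "s/\}%${no_open}${no_close}%\{/\n/g"
}

# Usage with the input from the question:
printf '%%{ lorem ipsum dolor\nsit %%{hello\nworld}%%\namet}%%\n' | inner_single
# prints:
# hello
# world
```

The gap pattern in inner_multi is "text with no %{" followed by "text with no }%", which is one way to say that the gap never opens a block and then closes it again. Input where the delimiters touch (such as }%{) is an edge case this sketch does not try to handle.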