Given a file with the following contents:
<root>
<a></a>
<b></b>
</root>
The command should output:
<root>
<a></a>
<b></b>
Things I’ve tried using the GNU Win32 port of sed:
Remove the last two lines.
This is fast, but it assumes </root> is the second-to-last line and will produce the wrong output if it’s not.
sed -e '$d' test.xml | sed -e '$d'
Replace every occurrence of </root> with an empty string.
This works, but is slower than the first solution, and will break if there are nested <root> elements (unlikely).
sed -e 's|</root>||' test.xml
The file I’m dealing with can be large, so efficiency is important.
Is there a way to limit sed substitution to the last occurrence in the file? Or is there some other utility that would be faster?
Using Perl with File::ReadBackwards should be very fast (relative, I know, but still…). The perlfaq5 documentation has a topic on going through a file backwards and removing lines. You can use that topic’s code as a starting point and check each line for your pattern.
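If Perl isn’t an option, the same idea of working from the end of the file can be sketched as a shell pipeline. This is not the Perl approach mentioned above, just a rough alternative. It assumes GNU tac and GNU sed (the 0,/regex/ address form is a GNU extension), and remove_last_root is a made-up helper name. It reverses the lines, deletes the first line that contains </root>, then reverses them again, so only the last occurrence in the original file is dropped.

```shell
# Hypothetical helper: print the file without the LAST line containing </root>.
# Assumes GNU tac and GNU sed; the 0,/re/ address is a GNU sed extension.
remove_last_root() {
    # tac reverses the line order, so the last </root> becomes the first one.
    # 0,/<\/root>/ spans from the start through the first matching line,
    # and the inner /<\/root>/d deletes only that matching line.
    tac "$1" | sed '0,/<\/root>/{/<\/root>/d;}' | tac
}
```

Called as `remove_last_root test.xml`, it writes the result to standard output. It still reads the whole file, so it is probably not as fast as truncating the file in place, but earlier lines that happen to contain </root> are left alone.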