I have a file with lines that contain:
<li><b> Some Text:</b> More Text </li>
I want to remove the HTML tags and replace the </b> tag with a dash, so that it becomes:
Some Text:- More Text
I’m trying to use sed, but I can’t find the right regex combination.
If you strictly want to strip all HTML tags, but at the same time replace only the </b> tag with a -, you can chain two simple sed commands with a pipe. This will pass all the file’s contents to the first sed command, which handles replacing the </b> with a -. Then the output of that will be piped to a second sed that replaces all HTML tags with empty strings. The final output will be saved into the new file stripped_file.

Using a similar method as the other answer from @Steve, you could also use sed’s -e option to chain expressions into a single (non-piped) command. By adding -i, you can also read in and replace the contents of your original file without the need for cat or a new file. This will do the replacement just as the chained command above does, but this time it will directly replace the contents of the input file. To save to a new file instead, remove the -i and add > stripped_file to the end (or whatever file name you choose).
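
For reference, here is a minimal sketch of what the two approaches described above could look like. The file name input.html and the exact regular expressions are my own assumptions, not necessarily what the answers used, and the tag-stripping pattern assumes that no attribute value contains a literal > character. The -i form shown is GNU sed syntax.

```shell
# Sample input for the demo; input.html is a placeholder file name.
printf '%s\n' '<li><b> Some Text:</b> More Text </li>' > input.html

# Approach 1: two sed commands chained with a pipe.
# The first sed turns every </b> into a dash,
# the second deletes every remaining <...> tag.
cat input.html | sed 's/<\/b>/-/g' | sed 's/<[^>]*>//g' > stripped_file

# Approach 2: a single sed process with two -e expressions, editing in place.
# -i with no suffix works on GNU sed; BSD/macOS sed expects -i '' instead.
sed -i -e 's/<\/b>/-/g' -e 's/<[^>]*>//g' input.html
```

Both variants leave the whitespace that surrounded the tags in place, so the result for the sample line is " Some Text:- More Text " with a leading and a trailing space. If that matters, a third expression such as -e 's/^ *//' -e 's/ *$//' should trim it.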