How would I remove the anchor tag but keep the anchor text in Bash? So I want to remove everything except the words Example text.
<a href="http://example.com">Example text</a>
So if I do:
echo '<a href="http://example.com">Example text</a>' | sed -e 's/<[^>]*>//g'
That removes all HTML. I’m looking to remove just anchor tags but also retain the anchor text, “Example text” in this case.
You could use the following command:
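As a minimal sketch of what such a command might look like (the helper name anchor_text is hypothetical, and the pattern assumes a sed that supports -E and anchor text with no nested tags), a capture group can keep the text between the opening and closing anchor tags:

```shell
# Hypothetical helper: replace each <a ...>text</a> with just "text".
# Assumes the anchor text itself contains no "<" character.
anchor_text() {
  sed -E 's/<a( [^>]*)?>([^<]*)<\/a>/\2/g'
}

echo '<a href="http://example.com">Example text</a>' | anchor_text
```

The `( [^>]*)?` part makes the attributes optional while still requiring a space or `>` after the `a`, so tags such as `<abbr>` are not matched by accident.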
Or, alternatively, you could use perl instead of sed, since a non-greedy regular expression would be helpful here:
Note: Using regex to parse HTML is discouraged, but for this small task I’d say it’s fine to stick to the tools available in the command line.
Edit: To remove just anchor tags, the regular expression can be updated as follows:
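One way this could look, as a sketch only (the helper name strip_anchor_tags is hypothetical, and a sed with -E support is assumed), is to delete the opening and closing anchor tags themselves and leave every other tag untouched:

```shell
# Hypothetical helper: delete <a ...> and </a> tags only; other HTML stays.
strip_anchor_tags() {
  sed -E 's/<\/?[aA]( [^>]*)?>//g'
}

echo '<p>See <a href="http://example.com">Example text</a>.</p>' | strip_anchor_tags
```

Because the pattern removes the tags rather than capturing the text between them, it also copes with anchors whose text contains other inline markup.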