For some reason I need to have one HTML tag per line. So if the following is the input:
<p><div class="class1 <%= "class3" %>class2">div content</div></p>
Output should be:
<p>
<div class="class1 <%= "class3" %>class2">div content
</div>
</p>
The regular expression should be able to tell the difference between an ERB script tag and an HTML tag. Indentation is not needed.
How can this be done with a regular expression?
You can replace (?=<[\w/]) with \n . This is a lookahead that matches the position before a < sign that is followed by a word character (letter, digit, or underscore) or a slash. Another option is (?=<(?!%)) , which matches before any < that is not followed by % .
This works for your posted code, but fails in quite a few scenarios, notably < in attributes, or < in server-side scripts and JavaScript blocks. If you need anything more complex, you may need a stronger solution, like an ERB parser.
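Here is a minimal sketch of that replacement in Ruby, assuming the template is available as a plain string. The method name one_tag_per_line is made up for illustration. Because the lookahead also matches at the very start of the input, the sketch drops the leading newline that would otherwise be inserted before the first tag; that step is my addition, not part of the regex itself.

```ruby
# Hypothetical helper (name chosen for illustration only).
def one_tag_per_line(html)
  # Zero-width lookahead: matches the position before a "<" that is
  # followed by a word character or "/". "<%" is skipped because "%"
  # is neither, so ERB tags stay on the line they were on.
  with_breaks = html.gsub(%r{(?=<[\w/])}, "\n")

  # The lookahead also matches at position 0, so remove the newline
  # inserted before the very first tag.
  with_breaks.sub(/\A\n/, "")
end

input = '<p><div class="class1 <%= "class3" %>class2">div content</div></p>'
puts one_tag_per_line(input)

# Known limitation: a "<" inside an attribute value is split as well.
broken = one_tag_per_line('<a title="x<b">link</a>')
```

For the posted input this prints the four lines from the question. The last call shows one of the failure cases: the attribute value containing <b is broken across two lines, which is why a real parser is the safer choice for arbitrary templates.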