Suppose I have a string like this:
<code>Blah blah Blah
enter code here</code>
<code class="lol">enter code here
fghfgh</code>
I want to use javascript to replace all occurences between the <code> tags with a callback function that html encodes it.
This is what I have currently:
function code_parsing(data){
//Dont escape & because we need that... in case we deliberately write them in
var escape_html = function(data, p1, p2, p3, p4) {
return p1.replace(/</g, "<").replace(/>/g, ">").replace(/"/g, """).replace(/'/g, "'");
};
data = data.replace(/<code[^>]*>([\s\S]*?)<\/code>/gm, escape_html);
// \[start\](.*?)\[end\]
return data;
};
This function is unfortunately removing "<code>" tags and replacing them with just the content. I would like to keep the <code> tags with any number of attributes. If I just hardcode the <code> tag back into it, I will lose the attributes.
I know regex isn’t the best tool, but there won’t be any nested elements in it.
You shouldn’t use regular expressions to parse HTML.
That said, you need to capture the content you want to preserve using a parenthetical group and have your replacer append that to the bit you manipulate.
To understand why you shouldn’t use regular expressions to parse HTML, consider what this does to