I am trying to parse an HTML file (a non-strict one) using JavaScript.
My output should be the same HTML file, but I need to process the internal content of every <script></script> tag. I have a method processScript(script) that does that.
I can assume that there will be no <script/> tags.
I have a pretty clear idea of how to do it using just split(), but I wonder if I can do it better using regex?
Parsing HTML with regex is generally not the best way to do it. Look into DOM parsing instead, using methods like getElementsByTagName('script'). I'd also suggest looking at the w3schools examples on HTML DOM objects to get you started in the right direction. There are several reasons why this is a better approach, among them: 1) JavaScript already has DOM support built in, and it is much easier to use than regex, and 2) the language of matched open/close tags (like matched parens or brackets) is not a regular language, so a regular expression cannot recognize it in general.
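To make the DOM suggestion concrete, here is a minimal sketch rather than a definitive implementation. It assumes the code runs in a browser (DOMParser is a browser API), that processScript takes the script text and returns the new text as described in the question, and the names processScriptElements and rewriteHtml are made up for illustration.

```javascript
// Hypothetical helper: apply processScript to the text of every <script>
// element in a document. It only needs getElementsByTagName, so any
// Document-like object works.
function processScriptElements(doc, processScript) {
  // getElementsByTagName returns a live collection in browsers,
  // so copy it into an array before changing any nodes.
  const scripts = Array.from(doc.getElementsByTagName('script'));
  for (const el of scripts) {
    // textContent is the raw text between <script> and </script>.
    // For external scripts (src=...) it is usually an empty string.
    el.textContent = processScript(el.textContent);
  }
  return scripts.length; // number of script elements visited
}

// Hypothetical browser-only wrapper: parse, transform, serialize.
function rewriteHtml(html, processScript) {
  // The 'text/html' parser is lenient, which suits non-strict input.
  const doc = new DOMParser().parseFromString(html, 'text/html');
  processScriptElements(doc, processScript);
  // Assumption: a normalized document is acceptable as output.
  return doc.documentElement.outerHTML;
}
```

One caveat: the HTML parser normalizes its input (it adds missing html, head and body elements, and outerHTML does not include the doctype), so the result is equivalent markup rather than a byte-for-byte copy of the original file. If the output has to match the input exactly outside the script tags, this approach may not be enough on its own.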