First off, don’t link to the “Don’t parse HTML with Regex” post 🙂
I’ve got the following HTML, which is used to display prices in various currencies, inc and ex tax:
<span id="price_break_12345" name="1">
<span class="price">
<span class="inc" >
<span class="GBP">£25.00</span>
<span class="USD" style="display:none;">$34.31</span>
<span class="EUR" style="display:none;">27.92 €</span>
</span>
<span class="ex" style="display:none;">
<span class="GBP">£20.83</span>
<span class="USD" style="display:none;">$34.31</span>
<span class="EUR" style="display:none;">23.27 €</span>
</span>
</span>
<span style="display:none" class="raw_price">25.000</span>
</span>
An AJAX call returns a single string of HTML, containing multiple copies of the above HTML, with the prices varying. What I’m trying to match with regex is:
- Each block of the above HTML (as mentioned, it occurs multiple times in the return string)
- The value of the name attribute on the outermost span
What I have so far is this:
var price_regex = new RegExp(/(<span([\s\S]*?)><span([\s\S]*?)>([\s\S]*?)<\/span><\/span\>)/gm);
console && console.log(price_regex.exec(product_price));
It matches the first price break once for each price break that occurs (so if there’s name=1, name=5 and name=15, it matches name=1 three times).
Whereabouts am I going wrong?
Thanks in large part to jfriend for making me realise why my regex was matching in a strange way: I needed to loop with while (price_break = regex.exec(string)) instead of just exec’ing it once. With that change, I’ve got it working.
I had a ton of useless ()s which were just clogging up the result set, so stripping them out made things a lot simpler.
The other thing, as mentioned above, was that originally I was just calling exec once, which runs the regex once and returns the first match only (which I mistook for returning 3 copies of the first match, due to the ()s). By looping over them, it keeps evaluating the regex until all the matches have been exhausted, which I assumed it did normally, similar to PHP’s preg_match.
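For anyone landing here later, this is a minimal sketch of the looping approach, not the exact code from the post. The makeBlock helper, the sample prices and the simplified pattern are all assumptions made up for illustration. It assumes each block starts with a span carrying id="price_break_N" and ends with the raw_price span followed by the outer closing tag.

```javascript
// Hypothetical helper that builds a cut-down price block; it stands in for
// the HTML returned by the AJAX call (one currency only, to keep it short).
function makeBlock(name, price) {
  return '<span id="price_break_' + name + '" name="' + name + '">' +
    '<span class="price"><span class="inc">' +
    '<span class="GBP">£' + price + '</span></span></span>' +
    '<span style="display:none" class="raw_price">' + price + '</span>' +
    '</span>';
}

// Assumed sample response: three price breaks in a single string.
var product_price = makeBlock('1', '25.00') +
  makeBlock('5', '22.50') +
  makeBlock('15', '20.00');

// A single exec with the pattern from the question returns ONE array:
// the full match at index 0, then one entry per capture group.
// The outer () repeats the whole match, so entries 0 and 1 are identical.
var original = /(<span([\s\S]*?)><span([\s\S]*?)>([\s\S]*?)<\/span><\/span>)/;
var single = original.exec(product_price);
console.log(single.length);          // 5: full match + 4 groups
console.log(single[0] === single[1]); // true

// Simplified pattern (an assumption): one capture group for the name value,
// and the g flag so that exec advances lastIndex after every match.
var price_regex = /<span id="price_break_\d+" name="(\d+)">[\s\S]*?class="raw_price">[^<]*<\/span>\s*<\/span>/g;

var blocks = [];
var names = [];
var price_break;

// exec returns null once no further match is found, which ends the loop.
while ((price_break = price_regex.exec(product_price)) !== null) {
  blocks.push(price_break[0]); // the whole block of HTML
  names.push(price_break[1]);  // value of the name attribute
}

console.log(names); // [ '1', '5', '15' ]
```

The g flag matters here: without it, lastIndex never advances, so the loop would keep returning the first match forever.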