I have several HTML blocks on a page set up like:
<p class="something">
<a href="http://example.com/9999">text 1 2 3</a>
<a href="http://example.com/2346saasdf">text 3 4 5</a>
(9999)
<a href="http://example.com/sad3ws">text 5 6 7random</a>
</p>
I want to get the number that is in the parentheses, parentheses included. I have to admit I’ve never really used regex before; I’ve read about it and seen examples of it, but haven’t used it myself. Anyway, I created this with a little bit of looking around:
<p class="something">(.*?)</p>
That correctly gets the entire <p> block, but again, I just want the (9999) (with parentheses intact). I’m not really sure how to get it.
Assuming that other elements on the page could also have digits in parentheses (but they won’t be included in this exact format), and that the HTML will remain valid and consistent, how can I get it?
I understand this is probably easy for someone who has used regex before, but for the solution, I’d appreciate a little detail on what each character captures so I can learn from it.
Don’t use regex to parse HTML.
Instead, use an HTML parser, then simply read the text (non-tag) content within the desired <p> block.
jQuery is a pretty decent HTML parser, so you can get the desired text stored in a variable x using:
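The original snippet did not survive here, so what follows is only a minimal sketch of the jQuery approach, not necessarily the code that was posted. It assumes jQuery is loaded on the page as `$`; the helper name `textOutsideTags` is made up for illustration. It takes the child nodes of the <p>, keeps only the text nodes (the bits that are not inside an <a>), and trims the surrounding whitespace.

```javascript
// Hypothetical helper: returns the text that sits directly inside the
// matched element, ignoring anything wrapped in child tags such as <a>.
// `$` is passed in and is assumed to be the jQuery function.
function textOutsideTags($, selector) {
  return $(selector)
    .contents()                      // all child nodes, text nodes included
    .filter(function () {
      return this.nodeType === 3;    // 3 means "text node"
    })
    .text()                          // concatenate the remaining text
    .trim();                         // drop the newlines around "(9999)"
}

// In a browser with jQuery loaded:
// var x = textOutsideTags($, 'p.something');
```

With the markup from the question this should leave x holding "(9999)", since that is the only text in the <p> that is not inside a link.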
If you can’t use jQuery to make your life easy for whatever reason, you can use plain JavaScript directly against the DOM:
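As a rough sketch of the plain-DOM version (again, not necessarily the snippet originally posted): walk the element's `childNodes` and collect only the text nodes. The function name `directText` is invented for this example, and the usage line assumes the first `p.something` on the page is the one you want.

```javascript
// Hypothetical helper: gathers the text that is a direct child of `element`,
// skipping child elements such as the <a> tags.
function directText(element) {
  var text = '';
  for (var i = 0; i < element.childNodes.length; i++) {
    var node = element.childNodes[i];
    if (node.nodeType === 3) {       // 3 means "text node"
      text += node.nodeValue;
    }
  }
  return text.trim();                // drop the whitespace between tags
}

// In a browser:
// var x = directText(document.querySelector('p.something'));
```

If there are several such blocks on the page, the same function could be called on each result of `document.querySelectorAll('p.something')`.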