I’ve been stuck on finding a reg expression to split the html element into

Question

Asked: June 8, 2026 (2026-06-08T23:46:31+00:00)

I’ve been stuck on finding a regular expression to split the HTML element into two sections. The first would be the price and the second the number of downloads. Here is my HTML and the regular expressions I tried. I’m using a scraper program, so I can’t use JavaScript or jQuery.

HTML:

<h2>$850 / 3Downloads - Software Name</h2>

Regular expression used for Marker Before:

/$\/\s*/

Regular expression used for Marker After:

/\/\

This should return 850 only, with no dollar sign. I’m stuck on how to start and end the number of downloads. I need another set of Before and After regexes to pull the number of downloads, excluding the word "downloads".

The program I’m using is the OutWit Hub scraper (link to docs).

1 Answer

Editorial Team · Answer 1 · 2026-06-08T23:46:33+00:00

If there are no other nested tags inside the <h2> (which are more complicated to account for), two () capture groups separated by / should do it:

/<h2>\s*\$(\d+)\s*\/\s*(\d+)\s*Downloads.+?<\/h2>/

This breaks down as <h2>, optional whitespace \s*, $, some number of digits (\d+) to capture, more optional whitespace on either side of /, a group of digits to capture, more optional whitespace before Downloads, any characters (non-greedy) up to the closing </h2>.
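To check the breakdown outside the scraper, here is a minimal Python sketch. Python is only an assumption for testing; the question does not say which regex engine the scraper uses. The sample heading is spelled "Downloads", as the pattern expects, and the /.../ delimiters are dropped because Python does not use them.

```python
import re

# Sample heading from the question, with "Downloads" spelled as the pattern expects.
html = "<h2>$850 / 3Downloads - Software Name</h2>"

# Same pattern as above; "/" needs no escaping in Python.
pattern = re.compile(r"<h2>\s*\$(\d+)\s*/\s*(\d+)\s*Downloads.+?</h2>")

match = pattern.search(html)
if match:
    # Group 1 is the price without the dollar sign, group 2 the download count.
    price, downloads = match.groups()
    print(price)      # 850
    print(downloads)  # 3
```

Both values come back as strings, so convert them with int() if you need numbers.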

If the price part may also include , or . then the (\d+) group can be replaced by ([0-9,.]+) (or something more specific, for example to make sure it doesn’t start with a comma, if necessary):

/<h2>\s*\$([0-9,.]+)\s*\/\s*(\d+)\s*Downloads.+?<\/h2>/
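As a rough illustration of the wider price group, the Python sketch below runs it against a made-up heading with a thousands separator and cents. The heading text and the stricter `strict` variant (which requires the price to start with a digit) are assumptions for demonstration, not something taken from the question.

```python
import re

# Price group widened to accept digits, commas, and periods.
pattern = re.compile(r"<h2>\s*\$([0-9,.]+)\s*/\s*(\d+)\s*Downloads.+?</h2>")

# Hypothetical stricter variant: the price must begin with a digit.
strict = re.compile(r"<h2>\s*\$([0-9][0-9,.]*)\s*/\s*(\d+)\s*Downloads.+?</h2>")

# Hypothetical heading, not from the question.
html = "<h2>$1,850.50 / 12 Downloads - Software Name</h2>"

price_text, downloads_text = pattern.search(html).groups()

# Strip the thousands separator before converting to a number.
price = float(price_text.replace(",", ""))
downloads = int(downloads_text)

# The strict variant rejects a price that starts with a comma.
bad = "<h2>$,850 / 3 Downloads - Software Name</h2>"
strict_rejects_bad = strict.search(bad) is None
```

The character class does not validate the number format, so a string like "1..2" would still be captured. Tighten it if your input can be that messy.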

The usual warnings about using regular expressions to parse HTML apply here. This will only work successfully if your HTML input is rather predictable, with no nesting of tags inside the <h2>.
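To make the limitation concrete, this short Python sketch shows the same pattern failing once a tag is nested inside the <h2>. The `<span>` markup is a hypothetical example, not taken from the question.

```python
import re

pattern = re.compile(r"<h2>\s*\$(\d+)\s*/\s*(\d+)\s*Downloads.+?</h2>")

# Flat heading, as in the question: the pattern matches.
plain = "<h2>$850 / 3Downloads - Software Name</h2>"

# Hypothetical heading with the price wrapped in a nested tag.
nested = "<h2><span>$850</span> / 3Downloads - Software Name</h2>"

plain_matches = pattern.search(plain) is not None
nested_matches = pattern.search(nested) is not None

print(plain_matches)   # True
print(nested_matches)  # False
```

The nested case fails because the pattern expects the dollar sign right after the opening <h2> and any whitespace, and it finds <span> there instead.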
