I’m trying to parse a web page. Basically it gets stored in a string

Question

Asked: May 26, 2026 (2026-05-26T17:12:28+00:00)

I’m trying to parse a web page.
Basically it gets stored in a string that will look like this:

"[HTML CODE ...]world:[HTML CODE ...]my_number[REST OF HTML_CODE ...]"

Of course, “world:” and “my_number” are part of the HTML code, but I would like to ignore everything before the first occurrence of “world:”. What I need is the first number that appears after the first occurrence of “world:”, keeping in mind that a bunch of HTML code will sit between the two.
I could take a substring of the HTML, but I would like to do this with a single regex if possible.

This is the regular expression I tried to match:

'/(?<=world:)\D+?[0-9]+/'

But this returns all the HTML between “world:” and my number along with the number itself.

Thanks!

1 Answer

Editorial Team · Answer 1 · 2026-05-26T17:12:29+00:00

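I think you were close. The lookbehind keeps “world:” out of the match, but \D+? is still part of it, so the HTML in between comes back together with the number. Putting the digits in a capture group lets you read just the number. I was able to use this on the string you provided.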

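$subject = "[HTML CODE ...]world:[HTML CODE ...]3334[REST OF HTML_CODE ...]";
// Capture the first run of digits after "world:" in a named group.
$pattern = '/world:\D+?(?<my_number>[0-9]+)/';
$matches = array();

// $matches is filled by reference; a call-time "&" is a fatal error since PHP 5.4.
$result = preg_match_all($pattern, $subject, $matches);

print_r($matches);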

Results in:

Array
(
    [0] => Array
        (
            [0] => world:[HTML CODE ...]3334
        )

    [my_number] => Array
        (
            [0] => 3334
        )

    [1] => Array
        (
            [0] => 3334
        )

)
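
If only the first number is needed, a variation on the pattern above may be simpler. This is a minimal sketch, not the original answer's code: the helper name firstNumberAfterWorld is hypothetical, the nullable return type assumes PHP 7.1 or later, and \D*? is swapped in for \D+? on the assumption that the number may directly follow “world:”. It uses preg_match, which stops at the first match, and \K, which drops everything matched so far from the reported match.

```php
<?php
// Hypothetical helper (not from the original answer): returns the first run of
// digits after the first "world:", or null when no such number exists.
function firstNumberAfterWorld(string $html): ?string
{
    // \D*? lazily skips any non-digit characters (newlines included);
    // \K discards "world:" and the skipped text from the reported match,
    // so $m[0] holds only the digits.
    if (preg_match('/world:\D*?\K[0-9]+/', $html, $m) === 1) {
        return $m[0];
    }
    return null;
}

$subject = "[HTML CODE ...]world:[HTML CODE ...]3334[REST OF HTML_CODE ...]";
var_dump(firstNumberAfterWorld($subject)); // string(4) "3334"
```

Because the whole match is already just the number, there is no nested array to dig through. If the HTML between “world:” and the number can itself contain digits (for example in attribute values), this will pick those up first, so the pattern would need tightening for that case.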
