I’m trying to parse a web page.
Basically it gets stored in a string that will look like this:
"[HTML CODE ...]world:[HTML CODE ...]my_number[REST OF HTML_CODE ...]"
Of course, “world:” and “my_number” are part of the HTML, but I would like to ignore everything before the first occurrence of “world:”. What I need is the first number that appears after the first occurrence of “world:”, keeping in mind that a bunch of HTML will sit between the two.
I could take a substring of the HTML first, but I would like to do it all with a single regex if possible.
This is the regular expression I tried to match:
'/(?<=world:)\D+?[0-9]+/'
But this returns all the HTML between “world:” and my number along with the number itself, when I only want the number.
Thanks!
I think you were close to getting it. I was able to use this on the string you provided.
Results in:
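The general fix is to stop treating the text between “world:” and the digits as part of the result: match it, but capture only the digits. Here is a minimal sketch of that idea, assuming Python's `re` module since the question doesn't name a language; the sample string and the function name `first_number_after_world` are made up for illustration.

```python
import re

# Hypothetical sample page; only "world:" and the digits matter here.
html = '<div>hello</div>world:<span class="x">some text</span><b>12345</b><p>678</p>'

# Match the first "world:", skip any run of non-digit characters
# (\D also matches newlines), then capture the first run of digits.
pattern = re.compile(r"world:\D*(\d+)")

def first_number_after_world(text):
    """Return the first number after the first 'world:' as a string, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None

print(first_number_after_world(html))  # prints 12345
```

Note that any digit inside a tag or attribute (for example the `1` in `<h1>`) counts as the first number, so this only works if the markup between “world:” and the value contains no digits. If the engine is PCRE (the `/.../` delimiters suggest PHP, but that is a guess), `/world:\D*\K\d+/` should give the same digits as the whole match, because `\K` discards everything matched before it.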