I am trying to parse a badly formed HTML table.
A couple of lines from it are:
Food:</b> Yes<b><br>
Pool: </b>Beach<b></b><b><br>
Centre:</b> Yes<b><br>
After spending a lot of time on this with XPath, I think it is probably better to split the above text into lines using preg_split and parse from there.
The pattern I think would work uses:
<\b><\br>*: <\b>
My code is as follows:
$pattern='</b></br>*:</b>';
$pattern=preg_quote($pattern,'#');
$chars = preg_split($pattern, $output);
print_r($chars);
I am getting the following error:
Delimiter must not be alphanumeric or backslash
What am I doing wrong?
Try this:
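A minimal sketch of the fix, assuming # as the delimiter and using the sample lines from the question as $output: the quoted pattern is wrapped in delimiters by hand before it is passed to preg_split.

```php
<?php
// Sample input copied from the question (assumed to be one entry per line).
$output = "Food:</b> Yes<b><br>\nPool: </b>Beach<b></b><b><br>\nCentre:</b> Yes<b><br>";

$pattern = '</b></br>*:</b>';

// preg_quote() only escapes special characters (and the delimiter passed as
// the second argument); the surrounding delimiters have to be added manually.
$pattern = '#' . preg_quote($pattern, '#') . '#';

$chars = preg_split($pattern, $output);
print_r($chars);
```

This gets rid of the delimiter error. Note that after quoting, the pattern is matched literally (the * is no longer a wildcard), so with the sample input nothing is split and the whole string comes back as a single element.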
The preg_quote function just escapes the pattern safely; it doesn’t actually add the delimiters for you.
As other people will surely point out, using regular expressions is not a good way to parse HTML 🙂
Your regular expression is also not going to match what you hope. Here’s a version that will probably work for your input:
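A minimal sketch of such a version, assuming the input is held in $output with one entry per line; the $result array and the loop variable names are illustrative, not taken from the question:

```php
<?php
// Sample input copied from the question.
$output = "Food:</b> Yes<b><br>\nPool: </b>Beach<b></b><b><br>\nCentre:</b> Yes<b><br>";

$result = [];
foreach (explode("\n", $output) as $line) {
    // Strip every tag, including the stray closing ones.
    $text = strip_tags($line);

    // Skip lines that carry no "label: value" pair.
    if (strpos($text, ':') === false) {
        continue;
    }

    // Split on the first colon only, then trim both halves.
    list($key, $value) = explode(':', $text, 2);
    $result[trim($key)] = trim($value);
}

print_r($result);
```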
This removes all the HTML, then splits on the colon, and then removes any surrounding whitespace.