I’d like to extract the content from a large file of table cells using regexp and process the data using PHP.
Here’s the data I would like to match:
<td>Current Value: </td><td>100.178</td>
I tried using this regexp to match and retrieve the text:
preg_match("<td>Current Value: </td><td>(.+?)</td>", $data, $output);
However, I get an “Unknown modifier” warning and my variable $output comes out empty.
How can I accomplish this – and could you give me a brief summary of how the solution works so I can try to understand why my code didn’t?
You need to add delimiters around your regex.
The standard delimiter is /, but you can use other non-alphanumeric characters if you wish (which makes sense here because the regex itself contains slashes). In your case, the regex engine thought you wanted to use angle brackets as delimiters, so it read <td> as the complete pattern and tried to interpret the text after it as modifiers, hence the “Unknown modifier” warning.
One more tip (aside from the canonical exhortation “Thou shalt not parse HTML with regexen”, although I think doing so is perfectly OK in a specific case like this): use ([^<>]+) instead of (.+?). This ensures that your regex will never travel across nested tags, a common source of errors when dealing with markup languages.
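Putting both suggestions together, here is a minimal sketch of what the corrected call could look like. The choice of # as the delimiter is my own assumption (any non-alphanumeric character that does not appear in the pattern would do), and $value is a hypothetical variable name added for illustration; $data and $output are the names from the question.

```php
<?php
// Sample input taken from the question.
$data = '<td>Current Value: </td><td>100.178</td>';

// '#' is used as the delimiter (an assumption, see above) so that the
// slashes in </td> do not need escaping.
// [^<>]+ matches one or more characters that are not angle brackets,
// so the capture group cannot run past the end of the cell.
$pattern = '#<td>Current Value: </td><td>([^<>]+)</td>#';

if (preg_match($pattern, $data, $output) === 1) {
    // $output[0] holds the whole match, $output[1] the first capture group.
    $value = $output[1];
    echo $value, PHP_EOL; // prints 100.178
} else {
    $value = null;
    echo "No match found", PHP_EOL;
}
```

If you would rather keep / as the delimiter, the slash in each closing tag has to be escaped as <\/td>, which is why a different delimiter tends to read better for HTML snippets.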