I am trying to get the “date” from the second cell in the table using regex,
but it is not matching, and I really can’t find out why.
my $str = '"
<td class="fieldLabel" height="18">Activation Date:</td>
<td class="dataEntry" height="18">
10/27/2011
</td>';
if ( $str =~ /Activation Date.*<td.*>(.*)</gm ) {
print "matched: ".$1;
}else{
print "mismatched!";
}
Others have already pointed out that you want the /s option to make . match a newline, so that .* can cross logical line boundaries. You might also want the non-greedy .*? so that each wildcard stops at the first place where the rest of the pattern can match.

(2020 Update) But I’d use Mojo::DOM and CSS Selectors to get the date. The particular selector may depend on the complete HTML source, but the idea is the same:
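A minimal sketch of the Mojo::DOM idea, assuming the two cells sit next to each other in one row. The surrounding table and tr tags and the selector td.fieldLabel + td.dataEntry are my guesses at the real page, so adjust them to the actual markup:

```perl
use strict;
use warnings;
use Mojo::DOM;

# Sample markup; the <table>/<tr> wrapper is assumed, not taken from the real page.
my $html = '<table><tr>
<td class="fieldLabel" height="18">Activation Date:</td>
<td class="dataEntry" height="18">
10/27/2011
</td>
</tr></table>';

my $dom = Mojo::DOM->new($html);

# "A + B" selects a B element that immediately follows a sibling A element.
my $cell = $dom->at('td.fieldLabel + td.dataEntry');

my $date = $cell ? $cell->text : undef;

# Strip the newlines that surround the date inside the cell.
$date =~ s/^\s+|\s+$//g if defined $date;

print defined $date ? "matched: $date\n" : "mismatched!\n";
```

If the page has several label/value pairs, this selector returns only the first one, so you would need something more specific to the row you want.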
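If you do stay with the regex, here is a minimal sketch of the /s plus non-greedy fix. The \s* around the capture is my own addition to trim the newlines around the date, and I dropped the stray double quote at the start of the string. I also left out /m and /g: /m only changes what ^ and $ match, and /g is not needed for a single match.

```perl
use strict;
use warnings;

my $str = '
<td class="fieldLabel" height="18">Activation Date:</td>
<td class="dataEntry" height="18">
10/27/2011
</td>';

# /s lets . match newlines, so .*? can run from the label cell into the next line.
# Each .*? is non-greedy, so it stops at the first point where the rest can match.
my ($date) = $str =~ m{Activation Date.*?<td.*?>\s*(.*?)\s*<}s;

print defined $date ? "matched: $date\n" : "mismatched!\n";
```

This is still fragile: it depends on the cell order and on there being no nested tags inside the data cell.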
If you have the complete table, it’s easier to use something that knows how to parse tables. Let a module such as HTML::TableParser handle all of the details:
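A rough sketch with HTML::TableParser, assuming a complete table. The sample markup, including its header row, is invented for illustration, and the DEFAULT id is used here on the assumption that any table on the page should be processed:

```perl
use strict;
use warnings;
use HTML::TableParser;

# Invented sample table; the header row is an assumption about the real page.
my $html = '<table>
<tr><th>Field</th><th>Value</th></tr>
<tr>
<td class="fieldLabel" height="18">Activation Date:</td>
<td class="dataEntry" height="18">
10/27/2011
</td>
</tr>
</table>';

my $date;

my @reqs = (
    {
        id  => 'DEFAULT',    # match any table rather than a specific id
        row => sub {
            # The row callback gets the table id, line number, cells, and user data.
            my ( $id, $line, $cols, $udata ) = @_;
            return unless defined $cols->[0] && $cols->[0] =~ /Activation Date/;
            $date = $cols->[1];
        },
    },
);

# Trim and Chomp ask the parser to strip surrounding whitespace from each cell.
my $p = HTML::TableParser->new( \@reqs, { Decode => 1, Trim => 1, Chomp => 1 } );
$p->parse($html);
$p->eof;

print defined $date ? "matched: $date\n" : "mismatched!\n";
```

The callback runs once per data row, so the same structure extends to collecting every label/value pair into a hash.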
There’s also HTML::TableExtract:
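A minimal sketch with HTML::TableExtract, again assuming the cells live inside a real table (the table and tr wrapper in the sample is my addition). With no constraints passed to new, it extracts every table it finds:

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Sample markup; the <table>/<tr> wrapper is assumed, not taken from the real page.
my $html = '<table><tr>
<td class="fieldLabel" height="18">Activation Date:</td>
<td class="dataEntry" height="18">
10/27/2011
</td>
</tr></table>';

my $te = HTML::TableExtract->new;    # no headers/depth/count: take all tables
$te->parse($html);
$te->eof;

my $date;
TABLE: for my $table ( $te->tables ) {
    for my $row ( $table->rows ) {
        # Empty cells can come back undefined, so normalize them first.
        my ( $label, $value ) = map { defined $_ ? $_ : '' } @$row;
        next unless $label =~ /Activation Date/;
        ( $date = $value ) =~ s/^\s+|\s+$//g;    # trim surrounding newlines
        last TABLE;
    }
}

print defined $date ? "matched: $date\n" : "mismatched!\n";
```

If the real table has header cells, you can pass headers to new to pick the table and columns by name instead of scanning every row.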