I’m trying to come up with a regex that will match anything that is not a 32-bit integer. My eventual goal is to match lines that are not in the following format:
Integer\tInteger\tInteger\tInteger\tInteger\tInteger\tInteger
(seven 32-bit integers, with a single tab between each pair of integers)
So far I’ve come up with this:
#!/usr/bin/perl
use strict;
use warnings;

while ( my $line = <> ) {
    # Unsigned 32-bit integer (0 to 4294967295), no leading zeros.
    if ( $line =~ /^(429496729[0-5]|42949672[0-8]\d|4294967[01]\d{2}|429496[0-6]\d{3}|42949[0-5]\d{4}|4294[0-8]\d{5}|429[0-3]\d{6}|42[0-8]\d{7}|4[01]\d{8}|[1-3]\d{9}|[1-9]\d{8}|[1-9]\d{7}|[1-9]\d{6}|[1-9]\d{5}|[1-9]\d{4}|[1-9]\d{3}|[1-9]\d{2}|[1-9]\d|\d)$/ ) {
        print "Match at line $.\n";
        print $line;
    }
}
But I can’t even get past the first step of having the regex match a single 32-bit number. (Once I solve that, I can work on getting the tabs to be the way they need to be.)
Am I solving this problem the right way? Any thoughts?
Assuming validation is actually needed, my first approach would be to split on tabs, check the number of fields, and then check each field, but not with a regex. Doing a range check in a regex is silly! (Zero-padding each field with sprintf and then doing a string comparison would avoid overflow problems.)
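As a rough sketch of that approach (not a definitive implementation): the helper names is_uint32 and is_valid_line are made up for illustration, and the code assumes "32-bit integer" means unsigned (0 to 4294967295), written in ASCII digits with no sign and no leading zeros. If signed values are wanted, the pattern and the limit would need to change.

```perl
use strict;
use warnings;

# Assumptions (not stated in the question): "32-bit integer" means unsigned,
# 0 .. 4294967295, ASCII digits only, no sign, no leading zeros.
my $MAX    = '4294967295';
my $FIELDS = 7;

# Hypothetical helper: true if $field is an in-range unsigned 32-bit integer.
sub is_uint32 {
    my ($field) = @_;
    # The regex checks only the shape; the range check happens below.
    return 0 unless $field =~ /\A(?:0|[1-9][0-9]*)\z/;
    return 0 if length($field) > length($MAX);
    # Zero-pad to 10 characters so string order matches numeric order,
    # which sidesteps numeric overflow entirely.
    return sprintf('%010s', $field) le $MAX ? 1 : 0;
}

# Hypothetical helper: true if $line is exactly seven tab-separated uint32s.
sub is_valid_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;    # -1 keeps trailing empty fields
    return 0 unless @fields == $FIELDS;
    for my $field (@fields) {
        return 0 unless is_uint32($field);
    }
    return 1;
}

# Report the sample lines that are not in the expected format.
my @samples = (
    "1\t2\t3\t4\t5\t6\t4294967295\n",    # valid
    "1\t2\t3\t4\t5\t6\t4294967296\n",    # last field out of range
    "1\t2\t3\t4\t5\t6\n",                # only six fields
);
for my $line (@samples) {
    print "Bad line: $line" unless is_valid_line($line);
}
```

In the loop from the question, the same test would read: print "Bad line $.: $line" unless is_valid_line($line). Passing -1 as the limit to split matters here, because without it a line ending in extra tabs would have its trailing empty fields silently dropped.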
Other issues:
\d matches far more than just 0-9. Use /\d/a or /[0-9]/ if you want to match just 0-9.
Is 10.0 an integer? Mathematically speaking, it is. Perl would also store that as an integer.
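To illustrate the point about \d, here is a small check using ARABIC-INDIC DIGIT THREE (U+0663) as an arbitrary example of a non-ASCII digit. It assumes Perl 5.14 or later, since that is when the /a modifier was introduced.

```perl
use strict;
use warnings;

my $arabic_three = "\x{0663}";    # ARABIC-INDIC DIGIT THREE, a Unicode digit

# Plain \d accepts any Unicode decimal digit.
my $plain_d = $arabic_three =~ /\A\d\z/ ? 1 : 0;
# With /a, \d is restricted to the ASCII digits 0-9.
my $ascii_d = $arabic_three =~ /\A\d\z/a ? 1 : 0;
# An explicit character class is ASCII-only on any Perl version.
my $class = $arabic_three =~ /\A[0-9]\z/ ? 1 : 0;

print "\\d: $plain_d, \\d with /a: $ascii_d, [0-9]: $class\n";
# prints: \d: 1, \d with /a: 0, [0-9]: 0
```

For input that is supposed to be machine-readable integers, the stricter forms are usually the safer choice.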