I need to find identifiers in a text file.
But I don’t want a match if the identifier is a keyword. For example, if I have “for” as a keyword, in the following:
for (i=0 ; i< max ; i++)
I should get:
Found: i
Found: i
Found: max
Found: i
I looked into look-ahead assertions, but I wasn’t able to make it work:
$IDENTIFIER="(?!(for|while|do))[a-zA-Z_]+[a-zA-Z0-9_]*"
while ($entireFile =~ /($IDENTIFIER)/g)
{
print "Found: $1\n";
}
I get:
Found: or
Found: i
Found: i
Found: max
Found: i
This is not quite what I want! I do understand why I get “or”, but how can I make it smarter and exclude “for” entirely?
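You need anchoring to make sure that you’re matching an entire word (potential identifier). To a first approximation,
/\b(?!(?:for|while|do)\b)[A-Za-z_][A-Za-z0-9_]*\b/
actually does what you want. The leading \b stops a match from starting in the middle of a word, which is how “or” slipped out of “for”, and the \b inside the look-ahead rejects only whole keywords, so an identifier such as “format” still matches.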
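As a rough sketch of how that pattern could be dropped into the loop from the question: the keyword list, the $identifier variable and the find_identifiers helper below are names made up for illustration, not anything from the original code, and the sample string stands in for $entireFile.

```perl
use strict;
use warnings;

# Hypothetical keyword list; extend it for the language being scanned.
my @keywords    = qw(for while do);
my $keyword_alt = join '|', map { quotemeta($_) } @keywords;

# \b on both sides forces whole-word matches, so the look-ahead is only
# ever tested at the start of a word and never at a suffix such as "or".
my $identifier = qr/\b(?!(?:$keyword_alt)\b)[A-Za-z_][A-Za-z0-9_]*\b/;

# Returns every non-keyword identifier in $text, in order of appearance.
sub find_identifiers {
    my ($text) = @_;
    my @found;
    while ($text =~ /($identifier)/g) {
        push @found, $1;
    }
    return @found;
}

# Sample input taken from the question.
print "Found: $_\n" for find_identifiers('for (i=0 ; i< max ; i++)');
```

On the sample line this should print i, i, max and i, with no “or”. Like the pattern it wraps, it is only a first approximation: it knows nothing about string literals or comments.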