Like many other people posting questions here, I recently started programming in Python.
I’m having trouble defining a regular expression to extract a variable name from a string (I have the variable names saved in a list).
I am parsing part of the code which I take line by line from a file.
I make a list of variables:
>>> variable_list = ['var1', 'var2', 'var4_more', 'var3', 'var1_more']
What I want is to give re.compile a pattern that won’t report two matches for var1; I want an exact match. In the example above, var should match nothing, and var1 should match only the first element of the list.
I presume that the answer may be combining regex with negation of other regex, but I am not sure how to solve this problem.
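To make the problem concrete, here is a minimal sketch of the over-matching described above. It assumes the pattern is the bare variable name applied with search; the names unanchored and hits are illustrative and do not come from the original post.

```python
import re

variable_list = ['var1', 'var2', 'var4_more', 'var3', 'var1_more']

# Assumption: an unanchored pattern, i.e. just the variable name itself.
unanchored = re.compile('var1')

# search() finds the name anywhere in the string, so 'var1_more' is
# reported as well as 'var1'.
hits = [v for v in variable_list if unanchored.search(v)]
print(hits)  # ['var1', 'var1_more']
```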
OK, I have noticed that I missed one important thing. The variable list is gathered from a string, so it’s possible to have a space before the variable name, or a symbol after it.
A more accurate variable_list would be something like:
>>> variable_list = [' var1;', 'var1 ;', 'var1)', 'var1_more']
In this case it should recognize the first three as var1, but not the last one.
It sounds like you just need to anchor your regex with ^ and $, unless I’m not understanding you properly.
So ^var1$ will match exactly var1, but not var1_text or var1var1. Is that what you’re after?
I suppose one way to handle your edit would be with ^\W*var1\W*$ (where var1 is the variable name you want). The \W shorthand character class matches anything that is not in the \w class, and \w in Python is basically alphanumeric characters plus the underscore. The * means that this may be matched zero or more times. Against your second list, this matches ' var1;', 'var1 ;' and 'var1)', but not 'var1_more'.
If you want the name of the variable without the extraneous stuff, then you can capture it and extract the first capture group, using something like ^\W*(var1)\W*$ (probably a bit inefficient if you filter and extract in one expression, since the regex then runs twice on matched items).
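A minimal sketch of the patterns discussed above, assuming the variable name is interpolated into the pattern with re.escape. The helper make_pattern and the result names are hypothetical, introduced only for this example. It keeps the match objects so that the regex runs once per item rather than twice.

```python
import re

# Plain anchoring: only the exact name matches.
simple_list = ['var1', 'var2', 'var4_more', 'var3', 'var1_more']
exact = re.compile(r'^var1$')
exact_hits = [v for v in simple_list if exact.match(v)]
print(exact_hits)  # ['var1']

# Hypothetical helper: allow non-word characters around the name and
# capture the name itself in group 1.
def make_pattern(name):
    return re.compile(r'^\W*(' + re.escape(name) + r')\W*$')

variable_list = [' var1;', 'var1 ;', 'var1)', 'var1_more']
pattern = make_pattern('var1')

# Match each item once and keep only the successful match objects.
matches = [m for m in map(pattern.match, variable_list) if m]

matched_items = [m.group(0) for m in matches]  # the original strings
names = [m.group(1) for m in matches]          # the bare variable names

print(matched_items)  # [' var1;', 'var1 ;', 'var1)']
print(names)          # ['var1', 'var1', 'var1']
```

Using re.escape is a precaution in case a name ever contains a character that is special in a regex; for plain identifiers it changes nothing.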