I am using Python 2.6 and am having difficulty understanding why the following code raises an IndexError on the particular line where it does. The error occurs (very rarely) when this version of the Porter Stemmer is incorporated into a web service.
The code involves a series of “if-elif-elif-else” statements which check an index of the input word for a series of scenarios. Note that self.k is an integer value (a placeholder) and self.ends(val) returns either a 0 or 1.
    if self.b[self.k - 1] == 'a':
        if self.ends("al"): pass
        else: return
    elif self.b[self.k - 1] == 'c':
        if self.ends("ance"): pass
        elif self.ends("ence"): pass
        else: return
    # ...additional "elifs" appear here, but none modify self.b or self.k ...
    elif self.b[self.k - 1] == 's':
        if self.ends("ism"): pass
        else: return
But, rarely (the input is highly variable), one of the “elif” statements throws an IndexError. For example:
      line 290, in step4
        elif self.b[self.k - 1] == 's':
    IndexError: string index out of range
What I cannot understand is why the evaluation of an “elif” is throwing an IndexError rather than the initial “if” statement? I do not yet have data on what input is throwing the error (again, the occurrence is very rare). It’s also possible that the stemmer is receiving some type of “bad” input… Is there anything that I am missing/should be aware of with respect to Python if-elifs? (I am aware that an “elif” cannot precede “if”…).
Thanks, and let me know if I can provide any additional information.
Also, if you’re interested in (most of) the full code, I’m using a modified version of this: http://tartarus.org/~martin/PorterStemmer/python.txt, but I don’t think this is relevant to my question.
If nobody else is modifying self.b or self.k, what's the point of evaluating self.b[self.k - 1] over and over again? Store the character in a local variable before the first if, and compare against that variable in every branch.
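A minimal sketch of that idea, assuming a cut-down stand-in class. `StemmerSketch`, `step4_check`, the simplified `ends`, and the boolean return values are hypothetical and exist only to make the snippet runnable; only `self.b`, `self.k`, and the branch structure come from the question, and the elided branches stay elided.

```python
class StemmerSketch(object):
    """Hypothetical stand-in for the stemmer, not the real PorterStemmer."""

    def __init__(self, word):
        self.b = word            # buffer holding the word
        self.k = len(word) - 1   # index of the last character

    def ends(self, suffix):
        # Simplified stand-in: returns 1 or 0, as described in the question.
        return 1 if self.b[:self.k + 1].endswith(suffix) else 0

    def step4_check(self):
        # Index the buffer exactly once; every branch uses the local value.
        ch = self.b[self.k - 1]
        if ch == 'a':
            if self.ends("al"): pass
            else: return False
        elif ch == 'c':
            if self.ends("ance"): pass
            elif self.ends("ence"): pass
            else: return False
        # ...the question's additional branches would go here...
        elif ch == 's':
            if self.ends("ism"): pass
            else: return False
        else:
            return False
        return True   # assumption: True means "a suffix matched"
```

With this shape, an out-of-range index can only be raised on the single `ch = ...` line, which also makes the traceback easier to interpret.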
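Now, if someone else is modifying self.b or self.k in another thread (plausible in a web service that shares one stemmer instance between requests), that would explain the error: each elif re-evaluates self.b[self.k - 1], so the buffer can change between the initial if and a later elif. In that case you should all the more store the value in a local variable before your first if, and use that.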
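The sketch below imitates that situation deterministically, without real threads. `SharedState`, `unsafe_check`, and `safe_check` are hypothetical names, and the property that returns a different buffer on later reads is only a stand-in for another thread reassigning `self.b`.

```python
class SharedState(object):
    """Hypothetical object whose buffer changes between reads."""

    def __init__(self):
        self._reads = 0
        self.k = 6

    @property
    def b(self):
        self._reads += 1
        # First read sees "realism"; later reads see "cat", as if another
        # request had replaced the buffer in the meantime.
        return "realism" if self._reads == 1 else "cat"


def unsafe_check(state):
    # Each condition indexes state.b again.
    if state.b[state.k - 1] == 'a':      # reads "realism"[5], which is 's'
        return 'a'
    elif state.b[state.k - 1] == 's':    # reads "cat"[5]: IndexError here
        return 's'
    return None


def safe_check(state):
    # Snapshot the shared values once, then work only on the locals.
    b, k = state.b, state.k
    ch = b[k - 1]
    if ch == 'a':
        return 'a'
    elif ch == 's':
        return 's'
    return None
```

Note that the snapshot only narrows the window: `b` and `k` are still two separate reads, so if the instance really is shared between threads, giving each request its own stemmer object or guarding the call with a lock would be the more robust option.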