I am trying to write a naive Bayes classifier and I keep getting this error:
Traceback (most recent call last):
  File "<pyshell#30>", line 1, in <module>
    import naive_assignment
  File "C:\Python27\naive_assignment.py", line 655, in <module>
    main()
  File "C:\Python27\naive_assignment.py", line 650, in main
    pans.append(p.classify(row))
  File "C:\Python27\naive_assignment.py", line 597, in classify
    less50Kcp = less50Kcp + self.less_cat_probs.get(query[4])
TypeError: unsupported operand type(s) for +: 'float' and 'NoneType'
I am not sure how to fix it, since most of the fixes out there say to return something, but I already am in the code.
def classify(self, query):
    less50Knp = 0.0
    less50Kcp = 0.0
    great50Knp = 0.0
    great50Kcp = 0.0
    less50Knp = less50Knp + self.less_num_prob_dist(float(query[1])/100)
    less50Knp = less50Knp + self.less_num_prob_dist(float(query[3])/100)
    less50Knp = less50Knp + self.less_num_prob_dist(float(query[5])/100)
    less50Knp = less50Knp + self.less_num_prob_dist(float(query[11])/100)
    less50Knp = less50Knp + self.less_num_prob_dist(float(query[12])/100)
    less50Knp = less50Knp + self.less_num_prob_dist(float(query[13])/100)
    less50Kcp = less50Kcp + self.less_cat_probs.get(query[2])
    less50Kcp = less50Kcp + self.less_cat_probs.get(query[4])
    less50Kcp = less50Kcp + self.less_cat_probs.get(query[6])
    less50Kcp = less50Kcp + self.less_cat_probs.get(query[7])
    less50Kcp = less50Kcp + self.less_cat_probs.get(query[8])
    less50Kcp = less50Kcp + self.less_cat_probs.get(query[9])
    less50Kcp = less50Kcp + self.less_cat_probs.get(query[10])
    less50Kcp = less50Kcp + self.less_cat_probs.get(query[14])
    less50K_prob = less50Kcp * less50Knp
    great50Knp = great50Knp + self.great_num_prob_dist(float(query[1])/100)
    great50Knp = great50Knp + self.great_num_prob_dist(float(query[3])/100)
    great50Knp = great50Knp + self.great_num_prob_dist(float(query[5])/100)
    great50Knp = great50Knp + self.great_num_prob_dist(float(query[11])/100)
    great50Knp = great50Knp + self.great_num_prob_dist(float(query[12])/100)
    great50Knp = great50Knp + self.great_num_prob_dist(float(query[13])/100)
    great50Kcp = great50Kcp + self.great_cat_probs.get(query[2])
    great50Kcp = great50Kcp + self.great_cat_probs.get(query[4])
    great50Kcp = great50Kcp + self.great_cat_probs.get(query[6])
    great50Kcp = great50Kcp + self.great_cat_probs.get(query[7])
    great50Kcp = great50Kcp + self.great_cat_probs.get(query[8])
    great50Kcp = great50Kcp + self.great_cat_probs.get(query[9])
    great50Kcp = great50Kcp + self.great_cat_probs.get(query[10])
    great50Kcp = great50Kcp + self.great_cat_probs.get(query[14])
    great50K_prob = great50Kcp * great50Knp
    if less50K_prob > great50K_prob:
        return ' <=50K'
    elif less50K_prob < great50K_prob:
        return ' >50K'
    else:
        return 'unknown'
I know it isn’t the best way to code it.
The main function that calls this is:
def main():
    data = getInputData('./trainingset.txt')
    test = getInputData('./queries.txt')
    p = nbayes(data)
    p.train()
    pans = []
    for row in test:
        pans.append(p.classify(row))
    print("n-bayes")
    print(pans)

main()
Does anyone have an idea how to fix this?
self.less_cat_probs.get(query[4]) apparently evaluates to None. You need to check for this and avoid it, or fix the code that produces it.
The error message explains this pretty well: it is raising an unsupported operand type error, telling you that you cannot add a float to a NoneType on the given line. Since less50Kcp is a float, the other operand must be None, hence the error, as None is not a number.
A possible fix, presuming self.less_cat_probs is a dict, would be to provide get() with a default value of 0, so that the addition still works when the key is not found. E.g.:
    less50Kcp = less50Kcp + self.less_cat_probs.get(query[4], 0)
There is the question, however, of whether this is the desired functionality. You might instead want to ensure you have entries in the dict where needed, and presumably you would want to repeat this fix across the other get() calls.
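To see why the addition fails, here is a minimal sketch of how dict.get() behaves with and without a default. The dictionary contents and key names are made up for illustration; the real less_cat_probs is built by train(), which is not shown in the question.

```python
# Hypothetical stand-in for self.less_cat_probs; the real values come from train().
less_cat_probs = {' Private': 0.7, ' Bachelors': 0.2}

# A key that was seen during training returns its stored probability.
seen = less_cat_probs.get(' Private')

# A key that was never seen returns None, which cannot be added to a float.
unseen = less_cat_probs.get(' Never-seen-value')

# Supplying a default keeps the result numeric.
unseen_with_default = less_cat_probs.get(' Never-seen-value', 0)

total = 0.0
total = total + seen + unseen_with_default
print(total)
```

Whether 0 is the right default depends on what the score is supposed to mean, so treat it as a way to stop the crash rather than as a modelling decision.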
Please note that the code you have given us is a really bad example of copy/paste coding: this leads to more bugs, harder maintenance, more bugs, and a lot more typing. I’d highly recommend doing this properly by reducing repetitive code using loops and data structures; it will make it easier to spot bugs.
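As a rough sketch of what that refactoring could look like, the column indices from classify can be kept in two lists and summed in a loop. The names sum_scores, NUMERIC_COLUMNS and CATEGORICAL_COLUMNS, the free-function form of classify, and the default of 0.0 for unseen categories are all assumptions for illustration. The arithmetic simply mirrors the question's code (per-column values summed, numeric columns divided by 100), which is not necessarily what a textbook naive Bayes would compute.

```python
# Column indices copied from the question's classify() method.
NUMERIC_COLUMNS = [1, 3, 5, 11, 12, 13]
CATEGORICAL_COLUMNS = [2, 4, 6, 7, 8, 9, 10, 14]


def sum_scores(query, num_prob, cat_probs, default=0.0):
    """Return (numeric_sum, categorical_sum) for one class.

    num_prob is a function like less_num_prob_dist and cat_probs a dict
    like less_cat_probs. Unseen categories fall back to `default`
    (an assumption, not something the question specifies).
    """
    numeric = sum(num_prob(float(query[i]) / 100) for i in NUMERIC_COLUMNS)
    categorical = sum(cat_probs.get(query[i], default)
                      for i in CATEGORICAL_COLUMNS)
    return numeric, categorical


def classify(query, less, great):
    """less and great are (num_prob, cat_probs) pairs, one per class."""
    less_np, less_cp = sum_scores(query, *less)
    great_np, great_cp = sum_scores(query, *great)
    less_prob = less_cp * less_np
    great_prob = great_cp * great_np
    if less_prob > great_prob:
        return ' <=50K'
    elif less_prob < great_prob:
        return ' >50K'
    return 'unknown'


# Made-up row and per-class models, only to show the call shape.
row = ['0', '30', 'a', '10', 'b', '5', 'c', 'd', 'e', 'f', 'g',
       '1', '2', '3', 'h']
less = (lambda x: x, {'a': 0.5})
great = (lambda x: 2 * x, {'a': 0.1})
result = classify(row, less, great)
print(result)
```

With the indices in one place, a wrong column number or a missing default only has to be fixed once instead of in sixteen nearly identical lines.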