Write a program to prompt for a file name, and then read through the file and look for lines of the form:
X-DSPAM-Confidence: 0.8475
When you encounter a line that starts with “X-DSPAM-Confidence:” pull apart the line to extract the floating point number on the line. Count these lines and then compute the total of the spam confidence values from these lines. When you reach the end of the file, print out the average spam confidence.
Enter the file name: mbox.txt
Average spam confidence: 0.894128046745
Enter the file name: mbox-short.txt
Average spam confidence: 0.750718518519
Test your file on the mbox.txt and mbox-short.txt files.
So far I have:
fname = raw_input("Enter file name: ")
fh = open(fname)
for line in fh:
    pos = fh.find(':0.750718518519')
    x = float(fh[pos:])
    print x
What is wrong with this code?
It sounds like they’re asking you to average all the ‘X-DSPAM-Confidence’ numbers, rather than find 0.750718518519.
Personally, I’d find the word you’re looking for, extract the number, then put all these numbers into a list and average them at the end.
Something like this –
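A minimal sketch of that list-and-average approach, written for Python 3 (the question's code is Python 2, where print is a statement and input() is spelled raw_input()). The name average_confidence and the sample lines are made up for illustration and are not taken from mbox.txt.

```python
PREFIX = 'X-DSPAM-Confidence:'

def average_confidence(lines):
    """Average the numbers on lines that contain PREFIX; None if there are none."""
    values = []
    for line in lines:
        pos = line.find(PREFIX)
        if pos == -1:
            continue  # find() returns -1 when the text is not in the line
        # Skip past the prefix itself, then convert what is left to a float.
        values.append(float(line[pos + len(PREFIX):]))
    if not values:
        return None
    return sum(values) / len(values)

# Illustrative input only; these are not lines from the real files.
sample = [
    'From: someone@example.com\n',
    'X-DSPAM-Confidence: 0.8475\n',
    'X-DSPAM-Confidence: 0.6525\n',
]
print('Average spam confidence:', average_confidence(sample))
```

With a real file, the open file handle can be passed in place of the list, since iterating over it yields one line at a time.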
Using find:
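A small illustration on the sample line from the exercise (Python 3 syntax). Note that find() is a string method, so it has to be called on line rather than on the file handle fh.

```python
line = 'X-DSPAM-Confidence: 0.8475'

# find() returns the index where the search text starts, or -1 if it is absent.
print(line.find('X-DSPAM-Confidence:'))  # 0: the text starts at the beginning of the line
print(line.find(':'))                    # 18: the index of the colon, not of the number
print(line.find(':0.750718518519'))      # -1: this text does not appear in the line
```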
We can see that find() just gives us the position of 'X-DSPAM-Confidence:' in each line, not the position of the number after it.
It’s easier to check whether a line starts with 'X-DSPAM-Confidence:', then extract just the number like this:
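One way this could look in Python 3, as a sketch rather than the original answer's code. The helper name extract_confidence and the two sample lines are assumptions made for illustration; the count and total variables follow the wording of the exercise.

```python
def extract_confidence(line):
    """Return the float after 'X-DSPAM-Confidence:', or None for any other line."""
    prefix = 'X-DSPAM-Confidence:'
    if not line.startswith(prefix):
        return None
    # Everything after the prefix is the number plus surrounding whitespace.
    return float(line[len(prefix):].strip())

count = 0
total = 0.0
# Illustrative input only; in the exercise these lines would come from the file.
for line in ['X-DSPAM-Confidence: 0.8475\n', 'Subject: hello\n']:
    value = extract_confidence(line)
    if value is not None:
        count += 1
        total += value

if count > 0:
    print('Average spam confidence:', total / count)
```

Keeping a running count and total avoids storing every value, which is all the average needs.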