I am writing a Python program to read the output of a DOS tree command that has been redirected into a text file. When I reach the 533rd iteration of the loop, Eclipse gives an error:
Traceback (most recent call last):
  File "E:\Peter\Documents\Eclipse Workspace\MusicManagement\InputTest.py", line 24, in <module>
    input = myfile.readline()
  File "C:\Python33\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 3551: character maps to <undefined>
I have read other posts, but setting the encoding to latin-1 does not resolve the issue: it raises a UnicodeDecodeError on another character, and the same happens with utf-8.
The following is the code:
import os
from Album import *
os.system("tree F:\\Music > tree.txt")
myfile = open('tree.txt')
myfile.readline()
myfile.readline()
myfile.readline()
albums = []
x = 0
while x < 533:
    if not input: break
    input = myfile.readline()
    if len(input) < 14:
        artist = input[4:-1]
    elif input[13] != '-':
        artist = input[4:-1]
    else:
        albums.append(Album(artist, input[15:-1], input[8:12]))
    x += 1
for x in albums:
    print(x.artist + ' - ' + x.title + ' (' + str(x.year) + ')')
You need to figure out what encoding tree.com used; according to this post that could be any of the MS-DOS codepages.

You could go through each of the MS-DOS encodings; most of those have a codec in the Python standard library. I'd try cp437 and cp850 first; the latter is the MS-DOS counterpart of cp1252, I think.

Pass the encoding to open(), for example open('tree.txt', encoding='cp437').

You really should look into using os.walk() instead of using tree.com for this task though; it'll save you having to deal with issues like these at least.
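As a rough sketch of both suggestions: the first helper decodes the raw bytes of tree.txt with a few candidate code pages so you can compare the results by eye, and the second collects artist and album folders with os.walk() directly. The function names, the choice of cp437 and cp850 as candidates, and the root/Artist/Album folder layout are assumptions made for illustration, not something confirmed by the question.

```python
import os


def decode_candidates(raw, encodings=("cp437", "cp850")):
    """Decode raw bytes with each candidate codec.

    Returns a dict mapping codec name to the decoded text, or to None
    when that codec cannot decode the bytes. The default candidates are
    an assumption; use the code pages your console is likely to use.
    """
    decoded = {}
    for name in encodings:
        try:
            decoded[name] = raw.decode(name)
        except UnicodeDecodeError:
            decoded[name] = None
    return decoded


def list_album_dirs(root):
    """Return sorted (artist, album) folder-name pairs found under root.

    Assumes a root/Artist/Album layout (hypothetical); anything nested
    deeper than the album folder is ignored.
    """
    pairs = []
    for dirpath, dirnames, _filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        if rel == os.curdir:
            continue  # the root itself is neither an artist nor an album
        parts = rel.split(os.sep)
        if len(parts) == 2:
            pairs.append((parts[0], parts[1]))
            dirnames[:] = []  # prune: do not descend below album level
    return sorted(pairs)
```

To compare decodings, read the file in binary mode (open('tree.txt', 'rb').read()) and pass the bytes to decode_candidates(). Since os.walk() returns str paths, list_album_dirs() leaves no console code page to guess.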