I am running into an issue where I have some data in a .xls file(example below).
A B C D E F
John Smith 8:00AM 9:00AM 10:00AM 5:00PM 8.00
When I write it to a csv using the Python CSV module it comes out as
John,Smith,0.333333333,0.375,0.416666667,0.708333333,0.333333333
Now the interesting part is if I manually save the xls file as a MSDOS csv I get the desired output of
John,Smith,8:00 AM,9:00 AM,10:00 AM,5:00 PM,8:00
Here is the function I am running. Its a bit messy so I apologize in advance.
def csv_gen(filepath, saveto):
for files in glob.glob("*.xls"):
shutil.copy(filepath + "\\" + files, saveto)
with xlrd.open_workbook(files) as wb:
sh = wb.sheet_by_index(0)
newfile = saveto + files[:-4] + '.csv'
now = datetime.datetime.now()
dates = now.strftime("%m-%d-%Y")
filestart = [saveto + files]
time = [dates]
with open(newfile, 'wb') as f:
c = csv.writer(f,delimiter=',')
list = range(sh.nrows)
last = range(sh.nrows)[-1]
list.remove(0)
list.remove(3)
list.remove(2)
list.remove(1)
list.remove(last)
#Iterate through data and show values of the rows
for r in list:
lines = sh.row_values(r)
del lines[:4]
stuff = lines + filestart + time
#Remove blanks so csv doesnt have uneeded data
if lines[0] is '':
del stuff[:]
#Write to csv file with new data
if any(field.strip() for field in stuff):
c.writerow(stuff)
shutil.move(newfile, mergeloc)
I don’t understand why this is coming out this way. I have tried adding the dialect flag to the csv writer to be ‘excel’, but the output is still the same.
Update:
If I save the document as a csv as so workBook.SaveAs(test.csv, 24) The encoding 24 is for MSDOS. I get the desired output of
John,Smith,8:00 AM,9:00 AM,10:00 AM,5:00 PM,8:00
But when the csv module grabs it and removes some blank rows and deletes a few things at the end it writes the rows out and that is when I get the decimals again
John,Smith,0.333333333,0.375,0.416666667,0.708333333,0.333333333
The purpose of the csv module is to modify rows and delete blank rows.
Update
for r in list:
cells = sh.row_values(r)
csv_row = cells[0] for col_value in cells[1:]:
csv_row.append(datetime.time(*xlrd.xldate_as_tuple(col_value, 0)[3:]))
Added the row_values to just return the value of the cell and not xldata:0.33333. Then added a * to make the pass a positional argument.
That doesn’t look like a problem in
csvmodule to me, it looks like something is going wrong in reading the.xlsfile.According to the
xlrddocs dates in Excel worksheets are a pretty awful messI did a quick test with a new
.xlsfile with the contents you provided in there. Python has no problems reading the file, although I don’t have Excel on my machine, I made the file in LibreOffice and saved it as .xls. Even so, the fields come out as unicode strings on the python side.You should be able to use the
xlrd.xldate_as_tuple(xldate, datemode)(link) to convert the float into a python date tuple. Doingprints out
UPDATE
So you probably want something like the following, changing the
forloop that goes over your rows