I am new to Python, and this is my first post here, so I hope you will bear with me. I am having a lot of trouble reading a CSV file into the format I want. My file consists of 132 columns, and the head of the file looks like this:
['10520', ' 386681375.82149398', ' 85.25775430', ' -56.07840500', ' 173', ' 153', ' 151', ' 161', ' 180', ' 167', ' 189', ' 171', ' 173', ' 171', ' 207', ' 169', ' 173', ' 168', ' 184', ' 168', ' 201', ' 197', ' 204', ' 201', ' 210', ' 239', ' 211', ' 227', ' 247', ' 248', ' 266', ' 276', ' 322', ' 336', ' 331', ' 381', ' 358', ' 483', ' 532', ' 709', ' 841', ' 1004', ' 1128', ' 1540', ' 1945', ' 2747', ' 3718', ' 5378', ' 6273', ' 8415', ' 12727', ' 18248', ' 24103', ' 33688', ' 40744', ' 52821', ' 65535', ' 59114', ' 55225', ' 49919', ' 51894', ' 58381', ' 50376', ' 48315', ' 42337', ' 30577', ' 24078', ' 24337', ' 22432', ' 20191', ' 19999', ' 17674', ' 22519', ' 22542', ' 22644', ' 23966', ' 21033', ' 21326', ' 20257', ' 20441', ' 21859', ' 26976', ' 32514', ' 34732', ' 45555', ' 48416', ' 34952', ' 28511', ' 24611', ' 18843', ' 17081', ' 14592', ' 13550', ' 13011', ' 15370', ' 15827', ' 15232', ' 16054', ' 14823', ' 14538', ' 12544', ' 11865', ' 11442', ' 10089', ' 10340', ' 11269', ' 11336', ' 11873', ' 10012', ' 9824', ' 9488', ' 7696', ' 9273', ' 9502', ' 8752', ' 8341', ' 8192', ' 8293', ' 8067', ' 8402', ' 9258', ' 9290', ' 8144', ' 8009', ' 7660', ' 6772', ' 6008', ' 6792', ' 6993', ' 6662', ' 7047', ' 6662 ']
['10520', ' 386681375.86699998', ' 85.25527360', ' -56.09263480', ' 113', ' 102', ' 120', ' 124', ' 117', ' 127', ' 124', ' 118', ' 128', ' 120', ' 125', ' 120', ' 140', ' 135', ' 144', ' 127', ' 143', ' 148', ' 141', ' 153', ' 142', ' 142', ' 149', ' 152', ' 168', ' 180', ' 196', ' 188', ' 196', ' 246', ' 259', ' 270', ' 337', ' 360', ' 506', ' 540', ' 625', ' 887', ' 1122', ' 1251', ' 2007', ' 2883', ' 3238', ' 4370', ' 6240', ' 9164', ' 10751', ' 16656', ' 20996', ' 27753', ' 37774', ' 35377', ' 38637', ' 39265', ' 35183', ' 38830', ' 32149', ' 25455', ' 27272', ' 24488', ' 21036', ' 20931', ' 17166', ' 17019', ' 18196', ' 15450', ' 15120', ' 15934', ' 15021', ' 14936', ' 16253', ' 16457', ' 15873', ' 19667', ' 23150', ' 26140', ' 35761', ' 42594', ' 61758', ' 65535', ' 42354', ' 28672', ' 25173', ' 20344', ' 15883', ' 14432', ' 10575', ' 11342', ' 12348', ' 13229', ' 19632', ' 23456', ' 18102', ' 15600', ' 13425', ' 9962', ' 8281', ' 7609', ' 6948', ' 7391', ' 8878', ' 10006', ' 11295', ' 10073', ' 9410', ' 10354', ' 10667', ' 10054', ' 9011', ' 8793', ' 9055', ' 7463', ' 6692', ' 8051', ' 8330', ' 7369', ' 6612', ' 6328', ' 6545', ' 6235', ' 5895', ' 5085', ' 4876', ' 5154', ' 4649', ' 5226', ' 6137', ' 5354 ']
and I am interested in getting:
- Four lists/vectors/1D arrays (or whatever) holding the first four columns.
- The next 128 columns I would like to get into an array.
- I would like to get the output without ([] , ‘ “) and other non-numeric characters.
So far the code looks like this:
import sys, math, numpy
from numpy import *
from scipy import *
import csv
try:
    ifile = sys.argv[1]
    #; ofile = sys.argv[2]
except:
    print "Usage:", sys.argv[0], "ifile"; sys.exit(1)
# Open and read file from std, and assign first four (orbit, time, lat, lon) columns to four lists, and last 128 columns (waveforms) to an array.
ifile = open(ifile)
orbit = []
time = []
lat = []
lon = []
#wvf= [[],[]]
try:
    reader = csv.reader(ifile, delimiter=',')
    for row in reader:
        orbit.append(row[0])
        time.append(row[1])
        lat.append(row[2])
        lon.append(row[3])
        # wvf = [row[4:132] for row in reader] row[0:128] for col in len(reader)]
        wvf = [row[4:132]],[row[1:128]]
finally:
    ifile.close()
...and now do something with data...
I have thought about first splitting all lines, and thereafter gathering the last 128 columns into the array, but I haven’t managed to do it.
I hope you have an idea of what I am trying to accomplish and are able to help me out.
Thanks
You can load the file into a numpy array using np.genfromtxt. An advantage of doing it this way is that the data goes directly from the file to a space-efficient numpy array. If you use the csv module and store the data in Python lists, then your data will consume a lot more memory.
Note that the variables orbit, time, etc. are “views” of data; they are not copies of data, and so do not require (much) additional memory. This also means that modifying orbit will also affect data, and vice versa.
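The loading and slicing described above might look roughly like the sketch below. It is only an illustration under some assumptions: the inline sample stands in for the real file and has just 3 waveform values per row instead of 128, the values are read as floats, and autostrip=True is used to drop the spaces around each field. With the real file you would pass its filename to np.genfromtxt instead of the StringIO object.

```python
import io
import numpy as np

# Hypothetical stand-in for the real file: two shortened rows with
# 4 leading columns and only 3 waveform samples (the real file has 128).
sample = io.StringIO(
    "10520, 386681375.82149398, 85.25775430, -56.07840500, 173, 153, 151 \n"
    "10520, 386681375.86699998, 85.25527360, -56.09263480, 113, 102, 120 \n"
)

# Read everything into one 2D float array; autostrip removes the
# whitespace surrounding each comma-separated field.
data = np.genfromtxt(sample, delimiter=",", autostrip=True)

# Basic slicing returns views into data, not copies.
orbit = data[:, 0]
time = data[:, 1]
lat = data[:, 2]
lon = data[:, 3]
wvf = data[:, 4:]  # all remaining columns (128 of them in the real file)

# Reports whether the slice and the full array use the same memory.
print(np.shares_memory(orbit, data))
```

Because the slices share memory with data, an assignment such as orbit[0] = 0 would also change data[0, 0]. If you need independent arrays, calling .copy() on a slice gives you one.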