I know sed or awk could probably handle this kind of problem more elegantly, but I went the Python way. I would like to renumber the first column of my data file from 1 to the number of lines in the file. Is it a good idea to read the file with readlines()? For small files perhaps, but I suppose not for large ones. Here is what I came up with as a first attempt; any comments are appreciated.
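#!/usr/bin/env python
import sys

try:
    infilename = sys.argv[1]
    outfilename = sys.argv[2]
except IndexError:
    # Without both arguments there is nothing to do, so stop here.
    print("Usage is <script> inFile outFile")
    sys.exit(1)

ifile = open(infilename, 'r')
ofile = open(outfilename, 'w')
lines = ifile.readlines()  # reads the whole file into memory at once
i = 1
for line in lines:
    fields = line.split()  # renamed from "list" to avoid shadowing the built-in
    fields[0] = i          # overwrite the first column with the running number
    i += 1
    for val in fields:
        ofile.write("%d " % int(val))
    ofile.write('\n')
ifile.close()
ofile.close()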
You can iterate over the file to keep only the current line in memory:
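A minimal sketch of that approach, assuming whitespace-separated integer columns and at least one column on every line. The function name `renumber` is hypothetical, chosen only for illustration:

```python
def renumber(infilename, outfilename):
    # Hypothetical helper: rewrite the first column as 1, 2, 3, ...
    with open(infilename) as ifile, open(outfilename, "w") as ofile:
        # Iterating over the file object yields one line at a time,
        # so the whole file is never held in memory.
        for i, line in enumerate(ifile, 1):
            parts = line.split()
            parts[0] = i
            # int(val) raises ValueError if any column is not an integer.
            ofile.write(" ".join("%d" % int(val) for val in parts) + "\n")
```

The with statement closes both files even if a line fails to parse, and enumerate(ifile, 1) replaces the manual counter.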
@Umut Tabak:
("%d" % int(val) for val in parts) is a generator expression; they are kind of like lazy lists. It gives the same items as the list comprehension ["%d" % int(val) for val in parts] but without actually creating the list.
Btw, the for block can be written even shorter, but it's slightly different because it no longer enforces that all values are ints:
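One way the shorter loop could look, as a sketch under the same assumptions (whitespace-separated columns, at least one per line). The name `renumber_short` is hypothetical:

```python
def renumber_short(infilename, outfilename):
    # Hypothetical helper: renumber column 1, copy the other columns as text.
    with open(infilename) as ifile, open(outfilename, "w") as ofile:
        for i, line in enumerate(ifile, 1):
            parts = line.split()
            parts[0] = str(i)
            # The remaining columns are written back unchanged, so
            # non-integer values pass through instead of raising an error.
            ofile.write(" ".join(parts) + "\n")
```

Since nothing is converted with int() here, a value such as 2.5 or abc in a later column is copied as is rather than rejected.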