I am trying to read lines from n files and then write all the data to ONE file. The tricky thing is that I don’t know how many files the directory contains, and I want the output laid out nicely so that every file gets its own column.
Example (text is some data I don’t care about; I can use split to grab the [1]):
File 1 contains:
text Line1
text Line2
text Line3
File 2 contains:
text Line01
text Line02
text Line03
I want to combine to one file like this:
File 1 File 2
Line1 Line01
Line2 Line02
Line3 Line03
One problem I have is that I read one file at a time and append each line to a list, but then how do I print it out the way I want? The list ends up looking like one of these:
fromfiles = ['Line1','Line2','Line3','Line01','Line02','Line03']
or
fromfiles2 = [['Line1','Line2','Line3'],['Line01','Line02','Line03']]
In case fromfiles: How do I print out Line1 and Line01 at the same time, and then continue?
In case fromfiles2: same problem as above really. I need to access multiple elements at the same time without knowing how many items are in the list, and then print out everything.
I would be grateful if someone could help me with this problem.
zip() is made for this! Let’s start from your fromfiles2. Basically, zip takes several lists and yields tuples that pair up their first elements, then their second elements, and so on. Of course this also works with more than two arguments :-).
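The pairing described above can be sketched as follows. This is a minimal example assuming Python 3 and the fromfiles2 list from the question; the "File N" headers and the tab separator are illustrative choices, not something the question prescribes.

```python
fromfiles2 = [['Line1', 'Line2', 'Line3'], ['Line01', 'Line02', 'Line03']]

# Hypothetical column headers, one per input file.
headers = ['File {}'.format(i) for i in range(1, len(fromfiles2) + 1)]

# zip(*fromfiles2) unpacks the outer list into separate arguments,
# so this works no matter how many inner lists (files) there are.
rows = [headers] + [list(row) for row in zip(*fromfiles2)]

for row in rows:
    print('\t'.join(row))
```

Because the outer list is unpacked with `*`, the number of files never has to be known in advance.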
This is a good approach when your files are small. Then, you can safely read them into memory first, merge the data in memory, and then write the merged data to the output file. However, if your input is really large (GB of data), then you should read the input files line by line simultaneously, build an output line, write it to file and only then proceed with the next line in the input files.
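The line-by-line variant might look like the sketch below. It assumes Python 3 (where zip is lazy), that every line has at least two whitespace-separated fields as in the question's example, and that the number of files stays below the operating system's open-file limit. The function name merge_columns and its parameters are made up for illustration.

```python
from contextlib import ExitStack


def merge_columns(input_paths, output_path, sep='\t'):
    """Write field [1] of each input line as one column per input file."""
    with ExitStack() as stack:
        # Open all inputs at once; ExitStack closes every file on exit.
        files = [stack.enter_context(open(path)) for path in input_paths]
        out = stack.enter_context(open(output_path, 'w'))
        # Zipping the file objects pulls one line from each file per step,
        # so only a single output row is held in memory at a time.
        # Note: zip stops as soon as the shortest file is exhausted.
        for lines in zip(*files):
            out.write(sep.join(line.split()[1] for line in lines) + '\n')
```

A call such as merge_columns(['a.txt', 'b.txt'], 'merged.txt') would then produce one tab-separated row per input line; the file names here are placeholders.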
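If you got the concept of zip, then you can look into itertools.izip for making things more memory-efficient (in Python 3, the built-in zip is already lazy and itertools.izip no longer exists). Also, if your files do not have the same number of lines, you might want to have a look at itertools.izip_longest (renamed itertools.zip_longest in Python 3).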
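For inputs of unequal length, here is a small sketch using the Python 3 name itertools.zip_longest. The sample data and the empty-string fill value are assumptions chosen for illustration.

```python
from itertools import zip_longest

# The second "file" is one line shorter than the first.
columns = [['Line1', 'Line2', 'Line3'], ['Line01', 'Line02']]

# Plain zip would stop after two rows; zip_longest keeps going and
# pads the exhausted column with the given fill value.
rows = [list(row) for row in zip_longest(*columns, fillvalue='')]

for row in rows:
    print('\t'.join(row))
```

With an empty fill value, the shorter file simply leaves blank cells at the bottom of its column.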