I am using the following code to combine two text files:
def combine_acpd_ccs(self, ccs_file, acps_file, out_file):
    with open(ccs_file, 'r') as in_file1:
        with open(acps_file, 'r') as in_file2:
            with open(out_file, 'w') as out1:
                out1.write('PDB\tPA\tEHSS\tACPS\n')
                for line in in_file1:
                    segs = line.split()
                    for i in in_file2:
                        sse_score = i.split()
                        #print line
                        #print segs
                        if segs[0][:-4] == sse_score[0]:
                            out1.write(segs[0][:-4]+'\t'+segs[1]+'\t'+segs[2]+'\t'+sse_score[1]+'\n')
Example data looks like:
ccs_file:
1b0o.pdb 1399.0 1772.0
1b8e.pdb 1397.0 1764.0
acps_file:
1b0o 0.000756946316066
1b8e 8.40662008775
1b0o 6.25931529116
I expected my output to be like:
PDB PA EHSS ACPS
1b0o 1399.0 1772.0 0.000756946316066
1b0o 1399.0 1772.0 6.25931529116
1b8e 1397.0 1764.0 8.40662008775
But my code generates only the top two lines of my expected output. If I print segs in the second for loop, only the first line in ccs_file is passed to the loop. Any ideas where I have gone wrong?
The problem is that you don’t reopen or rewind in_file2 after each iteration of the outer loop. Once the inner loop has read in_file2 to the end, all subsequent attempts to iterate over in_file2 will do nothing, since the file pointer is already positioned at the end of the file.

If the files are relatively small, you might want to load ccs_file into memory and just do dictionary lookups.
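
A minimal sketch of the dictionary approach, assuming both files are whitespace-separated as in the example data and that every name in ccs_file ends in a four-character extension such as .pdb. The function name combine_ccs_acps is made up for illustration, and it is written as a plain function rather than a method:

```python
def combine_ccs_acps(ccs_file, acps_file, out_file):
    # Hypothetical helper: load ccs_file into a dict keyed by the PDB id
    # (the file name with its last four characters, e.g. '.pdb', removed).
    ccs = {}
    with open(ccs_file) as in_file1:
        for line in in_file1:
            segs = line.split()
            if len(segs) < 3:
                continue  # skip blank or malformed lines
            ccs[segs[0][:-4]] = (segs[1], segs[2])

    # Read acps_file once and look each id up in the dict,
    # so no file ever has to be rewound.
    with open(acps_file) as in_file2, open(out_file, 'w') as out1:
        out1.write('PDB\tPA\tEHSS\tACPS\n')
        for line in in_file2:
            fields = line.split()
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            pdb_id, acps = fields[0], fields[1]
            if pdb_id in ccs:
                pa, ehss = ccs[pdb_id]
                out1.write('\t'.join((pdb_id, pa, ehss, acps)) + '\n')
```

Note that this writes rows in the order they appear in acps_file, not grouped in ccs_file order as in the expected output, so sort the rows before writing if the grouping matters. If you would rather keep the nested loops, calling in_file2.seek(0) just before the inner loop rewinds the file on every pass.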