I have to process .txt files present in subfolders inside a folder, like this:
New Folder > Folder 1 to 6 > xx.txt & yy.txt (files present in each folder)
Each file contains two columns, such as:
arg his
asp gln
glu his
and
arg his
glu arg
arg his
glu asp
Now what I have to do is:
1) Count the number of occurrences of each word in each file, then average that count by dividing it by the total number of lines in that file.
2) Then take the values obtained in step 1 and divide them by the total number of files present in the folder to average them (i.e. 2 in this case).
I have tried with the code below. I have succeeded with the 1st step, but I’m not getting the 2nd step.
import glob
import os

for root, dirs, files in os.walk(path):
    asp_count = 0
    glu_count = 0
    lys_count = 0
    arg_count = 0
    his_count = 0
    acid_count = 0
    base_count = 0
    count = 0  # number of .txt files seen in this subfolder
    listOfFile = glob.iglob(os.path.join(root, '*.txt'))
    for filename in listOfFile:
        lineCount = 0
        asp_count_col1 = 0
        asp_count_col2 = 0
        glu_count_col1 = 0
        glu_count_col2 = 0
        lys_count_col1 = 0
        lys_count_col2 = 0
        arg_count_col1 = 0
        arg_count_col2 = 0
        his_count_col1 = 0
        his_count_col2 = 0
        count += 1
        with open(filename) as inp:
            for line in map(str.split, inp):
                lineCount += 1
                k = line[4]  # first residue name
                m = line[6]  # second residue name
                if k == 'ASP':
                    asp_count_col1 += 1
                elif m == 'ASP':
                    asp_count_col2 += 1
                if k == 'GLU':
                    glu_count_col1 += 1
                elif m == 'GLU':
                    glu_count_col2 += 1
                if k == 'LYS':
                    lys_count_col1 += 1
                elif m == 'LYS':
                    lys_count_col2 += 1
                if k == 'ARG':
                    arg_count_col1 += 1
                elif m == 'ARG':
                    arg_count_col2 += 1
                if k == 'HIS':
                    his_count_col1 += 1
                elif m == 'HIS':
                    his_count_col2 += 1
        # per-file average: occurrences divided by the number of lines in the file
        asp_count = (float(asp_count_col1 + asp_count_col2))/lineCount
        glu_count = (float(glu_count_col1 + glu_count_col2))/lineCount
        lys_count = (float(lys_count_col1 + lys_count_col2))/lineCount
        arg_count = (float(arg_count_col1 + arg_count_col2))/lineCount
        his_count = (float(his_count_col1 + his_count_col2))/lineCount
Up to this point I am able to get the average value per file. But how can I get the average per subfolder (i.e. by dividing by count, the total number of files)?
The problem is the 2nd part; the 1st part is done. The code above averages the values for each file, but I want to add these per-file averages together and make a new average by dividing by the total number of files present in the subfolder.
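To make the target numbers concrete, here is a minimal sketch of the two steps applied to the two sample files shown earlier, held in memory as strings. It assumes Python 3 and, unlike the code above, counts every word in both columns. The names `per_line_frequency` and `folder_average` are made up for this illustration.

```python
def per_line_frequency(text):
    """Step 1: occurrences of each word divided by the number of non-empty lines."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    counts = {}
    for fields in lines:
        for word in fields:
            counts[word] = counts.get(word, 0) + 1
    return {word: n / len(lines) for word, n in counts.items()}


def folder_average(file_texts):
    """Step 2: sum the per-file values and divide by the number of files."""
    totals = {}
    for text in file_texts:
        for word, value in per_line_frequency(text).items():
            totals[word] = totals.get(word, 0.0) + value
    return {word: total / len(file_texts) for word, total in totals.items()}


# The two sample files from the question (3 lines and 4 lines).
xx = "arg his\nasp gln\nglu his\n"
yy = "arg his\nglu arg\narg his\nglu asp\n"

averages = folder_average([xx, yy])
print(averages["arg"])  # (1/3 + 3/4) / 2
print(averages["gln"])  # (1/3 + 0) / 2, since gln is absent from the second file
```

A word that is missing from one file still gets divided by the full number of files, which is what step 2 asks for.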
Your use of os.walk together with glob.iglob is bogus. Either use one or the other, not both together. Here’s how I would do it:
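A minimal sketch of one way to do it with os.walk alone, assuming Python 3. The function names and the `columns` parameter are hypothetical, chosen for this example. The default `columns=(0, 1)` matches the two-column sample; the question's code reads fields 4 and 6 instead, so pass `columns=(4, 6)` for files laid out that way. Lines with too few fields are skipped, which is also an assumption.

```python
import os
from collections import defaultdict


def file_frequencies(filename, columns=(0, 1)):
    """Per-file step: count of each word in the chosen columns / number of lines."""
    counts = defaultdict(int)
    line_count = 0
    with open(filename) as inp:
        for fields in map(str.split, inp):
            if len(fields) <= max(columns):  # blank or short line: skip it
                continue
            line_count += 1
            for idx in columns:
                counts[fields[idx]] += 1
    if line_count == 0:
        return {}
    return {word: n / line_count for word, n in counts.items()}


def folder_averages(path, columns=(0, 1)):
    """Per-folder step: mean of the per-file values over the .txt files in each folder."""
    results = {}
    for root, dirs, files in os.walk(path):
        # os.walk already lists the files in each folder, so no glob is needed
        txt_files = [name for name in files if name.endswith('.txt')]
        if not txt_files:
            continue
        totals = defaultdict(float)
        for name in txt_files:
            per_file = file_frequencies(os.path.join(root, name), columns)
            for word, value in per_file.items():
                totals[word] += value
        results[root] = {word: total / len(txt_files)
                         for word, total in totals.items()}
    return results
```

Calling `folder_averages('New Folder')` would then return one dictionary of averaged values per subfolder, keyed by the subfolder's path.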