I wanna count words from text files which contain data as follows: ROK :

Asked: May 25, 2026

I wanna count words from text files which contain data as follows:

ROK :
    ROK/(NN)
New :
    New/(SV)
releases, :
    releases/(NN) + ,/(SY)
week :
    week/(EP)
last :
    last/(JO)
compared :
    compare/(VV) + -ed/(EM)
year :
    year/(DT)
releases :
    releases/(NN)

Expressions like /(NN), /(SV), and /(EP) are category tags.
I wanna extract the word just before each category tag and count how many times each word occurs in the whole text, grouped by category.

I wanna write a result in a new text file like this:

(NN)
releases 2
ROK 1

(SV)
New 1

(SY)
, 1

(EP)
week 1

(JO)
last 1

......

Please help me out!

Here is my garbage code ;_; it doesn’t work.

import os, sys
import re

wordset = {}
for line in open('E:\\mach.txt', 'r'):
    if '/(' in line:
        word = re.findall(r'(\w)/\(', line)
        print word
        if word not in wordset: wordset[word]=1
        else: wordset[word]+=1

f = open('result.txt', 'w')
for word in wordset:
    print>> f, word, wordset[word]
f.close()

1 Answer

Editorial Team · Answer 1 · 2026-05-25T18:55:31+00:00

from __future__ import print_function
import re

# Capture the word before "/" and the parenthesised category after it.
REGEXP = re.compile(r'(\w+)/(\(.*?\))')


def main():
    # Maps category -> {word: count}.
    counts = {}

    with open('E:\\mach.txt', 'r') as fp:
        for line in fp:
            for item, category in REGEXP.findall(line):
                counts.setdefault(category, {}).setdefault(item, 0)
                counts[category][item] += 1

    with open('result.txt', 'w') as fp:
        for category, words in sorted(counts.items()):
            print(category, file=fp)
            for word, count in words.items():
                print(word, count, sep=' ', file=fp)
            # Blank line between categories.
            print(file=fp)
    return 0


if __name__ == '__main__':
    raise SystemExit(main())
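
For reference, the attempt in the question most likely fails for two reasons: (\w) captures only a single character, and re.findall returns a list, which cannot be used as a dictionary key. A minimal sketch of both problems, written for Python 3 and using one sample line from the question (the variable names are only illustrative):

```python
import re

line = "    releases/(NN) + ,/(SY)"

# The pattern from the question captures only ONE word character before "/(".
single = re.findall(r'(\w)/\(', line)
print(single)

# findall returns a list, and a list is unhashable, so it cannot be a dict key.
wordset = {}
try:
    wordset[single] = 1
except TypeError as exc:
    print("TypeError:", exc)

# With \w+ plus a second group for the category, findall yields
# (word, category) tuples that can be unpacked in a loop.
pairs = re.findall(r'(\w+)/(\(.*?\))', line)
print(pairs)
```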

You’re welcome (=
Note that \w+ skips “,” entirely and records “-ed” as “ed”. If you also want to count tokens like “-ed” or “,” as written, change the regexp to match any run of non-whitespace characters:

REGEXP = re.compile(r'([^\s]+)/(\(.*?\))')
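
Putting it together, here is a rough Python 3 sketch of the same idea using collections.Counter. The function names count_by_category and format_report are made up for this example, the sample text is copied from the question, and listing the most frequent words first is an assumption based on the desired output shown there (releases 2 before ROK 1):

```python
import re
from collections import Counter, defaultdict

# Assumption: a token is any run of non-whitespace characters before "/(TAG)".
PATTERN = re.compile(r'(\S+)/(\(.*?\))')


def count_by_category(lines):
    """Return {category: Counter({word: occurrences})} for an iterable of lines."""
    counts = defaultdict(Counter)
    for line in lines:
        for word, category in PATTERN.findall(line):
            counts[category][word] += 1
    return counts


def format_report(counts):
    """Render one block per category, most frequent words first."""
    blocks = []
    for category in sorted(counts):
        rows = ["{} {}".format(word, n) for word, n in counts[category].most_common()]
        blocks.append("\n".join([category] + rows))
    return "\n\n".join(blocks) + "\n"


# Sample data copied from the question.
SAMPLE = """\
ROK :
    ROK/(NN)
New :
    New/(SV)
releases, :
    releases/(NN) + ,/(SY)
week :
    week/(EP)
last :
    last/(JO)
compared :
    compare/(VV) + -ed/(EM)
year :
    year/(DT)
releases :
    releases/(NN)
"""

if __name__ == "__main__":
    print(format_report(count_by_category(SAMPLE.splitlines())), end="")
```

To work on the real file, pass the open file object instead of SAMPLE.splitlines() and write the string returned by format_report to result.txt.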
