I’m trying to write a script to pull the word count of many files

Question

Asked: June 1, 2026 (2026-06-01T04:48:27+00:00)

I’m trying to write a script to pull the word count of many files within a directory. It works fairly close to what I want, but one part is throwing me off. The code so far is:

import glob

directory = "/Users/.../.../files/*"
output = "/Users/.../.../output.txt"

filepath = glob.glob(directory)

def wordCount(filepath):
    for file in filepath:
        name = file
        fileO = open(file, 'r')
        for line in fileO:
            sentences = 0
            sentences += line.count('.') + line.count('!') + line.count('?')

            tempwords = line.split()
            words = 0
            words += len(tempwords)

            outputO = open(output, "a")
            outputO.write("Name: " + name + "\n" + "Words: " + str(words) + "\n")

wordCount(filepath)

This writes the word counts to a file named “output.txt” and gives me output that looks like this:

Name: /Users/..../..../files/Bush1989.02.9.txt
Words: 10
Name: /Users/..../..../files/Bush1989.02.9.txt
Words: 0
Name: /Users/..../..../files/Bush1989.02.9.txt
Words: 3
Name: /Users/..../..../files/Bush1989.02.9.txt
Words: 0
Name: /Users/..../..../files/Bush1989.02.9.txt
Words: 4821

This repeats for each file in the directory. As you can see, it gives me multiple counts for each file. The files are formatted like this:

Address on Administration Goals Before a Joint Session of Congress

February 9, 1989

Mr. Speaker, Mr. President, and distinguished Members of the House and
Senate…

So, it seems that the script is giving me a count of each “part” of the file, such as the 10 words on the first line, 0 on the line break, 3 on the next, 0 on the next, and then the count for the body of the text.

What I’m looking for is a single count for each file. Any help/direction is appreciated.

1 Answer

Editorial Team · Answer 1 · 2026-06-01T04:48:29+00:00

The last two lines of your inner loop, which write the filename and word count to the output file, belong in the outer loop, not the inner one. As written, they run once per line, so you get one entry per line instead of one per file.

You’re also resetting the sentence and word counts on every line. Initialize them in the outer loop, before the inner loop starts, so they accumulate across the whole file.
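
To see the effect of the counter placement in isolation, here is a minimal sketch that uses a list of strings in place of a file. The sample lines are taken from the header shown in the question; the function names `per_line_counts` and `per_file_count` are hypothetical and exist only for this illustration.

```python
# In-memory stand-in for the first four lines of one file (from the question).
lines = [
    "Address on Administration Goals Before a Joint Session of Congress\n",
    "\n",
    "February 9, 1989\n",
    "\n",
]


def per_line_counts(lines):
    """Mimic the original loop: the counter is reset on every line."""
    counts = []
    for line in lines:
        words = 0  # reset inside the loop, so nothing accumulates
        words += len(line.split())
        counts.append(words)  # one result per line
    return counts


def per_file_count(lines):
    """Mimic the fixed loop: the counter is set once, before the loop."""
    words = 0
    for line in lines:
        words += len(line.split())
    return words  # one result for the whole input


print(per_line_counts(lines))  # [10, 0, 3, 0]
print(per_file_count(lines))   # 13
```

The first result matches the 10, 0, 3, 0 pattern in your output; the second is the single total you are after.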

Here’s what your code should look like after the changes:

import glob

directory = "/Users/.../.../files/*"
output = "/Users/.../.../output.txt"

filepath = glob.glob(directory)

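def wordCount(filepath):
    for file in filepath:
        name = file
        # Reset the counters once per file, before reading any lines.
        sentences = 0
        words = 0
        # "with" closes the file when the block ends.
        with open(file, 'r') as fileO:
            for line in fileO:
                sentences += line.count('.') + line.count('!') + line.count('?')

                tempwords = line.split()
                words += len(tempwords)

        # Write one entry per file, after all of its lines have been counted.
        with open(output, "a") as outputO:
            outputO.write("Name: " + name + "\n" + "Words: " + str(words) + "\n")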

wordCount(filepath)
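
If you want to take this a step further, one possible restructuring is to split the counting from the writing so each piece can be checked on its own. This is only a sketch under a few assumptions: the helper names `count_words` and `write_counts` are made up for this example, the demo runs against a temporary directory rather than your real paths, and the report is opened with "w" (overwrite) instead of "a" (append) so that re-running the script does not pile up duplicate entries. Switch back to "a" if you rely on appending.

```python
import glob
import os
import tempfile


def count_words(path):
    """Return the total number of whitespace-separated words in one file."""
    total = 0
    with open(path, "r") as handle:
        for line in handle:
            total += len(line.split())
    return total


def write_counts(paths, output_path):
    """Write one Name/Words entry per file to output_path."""
    # "w" overwrites any earlier report; this is an assumption, not the
    # behavior of the original script, which appends.
    with open(output_path, "w") as out:
        for path in sorted(paths):
            out.write("Name: " + path + "\n" + "Words: " + str(count_words(path)) + "\n")


# Demo in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "w") as f:
        f.write("Address on Administration Goals\n\nFebruary 9, 1989\n")

    # The report uses a different extension so the glob does not pick it up.
    report = os.path.join(tmp, "report.out")
    write_counts(glob.glob(os.path.join(tmp, "*.txt")), report)

    with open(report) as f:
        report_text = f.read()

print(report_text)
```

Keeping the output file outside the globbed pattern matters in your real setup too: if output.txt lived inside the files directory, it would be counted as one of the inputs.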
