I am trying to write a program that reads all of the words in a text document named “GlassDog.txt”. Once the program has read the words, it needs to remove all punctuation and make all of the letters lowercase. When that is done, I would like it to print each word it found and how many times that word was used in the document.
Here is my code so far:
def run():
    count = {}
    for w in open('GlassDog.txt').read().split():
        if w in count:
            count[w] += 1
        else:
            count[w] = 1
    for word, times in count.items():
        print("%s was found %d times" % (word, times))

run()
This code reads the words and displays each word with its frequency. However, I could not find a way to remove the punctuation and replace the uppercase letters with lowercase letters. This question has probably been asked a few times; I just couldn’t find anything that does specifically what I am looking for. I apologize if this is a repeat question.
You could call .lower() on the string to convert it to lowercase before the if block. To match only alphanumeric characters, try a regular expression; take a look specifically at \w.
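To make that suggestion concrete, here is a minimal sketch of one way it could look. The helper name count_words and the use of collections.Counter are my own choices, not something from the question; the original dictionary loop would work just as well. Note that \w also matches digits and underscores, and that a pattern like \w+ splits contractions such as "don't" into "don" and "t", so adjust the pattern if that matters for your text.

```python
import re
from collections import Counter


def count_words(text):
    """Return a Counter mapping each lowercased word to its frequency."""
    # Lowercase first, then pull out runs of word characters.
    # Anything that is not a letter, digit, or underscore (i.e. punctuation
    # and whitespace) acts as a separator and is dropped.
    words = re.findall(r"\w+", text.lower())
    return Counter(words)


def run(path="GlassDog.txt"):
    # The default path is the file name from the question.
    with open(path) as f:
        counts = count_words(f.read())
    for word, times in counts.items():
        print("%s was found %d times" % (word, times))
```

Calling run() prints one line per distinct word, as in the original code. If you would like the most frequent words first, looping over counts.most_common() instead of counts.items() should do that.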