I have a code which takes as input two files: (1) a dictionary/lexicon (2)

Editorial Team

Asked: June 12, 2026

I have a code which takes as input two files:
(1) a dictionary/lexicon
(2) a text file (one sentence per line)

The first part of my code reads the dictionary into tuples, so it produces something like:

('mthy3lkw', 'weakBelief', 'U')

('mthy3lkm', 'firmBelief', 'B')

('mthy3lh', 'notBelief', 'A')

The second part of the code searches each sentence in the text file for the words at position 0 of those tuples, then prints the sentence, the matched word, and its type.

So given the sentence "mthy3lkw ana mesh 3arif", the desired output is:

['mthy3lkw ana mesh 3arif', 'mthy3lkw', 'weakBelief', 'U'], given that the word mthy3lkw is found in the dictionary.

The second part of my code – the matching part – is TOO slow. How do I make it faster?

Here is my code:

findings = [] 
for sentence in data:  # I open the sentences file with .readlines()
    for word in tuples:  # similar to the ones mentioned above
        p1 = re.compile('\\b%s\\b'%word[0])  # get the first word in every tuple
        if p1.findall(sentence) and word[1] == "firmBelief":
            findings.append([sentence, word[0], "firmBelief"])

print findings

1 Answer

Editorial Team · Answer 1 · 2026-06-12T11:40:04+00:00

Build a dict keyed on the first element of each tuple, so that finding the entry for a given word is a single fast lookup. Then restructure your loops: instead of going through the whole dictionary for every sentence and compiling a regex for every entry, go over each word in the sentence and look it up in the dict:

# Create a lookup structure for words
word_dictionary = dict((entry[0], entry) for entry in tuples)

findings = []
word_re = re.compile(r'\b\S+\b') # only need to create the regexp once
for sentence in data:
    for word in word_re.findall(sentence): # Check every word in the sentence
        if word in word_dictionary: # A match was found
            entry = word_dictionary[word]
            findings.append([sentence, word, entry[1], entry[2]])
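For reference, here is a self-contained sketch of the same idea that can be run as-is under Python 3. The sample `tuples` and the first sentence in `data` come from the question; the second sentence, the `find_matches` helper name, the stripping of the trailing newline, and the use of `\w+` for tokenizing are my own assumptions for illustration, not part of the original code. Note that a dict keeps only one entry per key, so if the lexicon lists the same word twice, only the last entry survives.

```python
import re

# Sample lexicon entries, taken from the question.
tuples = [
    ("mthy3lkw", "weakBelief", "U"),
    ("mthy3lkm", "firmBelief", "B"),
    ("mthy3lh", "notBelief", "A"),
]

# Lines as .readlines() would return them; the second line is a made-up
# sentence that contains no lexicon word.
data = ["mthy3lkw ana mesh 3arif\n", "ana mesh 3arif\n"]


def find_matches(sentences, entries):
    """Return [sentence, word, type, tag] for every lexicon word found.

    Hypothetical helper name; it wraps the dict-lookup approach above.
    """
    # Build the lookup once: word -> full tuple.
    lookup = {entry[0]: entry for entry in entries}
    # Assumption: lexicon words consist only of letters, digits and "_",
    # so splitting sentences into runs of word characters is sufficient.
    word_re = re.compile(r"\w+")

    findings = []
    for sentence in sentences:
        sentence = sentence.rstrip("\n")  # drop the newline kept by readlines()
        for word in word_re.findall(sentence):
            entry = lookup.get(word)
            if entry is not None:
                findings.append([sentence, word, entry[1], entry[2]])
    return findings


findings = find_matches(data, tuples)
print(findings)
```

With this structure the work per sentence depends on the number of words in the sentence rather than on the size of the lexicon, and only one regex is ever compiled.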
