def boolean_search_and(self, text): results = [] and_tokens = self.tokenize(text) tokencount = len(and_tokens) term1 =

Question


Asked: May 16, 2026 (2026-05-16T22:12:41+00:00)


def boolean_search_and(self, text):
    results = []
    and_tokens = self.tokenize(text)
    tokencount = len(and_tokens)

    term1 = and_tokens[0]
    print ' term 1:', term1

    term2 = and_tokens[1]
    print ' term 2:', term2

    #for term in and_tokens:
    # a term missing from the index matches no documents
    resultlist1 = self._inverted_index.get(term1, [])
    print resultlist1
    resultlist2 = self._inverted_index.get(term2, [])
    print resultlist2
    # intersection of the two sets, cast to a list
    results = list(set(resultlist1) & set(resultlist2))
    print 'results:', results

    return str(results)

This code works well for two tokens, e.g. text = "Hello World", which gives tokens = ['hello', 'world']. I want to generalize it to any number of tokens, so that the text can be a sentence or an entire text file.

self._inverted_index is a dictionary that maps each token (the key) to the list of DocIDs of the documents in which that token occurs (the value).

hello -> [1,2,5,6]
world -> [1,3,5,7,8]
result:
hello AND world -> [1,5]
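
As a small illustration of that mapping, here is a standalone sketch of the two-term case using a plain dictionary. The variable names `inverted_index` and `both` are made up for this example and are not part of the class above; sorting is only there to make the output order predictable.

```python
# Posting lists taken from the example above: token -> DocIDs
inverted_index = {
    "hello": [1, 2, 5, 6],
    "world": [1, 3, 5, 7, 8],
}

# AND of two terms = intersection of their DocID sets
both = sorted(set(inverted_index["hello"]) & set(inverted_index["world"]))
print(both)  # [1, 5]
```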

I want to get the same kind of result for more than two terms, say:
(((hello AND computer) AND science) AND world)

I am trying to make this work for any number of words, not just two. I only started working in Python this morning, so I'm unaware of a lot of the features it has to offer.

Any ideas?


1 Answer

Editorial Team · Answer 1 · 2026-05-16T22:12:42+00:00


"I want to generalize it for multiple tokens"

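def boolean_search_and_multi(self, text):
    and_tokens = self.tokenize(text)
    if not and_tokens:
        return []  # nothing to search for
    # Start from the DocIDs of the first token; a token missing from
    # the index matches no documents, hence the empty default.
    results = set(self._inverted_index.get(and_tokens[0], ()))
    for tok in and_tokens[1:]:
        # Keep only the DocIDs that also contain this token.
        results.intersection_update(self._inverted_index.get(tok, ()))
    return list(results)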
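
To try the loop above outside the original class, here is a minimal self-contained sketch. `TinyIndex`, its whitespace-based `tokenize`, and the posting lists for "computer" and "science" are assumptions made up for illustration; the real class and tokenizer may behave differently. This variant also intersects the smallest posting lists first, which is an optional tweak and not something the question requires.

```python
class TinyIndex(object):
    """Hypothetical stand-in for the asker's class, for illustration only."""

    def __init__(self, inverted_index):
        # token -> list of DocIDs, as described in the question
        self._inverted_index = inverted_index

    def tokenize(self, text):
        # Assumption: lowercase the text and split on whitespace.
        return text.lower().split()

    def boolean_search_and_multi(self, text):
        and_tokens = self.tokenize(text)
        if not and_tokens:
            return []
        # A token absent from the index contributes an empty set,
        # so the whole AND query then matches nothing.
        postings = [set(self._inverted_index.get(tok, ()))
                    for tok in and_tokens]
        # Intersect the smallest sets first to keep intermediates small.
        postings.sort(key=len)
        results = postings[0]
        for docs in postings[1:]:
            results &= docs
        return sorted(results)


index = TinyIndex({
    "hello": [1, 2, 5, 6],
    "world": [1, 3, 5, 7, 8],
    "computer": [1, 4, 5],  # made-up posting list
    "science": [1, 5, 9],   # made-up posting list
})
hits = index.boolean_search_and_multi("hello computer science world")
print(hits)  # [1, 5]
```

Because set intersection is associative and commutative, the order in which the tokens are combined does not change the result, so (((hello AND computer) AND science) AND world) can be evaluated in any order.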
