The Archive Base


Editorial Team
Asked: June 13, 2026


I am trying to implement unranked boolean retrieval. For this, I need to construct a tree and perform a DFS to retrieve documents. I have the leaf nodes, but I am having difficulty constructing the tree.

E.g.: query = OR ( AND (maria sharapova) tennis)

Result:

        OR
       |  |
     AND  tennis
     |  |
 maria  sharapova

I would then traverse the tree using DFS and apply the boolean operators to the document ids at each node to identify the matching documents in the corpus. Can someone help me with the design of this in Python? So far, I have parsed the query and retrieved the leaf nodes.

EDIT: I am new here, so apologies for the lack of clarity. I am basically trying to build a very naive search engine. The user enters a boolean query such as OR ( AND (maria sharapova) tennis), and documents from a corpus of Wikipedia articles are displayed depending on the query.

So far, I have parsed the query to retrieve the individual operators (OR, AND, etc.) and the individual search terms (maria, tennis, etc.). The parsing code is just a function that groups the operators and query terms as typed, i.e. (maria sharapova), (tennis), OR, AND. I parsed the query this way so that I can build the tree bottom-up. Then, using the inverted lists for the corresponding keywords (tennis, maria, sharapova, etc.), I perform the boolean operations on those lists to get the matching document ids. Each document id is then passed to an API that retrieves the corresponding Wikipedia page.

For more detail about the problem at hand, please refer to this document:
http://www.ccs.neu.edu/home/jaa/CSG339.06F/Lectures/boolean.pdf

1 Answer

  1. Editorial Team
    Added an answer on June 13, 2026 at 11:25 am

    First, if you want a richer query syntax that supports many operators, range queries, or wildcards, you should definitely look at a lex/yacc solution, as Joran pointed out.

    Second, from the lecture slides you posted, I think you care more about how to implement the boolean query model than about constructing a tree in Python. In that case you don't need to worry much about parsing the query itself. Suppose the query is well formatted, as below:

    "OR ( AND ( maria sharapova ) tennis )"
    

    That is, there is a space between each operator (AND/OR), keyword, and parenthesis. Then you only need two stacks (without running DFS on a tree data structure) to parse the query and compute the combined search results.
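The spacing assumption matters because it lets the query be split into tokens with a plain `str.split()`. As a minimal sketch, the hypothetical helper `tokenize` below (not part of the original sample code) also accepts the less strictly spaced form from the question by treating each parenthesis as its own token:

```python
import re

def tokenize(query):
    """Split a boolean query into operators, terms, and parentheses.

    Each parenthesis becomes its own token, so spaces around
    parentheses are optional. Terms are any run of characters
    that contains neither whitespace nor parentheses.
    """
    return re.findall(r"\(|\)|[^\s()]+", query)

# Well-spaced form used in this answer.
spaced = tokenize("OR ( AND ( maria sharapova ) tennis )")

# Form typed in the question, with parentheses touching the terms.
unspaced = tokenize("OR ( AND (maria sharapova) tennis)")
```

Both calls yield the same token list, so the rest of the processing does not depend on how carefully the user spaced the query.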

    The first stack holds the operators (AND/OR) and the operands (e.g., maria, tennis). Parentheses act as open/close markers that tell you when to process the operands currently on top of the stack: you only run the search operation when you see a close parenthesis ).

    The second stack holds the current search results.

    Let’s do a step-by-step demo using the above example.
    You scan the query from left to right.

    Step 1. You push the “OR” operator into the stack.

    +               +
    +               +
    +    OR         +
    + + + + + + + + +
    

    Step 2. You see an open parenthesis ( and just skip it.

    Step 3. You push the “AND” operator into your stack. Now the stack looks like below:

    +               +
    +    AND        +
    +    OR         +
    + + + + + + + + +
    

    Step 4. You skip another (.

    Step 5. You push “maria” to your stack.

    Step 6. You push “sharapova” to your stack. Now the stack looks like below:

    +   sharapova   +
    +    maria      +
    +    AND        +
    +    OR         +
    + + + + + + + + +
    

    Step 7. You see a close parenthesis ). Now it’s time to do the first operation. You pop items off the top of the stack until you reach an operator, then pop the operator as well to get the current operator. You run the search for “sharapova” and “maria” separately and combine the search results using the operator “AND”. Assume that for “maria” you get 3 doc ids: [1, 2, 3], and for “sharapova” you get another 5 doc ids: [2, 3, 8, 9, 10]. After you combine the results with “AND”, you have [2, 3] in the second stack, which holds the current search results. The current situation is shown below, with the result buffer on the right.

    +               +           +         +
    +               +           +         +
    +               +           +         +
    +    OR         +           +  [2,3]  +
    + + + + + + + + +           + + + + + +
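
The combination in step 7 can be sketched as a merge over two sorted doc-id lists, which is the usual way to combine inverted lists. The function names `intersect` and `union` are illustrative assumptions, and the sketch assumes each list is sorted and free of duplicates:

```python
def intersect(a, b):
    """AND: doc ids that appear in both sorted lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def union(a, b):
    """OR: doc ids that appear in either sorted list, without repeats."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    # One list is exhausted; the rest of the other is already sorted.
    out.extend(a[i:])
    out.extend(b[j:])
    return out

maria = [1, 2, 3]
sharapova = [2, 3, 8, 9, 10]
both = intersect(maria, sharapova)
```

With the example lists from step 7, `both` holds the doc ids shared by “maria” and “sharapova”.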
    

    Step 8. You push tennis to the stack.

    +               +           +         +
    +               +           +         +
    +    tennis     +           +         +
    +    OR         +           +  [2,3]  +
    + + + + + + + + +           + + + + + +
    

    Step 9. You see another close parenthesis ). Again, you pop items off the top of the stack until you reach “OR”. You run the search for “tennis” and suppose you get the doc ids [3, 5, 7]. You then combine this result with the previous result in your buffer using the operator “OR”, so that you finally get the doc ids [2, 3, 5, 7].
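
The nine steps above can be sketched roughly as follows. This is not the original sample code. It assumes a well-spaced query and a hypothetical in-memory `index` dict that maps each term to its doc ids. One detail is my own assumption: after each group is evaluated, a placeholder (`_SUBRESULT`) is pushed onto the first stack so that the next close parenthesis knows to take one entry from the result buffer. The diagrams above do not show such a marker.

```python
from functools import reduce

OPERATORS = {"AND", "OR"}
_SUBRESULT = object()  # assumed placeholder: "operand is on the result stack"

def evaluate(query, index):
    """Evaluate a prefix boolean query such as 'OR ( AND ( a b ) c )'."""
    stack = []    # first stack: operators, terms, placeholders
    results = []  # second stack: intermediate doc-id sets
    for token in query.split():
        if token == "(":
            continue  # open parentheses are skipped
        if token != ")":
            stack.append(token)
            continue
        # Close parenthesis: collect operands down to the nearest operator.
        operands = []
        while stack and stack[-1] not in OPERATORS:
            item = stack.pop()
            if item is _SUBRESULT:
                operands.append(results.pop())
            else:
                operands.append(set(index.get(item, ())))
        if not stack or not operands:
            raise ValueError("malformed query near ')'")
        op = stack.pop()
        combine = set.intersection if op == "AND" else set.union
        results.append(reduce(combine, operands))
        stack.append(_SUBRESULT)
    if stack != [_SUBRESULT] or len(results) != 1:
        raise ValueError("malformed query")
    return sorted(results[0])

index = {
    "maria": [1, 2, 3],
    "sharapova": [2, 3, 8, 9, 10],
    "tennis": [3, 5, 7],
}
doc_ids = evaluate("OR ( AND ( maria sharapova ) tennis )", index)
```

Sets are used here only for brevity. The placeholder keeps sub-results matched to the right operator when a query contains several nested groups side by side.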

    My sample code is here. Note that I simulate the search by randomly sampling len(word) integers as the returned doc ids.
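
A simulated search of that kind might look like the sketch below. This is a reconstruction under assumptions rather than the linked sample code: the name `fake_search`, the `max_doc_id` bound, and the optional `rng` argument are all hypothetical.

```python
import random

def fake_search(word, max_doc_id=20, rng=random):
    """Stand-in for a real index lookup.

    Returns len(word) distinct doc ids drawn at random from
    1..max_doc_id, sorted so they look like a posting list.
    """
    k = min(len(word), max_doc_id)  # cannot sample more ids than exist
    return sorted(rng.sample(range(1, max_doc_id + 1), k))

# A seeded generator makes a demo run repeatable.
ids = fake_search("tennis", rng=random.Random(0))
```

Replacing `fake_search` with a lookup into the real inverted lists is the only change needed once the stack logic works.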

    The printout from the code shows, step by step, the state of the system before each query item is processed: the current query item (1st column), the status of the result buffer (2nd column), the items in the stack (3rd column), and the immediate search result (4th column).
