The Archive Base

Editorial Team
Asked: June 15, 2026

For a script I need to compare ad titles against a Lucene index.
This index contains a couple of keywords and the action to take if the ad matches.

For example:

(keyword,action,new_category,optional)
"red volvo","recategorize","cars","red"

The idea is that I need to query the whole ad title against the keyword field. Both (query and index) are analyzed with my own analyzer which has stemming, lowercasing, etc.

The problem I’m having is with partial matches. For example:
“I am selling a red horse” is matching “red volvo”.

If it were the other way around (the ads were indexed and I would need to query by the keyword) I could do:

q=+red +volvo

But that’s not an option due to the huge amount of ads I need to process.

So, the concrete question: is there a way to force all tokens in a field to be matched against the query?
I could use a KeywordAnalyzer so the whole ‘red volvo’ is seen as one token, but I cannot analyze the whole ad title as a single keyword, because it won’t match anything.

1 Answer

  1. Editorial Team
    Added an answer on June 15, 2026 at 12:20 pm

    Given that you want to match the phrase “red volvo” exactly, but never just “red” or “volvo”, I think you are on the right track with indexing it using the KeywordAnalyzer. But you want to search with a query that is longer than the field you're searching, which is sort of the reverse of the typical use case.

    I hesitate to recommend it, but I think the right way to go about this query might be to use a different analyzer to query than the one you use to create the index.

    If the phrases indexed are of a predictable size, say 2-5 words, then using a ShingleFilter could produce the terms you need from a long query to search it as a Keyword.

    Something like this:

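    // Lucene 3.6. queryText (the ad title), defaultField and searcher are assumed to exist.
    Analyzer analyzer = new WhitespaceAnalyzer(Version.LUCENE_36);
    // Wrapper that adds a ShingleFilter to the analyzer. The minimum shingle size
    // must be at least 2; single tokens are still emitted by default.
    analyzer = new ShingleAnalyzerWrapper(analyzer, 2, 5);
    // StandardQueryParser is not a subclass of the classic QueryParser
    StandardQueryParser parser = new StandardQueryParser(analyzer);
    Query query = parser.parse(queryText, defaultField); // throws QueryNodeException
    searcher.search(query, 10);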
    

    This will split only on whitespace, and then produce search terms of 1 to 5 tokens in length. In the example, “I am selling a red horse” will produce terms like “I”, “am”, “I am”, “red horse”, “I am selling”, “am selling a red horse”, etc.
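To make the generated terms concrete, here is a minimal sketch in plain Java that mimics what the shingling step produces and checks each shingle for an exact match against a set of keyword phrases. It does not use the Lucene API: the class `ShingleMatchSketch` and its `shingles` and `matches` methods are hypothetical names chosen for illustration, the `HashSet` stands in for the keyword-analyzed index, and it assumes no lowercasing or stemming is applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ShingleMatchSketch {

    // Split on runs of whitespace, then emit every run of minSize..maxSize
    // consecutive tokens, joined by a single space.
    public static List<String> shingles(String text, int minSize, int maxSize) {
        List<String> out = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return out;
        }
        String[] tokens = trimmed.split("\\s+");
        for (int start = 0; start < tokens.length; start++) {
            for (int size = minSize; size <= maxSize && start + size <= tokens.length; size++) {
                out.add(String.join(" ", Arrays.copyOfRange(tokens, start, start + size)));
            }
        }
        return out;
    }

    // Return the keyword phrases that appear as a whole shingle of the title.
    // The set is a stand-in (assumption) for the keyword-analyzed index.
    public static List<String> matches(String title, Set<String> keywords) {
        List<String> hits = new ArrayList<>();
        for (String shingle : shingles(title, 1, 5)) {
            if (keywords.contains(shingle) && !hits.contains(shingle)) {
                hits.add(shingle);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Set<String> keywords = new HashSet<>(Arrays.asList("red volvo"));
        System.out.println(matches("I am selling a red horse", keywords)); // prints []
        System.out.println(matches("I am selling a red volvo", keywords)); // prints [red volvo]
    }
}
```

Because a keyword only matches when one whole shingle equals it, “red horse” never hits “red volvo”, which is the behaviour the question asks for.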

    I think a whitespace tokenizer is probably the best choice for making this work with keywords, but the shingles are joined with a single space. If the indexed keywords contain whitespace characters other than spaces, or more than one space in a row, they will not match the generated terms and you may run into problems.
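One way to sidestep the whitespace issue is to normalize each keyword phrase before indexing it, so it has the same single-space form as the generated shingles. The helper below is a small sketch under that assumption; `WhitespaceNormalizeSketch.normalize` is a hypothetical name, not part of Lucene, and it only handles the whitespace characters matched by Java's `\s`.

```java
public class WhitespaceNormalizeSketch {

    // Strip leading and trailing whitespace, then collapse every run of
    // whitespace (spaces, tabs, newlines) into a single space.
    public static String normalize(String phrase) {
        return phrase.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(normalize("red \t  volvo ")); // prints red volvo
    }
}
```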
