I’m creating a dictionary app that searches both a word and its definition (two separate Fields). However, using a StandardAnalyzer, no search results are returned when the search string contains whitespace or special characters.
For example, in my dummy dictionary data, searching for “lorem” returns all words that have “lorem” in their definitions, but searching for “lorem ipsum” returns nothing, even though most of my dummy words have “lorem ipsum” in the definition.
Also, searching for words like “make-believe” only returns results when typing “make,” but as soon as I include the dash, nothing is returned.
I want to include characters like whitespace, dashes, commas, and so on, basically everything in the search string (except perhaps stop words like “and,” “at,” “by,” etc.). What analyzer should I use? I’ve tried PatternAnalyzer with .+ as the Pattern, but typing even a single letter returns nothing.
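One possible explanation for the PatternAnalyzer result, assuming it treats the supplied Pattern as a token delimiter in the same way String.split does: .+ matches the entire input, so everything is consumed as a separator and no tokens remain. The sketch below only uses String.split to illustrate that effect; the class and method names are hypothetical and it does not call Lucene.

```java
class SplitDemo {
    // Counts the non-empty pieces left after splitting on the delimiter regex.
    // Assumption: this mirrors how a split-style analyzer would produce tokens.
    static int countTokens(String text, String delimiterRegex) {
        int count = 0;
        for (String piece : text.split(delimiterRegex)) {
            if (!piece.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // ".+" swallows the whole string as one delimiter: nothing is left.
        System.out.println(countTokens("lorem", ".+"));
        // "\\W+" treats only non-word runs as delimiters: words survive.
        System.out.println(countTokens("lorem ipsum", "\\W+"));
    }
}
```

Under that assumption, a delimiter pattern such as \W+ (runs of non-word characters) would be closer to what was intended than .+.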
I stuck with a StandardAnalyzer since there doesn’t seem to be an alternative. What I did is tokenize the string via a regex that captures non-word characters, combine them in an AND BooleanQuery, and combine the query for the two Fields in another OR BooleanQuery.
In my code below, entry is the word, description is the definition, and s is the search string as a CharSequence.
This is horrendously slow in my Android app right now but I can optimize it later. 🙂
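As a rough illustration of the approach described above (a sketch, not the original listing), the snippet below splits the search string on runs of non-word characters and writes out the resulting boolean structure as a plain query string: every token must match within a field (AND), and a hit in either field is enough (OR). It deliberately avoids the Lucene API, so the class name, the helper names, and the lowercasing step are assumptions made for illustration; only the field names entry and description and the search string s come from the post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class QuerySketch {
    // Split the search string on runs of non-word characters
    // (whitespace, dashes, commas, ...). Note: \W is ASCII-based in Java by
    // default, so non-ASCII letters would also act as separators here.
    static List<String> tokenize(CharSequence s) {
        List<String> tokens = new ArrayList<>();
        // Lowercasing is an assumption, made to match terms that an
        // analyzer lowercased at index time.
        for (String part : s.toString().toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // All tokens must occur in the given field: (field:a AND field:b ...).
    static String allTermsIn(String field, List<String> tokens) {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < tokens.size(); i++) {
            if (i > 0) {
                sb.append(" AND ");
            }
            sb.append(field).append(':').append(tokens.get(i));
        }
        return sb.append(')').toString();
    }

    // A match in either the word or its definition is accepted.
    static String buildQuery(CharSequence s) {
        List<String> tokens = tokenize(s);
        if (tokens.isEmpty()) {
            return "";
        }
        return allTermsIn("entry", tokens) + " OR " + allTermsIn("description", tokens);
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("make-believe"));
    }
}
```

In a real Lucene setup the same shape would be built from one term query per token inside an AND-style BooleanQuery per field, with the two per-field queries wrapped in an OR-style BooleanQuery, as the text describes.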