I have the fields “title” and “keyword” in my Lucene (3.6) documents. When I have an object with title=Testfair 2012-09 and a keyword of someTest, I write the document like:
Document doc = new Document();
doc.add(new Field("title", title, Field.Store.NO, Field.Index.ANALYZED));
doc.add(new Field("keyword", keyword, Field.Store.NO, Field.Index.ANALYZED));
For searching I use
QueryParser queryParser = new MultiFieldQueryParser(Version.LUCENE_36, new String[] { "title", "keyword" }, new StandardAnalyzer(Version.LUCENE_36));
queryParser.setDefaultOperator(QueryParser.AND_OPERATOR);
queryParser.setAllowLeadingWildcard(true);
Query query = queryParser.parse(queryString);
IndexSearcher searcher = createSearcher();
TopScoreDocCollector collector = TopScoreDocCollector.create(1000, true);
searcher.search(query, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;
Via Luke, I can see that the title field in the index contains the terms "Testfair", "2012" and "09".
Now I’d like the following behaviour when searching:
Testfair 2012-09 -> match (1)
estfair -> match (2)
Testfair baz -> no match (3)
I am not sure how to handle this because I need the implicit wildcard search for case (2). If I split the search term at whitespace and add * before and after every word, I get the query +(title:*testfair*) +(title:*2012-09*), so the 2012-09 is not split the way it was at index time and no result is found. If I understand correctly, the problem lies in the use of the MultiFieldQueryParser, but I don’t know how to set up the search correctly or whether I should modify the indexing process instead.
Any help appreciated! Thanks!
In the meantime, I got the book “Lucene in Action” and took the advice from there: create a “catch all” field that contains the content of all searchable fields, and search only that field. With this trick I can skip the MultiFieldQueryParser because I only have one field to search. Now I can simply analyze the search term and modify it the way I need.
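For anyone who wants to see the "analyze the search term and modify it" step spelled out, here is a minimal sketch in plain Java, without any Lucene classes. It rests on some assumptions: the catch-all field is called "all" (a made-up name, not from my real code), and the tokenize method is only a rough stand-in for what StandardAnalyzer does (split on anything that is not a letter or digit, then lowercase). In real code you would probably run the input through the same analyzer you use for indexing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class CatchAllQuerySketch {

    // Hypothetical name of the catch-all field; pick whatever you used at index time.
    static final String CATCH_ALL_FIELD = "all";

    // Rough approximation of the analyzer: split on runs of characters that are
    // neither letters nor digits, then lowercase each piece.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<String>();
        for (String part : input.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Builds a query string in which every token is required (+) and wrapped
    // in leading and trailing wildcards, all against the single catch-all field.
    static String buildQueryString(String userInput) {
        StringBuilder sb = new StringBuilder();
        for (String token : tokenize(userInput)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('+').append(CATCH_ALL_FIELD).append(":*").append(token).append('*');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: +all:*testfair* +all:*2012* +all:*09*
        System.out.println(buildQueryString("Testfair 2012-09"));
    }
}
```

The resulting string can then go to a plain QueryParser for the catch-all field, with setAllowLeadingWildcard(true) as before. Because the tokens contain only letters and digits, nothing needs to be escaped. Keep in mind that leading wildcards can be slow on a large index.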