I am looking for a keyword indexing library for Java. I found Lucene through a Google search. It seems to be very popular, but I am wondering whether it is the best indexing library in terms of speed (of course this can be subjective, but your opinion should be good enough for a beginner like me). Is the example at http://snippets.dzone.com/posts/show/4020 good enough, or do you have a better recommendation? Thanks in advance.
We tested Lucene (the .NET version) against MSSQL's Full-Text Search. The comparison is rather difficult, since the two systems approach indexing in ways that are not directly comparable, so we ran it on a well-defined task: index products with multiple text fields (where the fields carry different weights in the search results) and let users search those products.
Lucene won because we have full control over how queries are composed, we can decide which indexes are kept in memory and which are stored on the filesystem, and we are not restricted by language packs (MSSQL FTS supports only a limited list of languages). Lucene also lets us use a non-static noise word dictionary (we used a different set of noise words for each product category).
So it is hard to talk about pure performance, but Lucene's rich functionality opens up many ways to optimize.
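To make the two ideas above more concrete for a beginner (fields with different weights, and a noise word set chosen per index), here is a rough sketch in plain Java. It deliberately does not use Lucene's API: the class `TinyProductIndex`, its methods, and the weights are all made up for illustration, and the scoring is a simple sum of field weights rather than anything Lucene actually computes.

```java
import java.util.*;

// Toy inverted index. NOT Lucene's API: a hypothetical illustration only.
class TinyProductIndex {
    // term -> (document id -> accumulated field weight)
    private final Map<String, Map<Integer, Double>> postings = new HashMap<>();
    // Noise (stop) words are supplied per index, so each category can have its own set.
    private final Set<String> noiseWords;

    TinyProductIndex(Set<String> noiseWords) {
        this.noiseWords = noiseWords;
    }

    // Lowercase, split on anything that is not a letter or digit, drop noise words.
    private List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!t.isEmpty() && !noiseWords.contains(t)) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Index one field of a document; fieldWeight is an assumed stand-in for a field boost.
    void addField(int docId, String text, double fieldWeight) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new HashMap<>())
                    .merge(docId, fieldWeight, Double::sum);
        }
    }

    // Return ids of documents matching any query term, highest score first (ties by id).
    List<Integer> search(String query) {
        Map<Integer, Double> scores = new HashMap<>();
        for (String term : tokenize(query)) {
            Map<Integer, Double> hits =
                    postings.getOrDefault(term, Collections.<Integer, Double>emptyMap());
            for (Map.Entry<Integer, Double> e : hits.entrySet()) {
                scores.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
        List<Integer> result = new ArrayList<>(scores.keySet());
        result.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : Integer.compare(a, b);
        });
        return result;
    }
}
```

With one index per product category, each constructed with its own noise word set, a product title indexed at weight 3.0 and a description at weight 1.0 would rank title matches above description-only matches. In Lucene itself the equivalent pieces are analyzers with stop word sets and query-time boosts, but check the documentation for your version, since those APIs have changed over time.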