I am building a full-text search facility for my website, which is written in ASP.NET MVC with a MySQL database. The website is in a non-English language. I have started work on it using Lucene as the search engine, but I can’t find any information on whether it supports Unicode.
Does anyone know whether Lucene supports Unicode? I don’t want a nasty surprise.
Also, links to beginner articles on implementing Lucene.NET would be appreciated.
Yes, it fully supports Unicode.
For analysis, however, you should explicitly assign appropriate stemmers and the correct stop words for your language.
As for a sample, here is a copy from our last project.
I’m querying Organization objects from NHibernate and putting them into Lucene.NET.
Here is a simple search.
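To illustrate the indexing and search steps described above, here is a minimal sketch. It is not the code from the project mentioned; it assumes the Lucene.NET 3.0.x API and uses an in-memory index. The sample strings, the `name` field, the class name and the tiny stop word set are all hypothetical and chosen only for illustration. A real application would read its data from the database and use a proper stop word list, and possibly a stemmer, for its language.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version; // avoid the clash with System.Version

class UnicodeSearchSketch
{
    static void Main()
    {
        // Let the console print non-ASCII text where the terminal allows it.
        Console.OutputEncoding = System.Text.Encoding.UTF8;

        // Hypothetical sample data (assumption); real data would come from the database.
        var names = new[]
        {
            "Московский государственный университет",
            "Библиотека на Невском"
        };

        // Illustrative stop words only (assumption), not a complete list for any language.
        var stopWords = new HashSet<string> { "и", "в", "на" };
        var analyzer = new StandardAnalyzer(Version.LUCENE_30, stopWords);

        using (var directory = new RAMDirectory())
        {
            // Index each name as an analyzed, stored field.
            using (var writer = new IndexWriter(directory, analyzer,
                                                IndexWriter.MaxFieldLength.UNLIMITED))
            {
                foreach (var name in names)
                {
                    var doc = new Document();
                    doc.Add(new Field("name", name, Field.Store.YES, Field.Index.ANALYZED));
                    writer.AddDocument(doc);
                }
                writer.Commit();
            }

            // Search with the same analyzer that was used for indexing.
            using (var searcher = new IndexSearcher(directory, true))
            {
                var parser = new QueryParser(Version.LUCENE_30, "name", analyzer);
                Query query = parser.Parse("университет");
                TopDocs hits = searcher.Search(query, 10);

                foreach (ScoreDoc hit in hits.ScoreDocs)
                {
                    Console.WriteLine(searcher.Doc(hit.Doc).Get("name"));
                }
            }
        }
    }
}
```

The point to note is that the same analyzer is used for both indexing and query parsing, so the Unicode text is tokenized and lowercased the same way on both sides. Swapping in a language-specific analyzer should only require changing the line that creates `analyzer`.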