I’m working on a web application in ASP.NET MVC which involves a fairly complex (I think) search situation. Basically, I have a bunch of entries with a title and content. These are the fields that I want to provide full-text search for. The catch is that I also keep track of a rating on these entries (like up-vote/down-vote). I’m using MongoDB as my database, and I have a separate collection for all these votes. I plan on using a map/reduce function to turn all of the documents in the votes collection into a single “score” for the article. When I perform a search, I want the article’s score to influence the rankings.
I’ve been looking at many different full-text search services, and it looks like all the cool kids are using Lucene (and in my case, Lucene.NET). The problem is that since the score is not part of the document when I first create the index, I don’t know how I would set up Lucene. Each time somebody votes for an article, do I need to update the Lucene index? I’m a little lost here.
I haven’t written any of this code yet, so if you have a better way to solve this problem, please share.
What's the problem? Just use a default value for the rating/votes (probably 0), and update it later when people vote.
No, updating the index on every vote can be expensive and slow. Your app will probably see a huge volume of updates, and Lucene can be slow when you flush to disk often. In general, for almost any full-text search, real-time updates are not as important as the search itself. So I suggest the following strategy:
Solution #1:
1. Create a collection in MongoDB where you store all updates related to Lucene.
2. After this, create a tool that processes all these updates in the background (once every 10 minutes, for example). Keep in mind that you should flush data to disk only after, say, 10,000 Lucene updates/inserts/deletes, so that index updates stay fast.
With the above solution your data can be stale for up to 10 minutes, but inserts will be faster.
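To make the two steps above more concrete, here is a rough sketch in Python (used only for illustration; the same shape would apply in C# with Lucene.NET). Everything in it is an assumption of mine: `FakeIndex` is an in-memory stand-in for the Lucene index writer, the `pending` list stands in for the MongoDB collection of queued updates, and `process_pending` and `commit_every` are hypothetical names. Keeping only the latest score per article is an extra optimization I added, not something required by the approach.

```python
from dataclasses import dataclass, field


@dataclass
class FakeIndex:
    """In-memory stand-in for a Lucene index writer (hypothetical)."""
    docs: dict = field(default_factory=dict)  # article_id -> latest score
    commits: int = 0                          # number of simulated disk flushes

    def update(self, article_id, score):
        self.docs[article_id] = score

    def commit(self):
        self.commits += 1


def process_pending(pending, index, commit_every=10_000):
    """Drain queued updates into the index, committing once per batch.

    `pending` is a list of (article_id, score) tuples, oldest first. It stands
    in for the MongoDB collection that holds the queued index updates.
    """
    # Assumption: only the newest score per article matters, so repeated
    # votes on the same article cost a single index write.
    latest = {}
    for article_id, score in pending:
        latest[article_id] = score
    pending.clear()

    uncommitted = 0
    for article_id, score in latest.items():
        index.update(article_id, score)
        uncommitted += 1
        if uncommitted >= commit_every:
            index.commit()
            uncommitted = 0
    if uncommitted:
        index.commit()  # flush the final partial batch
    return len(latest)
```

A scheduler (a timer, a cron-style job, or similar) would call `process_pending` every 10 minutes or so; the web application itself only appends to the queue when someone votes.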
Solution #2:
I would go with #1, because it should be less expensive for the server.
Choose whichever you like more.