I have a site where users can post questions, so I have a table in MySQL like this:
question_id, user_id, tags, views, creation_date
What I want is to be able to
- perform searches that return question_ids based on the tags, ordered by
  - views
  - date (newest, this week, this month)
- or search for a specified user and return question_ids, again ordered by views and date.
How should I bring everything into Solr, as far as indexing is concerned?
Will I have to index tags, views, and date? What should I index to get maximal performance?
First, think about whether using Lucene/Solr is really a benefit for you. I don't want to be misunderstood, but if you only want to look up a specific user ID in a user_id column, you don't need an additional full-text search engine.
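To illustrate the point about user lookups, here is a minimal sketch of doing it in the database alone. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the table name `questions`, the index name, and the sample rows are all made up for the example.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL table from the question (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE questions ("
    "question_id INTEGER PRIMARY KEY, user_id INTEGER, tags TEXT, "
    "views INTEGER, creation_date TEXT)"
)
# A composite index lets the database locate one user's rows by user_id
# and read them in views order.
conn.execute("CREATE INDEX idx_user_views ON questions (user_id, views)")

# Made-up sample data.
rows = [
    (1, 10, "php mysql", 50, "2011-01-01"),
    (2, 10, "solr", 200, "2011-01-05"),
    (3, 11, "solr lucene", 75, "2011-01-03"),
]
conn.executemany("INSERT INTO questions VALUES (?, ?, ?, ?, ?)", rows)


def question_ids_for_user(user_id):
    """Return the user's question_ids, most viewed first, newest first on ties."""
    cur = conn.execute(
        "SELECT question_id FROM questions WHERE user_id = ? "
        "ORDER BY views DESC, creation_date DESC",
        (user_id,),
    )
    return [r[0] for r in cur.fetchall()]


print(question_ids_for_user(10))
```

The same SELECT with an index on (user_id, views) should carry over to MySQL more or less unchanged.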
Anyway, maybe you just want a little project to "play with" Solr.
So here are the answers to your questions:
Put everything you need to search for into Solr/Lucene. Use the DIH (DataImportHandler) http://wiki.apache.org/solr/DataImportHandler to let Solr walk through your table and index the data.
Yes. You have to index everything you want to work with.
By the way, there is a difference between indexing and storing data. You can index fields (like tags, user_id, views, ...) without also storing them in your Lucene index. Storing a field is only necessary if Lucene/Solr has to return that field's value with the search results.
Otherwise, Solr only returns the uniqueKey (primary key) of the matching documents and you have to fetch the data from the database (... where pk = <Lucene result>).
So you don't need to store fields that are only relevant for sorting, for example.
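The "get the keys from Solr, then fetch the rows from the database" flow could look roughly like the sketch below. It only builds the request URL and the SQL statement and does not talk to a server. The core URL, the `questions` core/table name, and the helper function names are assumptions; `q`, `fq`, `sort`, `fl`, `rows` and `wt` are standard Solr query parameters.

```python
from urllib.parse import urlencode

# Hypothetical Solr core URL; adjust host, port and core name to your setup.
SOLR_SELECT = "http://localhost:8983/solr/questions/select"


def build_query(tag=None, user_id=None, since=None, sort="views desc", rows=20):
    """Build a Solr select URL that returns only question_id values."""
    params = [("q", "*:*")]
    if tag is not None:
        # Assumes each tag is indexed as its own value and has no special characters.
        params.append(("fq", f"tags:{tag}"))
    if user_id is not None:
        params.append(("fq", f"user_id:{user_id}"))
    if since is not None:
        # Solr date math, e.g. "NOW-7DAYS" or "NOW-1MONTH".
        params.append(("fq", f"creation_date:[{since} TO *]"))
    params += [("sort", sort), ("fl", "question_id"),
               ("rows", str(rows)), ("wt", "json")]
    return SOLR_SELECT + "?" + urlencode(params)


def ids_from_response(response):
    """Extract question_ids, in result order, from a parsed Solr JSON response."""
    return [doc["question_id"] for doc in response["response"]["docs"]]


def fetch_sql(ids):
    """SQL and parameters for loading the matched rows (skip the query if ids is empty).

    The %s placeholders follow the style of common MySQL drivers. IN (...)
    does not preserve Solr's ordering, so re-sort the rows by ids afterwards.
    """
    placeholders = ", ".join(["%s"] * len(ids))
    return f"SELECT * FROM questions WHERE question_id IN ({placeholders})", list(ids)


# Most viewed questions tagged "solr" from the last week.
print(build_query(tag="solr", since="NOW-7DAYS"))
# Newest questions of one user.
print(build_query(user_id=10, sort="creation_date desc"))
```

Only question_id appears in `fl`, so tags, views and creation_date are used for filtering and sorting but never have to be returned by Solr.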
Index only those fields (columns) that you need to work with in Solr. Don't index fields you will never search on or sort by.
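As a rough sketch of what that could mean for the table in the question, the snippet below builds an "add-field" payload for Solr's Schema API, with every field indexed and only question_id stored. The field type names (`string`, `pint`, `pdate`) are an assumption based on Solr's default configset and may differ in your schema; the `field` helper is made up for the example. Sorting on views and creation_date additionally depends on how those field types are configured (for example docValues), so treat this as a starting point only.

```python
import json


def field(name, field_type, stored=False, multi_valued=False):
    """One field definition: always indexed, stored only on request."""
    return {
        "name": name,
        "type": field_type,
        "indexed": True,
        "stored": stored,
        "multiValued": multi_valued,
    }


# Field types are assumed to exist in the target schema.
payload = {
    "add-field": [
        field("question_id", "string", stored=True),  # returned to the client
        field("user_id", "pint"),                      # filter only
        field("tags", "string", multi_valued=True),    # one value per tag
        field("views", "pint"),                        # sort only
        field("creation_date", "pdate"),               # filter and sort
    ]
}

# This JSON would be POSTed to the core's /schema endpoint.
print(json.dumps(payload, indent=2))
```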