I am using a MySQL database for my webapp.
I need to search across multiple tables and multiple columns, which is very similar to full-text searching inside those columns.
I would like to hear about your experience of using a full-text search API (e.g. Solr, Lucene, MapReduce, Hadoop) compared with plain SQL, in terms of:
- Speed performance
- Extra space usage
- Extra CPU usage (is it continuously building the index?)
- How long does it take to build the index, or until it is ready for use?
- Please let me know your experience of using these frameworks.
Thanks a lot!
To answer your questions:
1.) I have a database with roughly 5 million docs. A MySQL full-text search takes 2-3 minutes. Solr/Lucene takes roughly 200-400 milliseconds for the same search.
2.) The space you need depends on your configuration, the number of copy fields, and whether you store the data or only index it. In my configuration, the full DB is indexed, but only metadata is stored. So a 30 GB DB needs 40 GB for Solr/Lucene. Keep in mind that if you want to (re)optimize your index, you temporarily need another 100% of the index size.
3.) If you migrate from a MySQL full-text index to Lucene/Solr, you save CPU power. MySQL full-text search needs much more CPU power than Solr full-text search -> see answer 1.)
4.) It depends on the number of documents, the size of the documents, and the disk speed. CPU performance is of course very important too. Indexing does not scale well across multiple CPUs: 2 big cores are much faster than 8 small cores.
Indexing 5 million docs (44 GB) in my environment takes 2-3 hours on a dual-core VMware server.
5.) Migrating from the MySQL full-text index to a Lucene/Solr full-text index was the best idea ever. 😉 But you will probably have to redesign your application.
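To make the numbers from points 2.) and 4.) easier to apply to your own setup, here is a rough back-of-the-envelope sketch in Python. The function names are made up for illustration, and it assumes that "another 100% of the index size" means one temporary copy of the index during optimize and that indexing speed is roughly constant over the whole run.

```python
def peak_index_disk_gb(index_gb, optimize_overhead=1.0):
    """Disk needed during an optimize: the index plus a temporary copy.

    optimize_overhead=1.0 is the "another 100%" from point 2.) (assumption).
    """
    return index_gb * (1.0 + optimize_overhead)


def docs_per_second(doc_count, hours):
    """Average indexing throughput, assuming a constant rate."""
    return doc_count / (hours * 3600)


# Figures taken from the answer above: 40 GB index, 5 million docs in 2-3 hours.
print(peak_index_disk_gb(40))                 # 80.0 GB at peak
print(round(docs_per_second(5_000_000, 3)))   # 463 docs/s (slow end)
print(round(docs_per_second(5_000_000, 2)))   # 694 docs/s (fast end)
```

So for my setup you should plan for about 80 GB of free disk, and the indexing rate works out to roughly 460-700 docs per second. Your own numbers will differ with document size and hardware.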
//Edit to answer the question "Will the Lucene index get updated immediately after some INSERT statements?"
It depends on your Solr configuration, but it is possible.
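One way this can be done, as far as I know, is to send new rows to Solr's update handler with the `commitWithin` parameter, which asks Solr to make the documents searchable within the given number of milliseconds. The Python sketch below only builds the HTTP request and does not send it. The Solr URL, the core name `articles`, the field names, and the helper `build_update_request` are all assumptions for illustration, and this is not necessarily how my own setup is configured.

```python
import json
import urllib.request


def build_update_request(solr_url, core, docs, commit_within_ms=1000):
    """Build a POST request that adds docs and asks Solr to commit soon.

    solr_url and core are placeholders; adjust them to your installation.
    """
    url = f"{solr_url}/{core}/update?commitWithin={commit_within_ms}"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical document mirroring a row that was just inserted into MySQL.
req = build_update_request(
    "http://localhost:8983/solr",
    "articles",
    [{"id": "42", "title": "Hello"}],
)
print(req.full_url)
# urllib.request.urlopen(req)  # uncomment to send to a running Solr
```

Your application would call this right after the INSERT into MySQL. A smaller `commitWithin` value makes new rows visible sooner but causes more frequent commits, so it is a trade-off between freshness and load.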