I am building a database and I'm not sure whether I need a special indexing tool or whether a plain MySQL index would suffice.
In my DB I will have about 1000 articles, each containing about 300 words. I will need to search for articles that contain most of the words from my query (e.g. "walk, walked, school, studying"; I want to find the articles in which these words occur most often).
The articles will be HTML.
The application will be used by only a few people (about 10) at a time, so there are no requirements for super-fast responses. I just want results returned in a reasonable time, around 1 second.
So, do I need an extra tool for indexing (Apache Lucene/Solr), or will a MySQL index do?
I can't say I'm a MySQL expert, as I deal more with T-SQL. However, I'd say that searching through the articles may take a while if they also include HTML, because you have to take the tags into account, and these may be malformed depending on how the HTML was saved.
Personally, I'd add an extra column to the article table containing either the plain-text version of the article or the output of some weighting algorithm that picks out the 30 most common words in the article, so that you have a much neater, more streamlined field to search.
But for 1000 articles, even that extra column seems like overkill, and MySQL should do just fine if all you're after is a response time under 1 second.
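To give a feel for how small this workload is, here is a sketch in Python (not MySQL) that scores every article by brute force, counting how often the query words occur in its plain text. The function name `rank_articles` and the sample data are invented for illustration, and it assumes the HTML has already been stripped. It matches exact words only, so "walk" and "walked" count separately unless both appear in the query, as they do in the question.

```python
import re
from collections import Counter


def rank_articles(articles, query_words):
    """Rank articles by total occurrences of the query words.

    articles: dict mapping article id -> plain text (HTML already removed).
    Returns a list of (article_id, score) pairs, highest score first;
    articles that contain none of the words are left out.
    """
    terms = {word.lower() for word in query_words}
    scores = []
    for article_id, text in articles.items():
        counts = Counter(re.findall(r"[a-z']+", text.lower()))
        score = sum(counts[term] for term in terms)
        if score:
            scores.append((article_id, score))
    # Sort by descending score; ties are broken by article id.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


sample = {
    1: "I walked to school and walked back",
    2: "Studying at school",
    3: "Nothing relevant here",
}
ranking = rank_articles(sample, ["walk", "walked", "school", "studying"])
```

With roughly 1000 articles of 300 words each, a loop like this only has to look at about 300,000 words per query, which is why a dedicated search engine is probably unnecessary at this scale.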