What I need:
I’m designing the backend for a product library which has to satisfy the following requirements:
1. Multiple editors will be editing different items at the same time, so there has to be some kind of item-level locking.
2. Wildly varying item properties: there are about 100 subcategories, each of which can have 10+ item properties specific to itself.
3. The whole item store has to be versioned: multiple changes (insertions, edits and deletions) can be made before publishing the whole set of changes to the site; unpublishing must also be possible.
4. I must be able to search all the properties and filter by some of them (i.e. find a keyword anywhere in the library or find all products that satisfy a set of criteria) within a data set of at least 10MB (i.e. 5000 items, 2KB each), and possibly twice that.
The solution should either be MySQL-specific or, better yet, vendor-agnostic.
What I’ve considered:
I’m considering using a single large XML object with all the items (to satisfy 2) stored in a database (to satisfy 3) but that makes 1 impossible and 4 difficult. I’ve used something like this before, but with smaller XML objects and no item-level locking.
The other solution I’m considering is a classic database solution using a separate table for each subcategory, which makes 1 and 2 trivial, but 3 and 4 rather difficult. It’s also a bit unwieldy considering the number of different subcategories and therefore number of different tables in the database, but I guess that can be automated.
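The automation mentioned above could look roughly like the sketch below: a dictionary of subcategory definitions drives the generation of one table per subcategory. The subcategory names, property names and the `create_table_sql` helper are all made up for illustration, and SQLite is used only so the example runs anywhere; the same idea should carry over to MySQL with adjusted column types.

```python
import sqlite3

# Hypothetical subcategory definitions: table name -> {property: SQL type}.
SUBCATEGORIES = {
    "laptops": {"screen_inches": "REAL", "ram_gb": "INTEGER"},
    "cables": {"length_m": "REAL", "connector": "TEXT"},
}


def create_table_sql(subcategory, properties):
    """Build a CREATE TABLE statement for one subcategory."""
    # Columns shared by every subcategory come first.
    columns = ["id INTEGER PRIMARY KEY", "name TEXT NOT NULL"]
    # Then one column per subcategory-specific property.
    columns += [f'"{prop}" {sql_type}' for prop, sql_type in properties.items()]
    return f'CREATE TABLE "{subcategory}" ({", ".join(columns)})'


conn = sqlite3.connect(":memory:")
for subcategory, properties in SUBCATEGORIES.items():
    conn.execute(create_table_sql(subcategory, properties))

# List the tables that were generated.
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

Identifiers are interpolated directly into the SQL here, so the definitions would have to come from trusted configuration rather than user input.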
Another possibility is a hybrid between the two, with a single large database table of all items. Each row would contain an XML object with all the item’s properties and, additionally, all the filterable properties as table fields. This solves 1 and 2 and partially solves 4, but leaves out full-text searching and still makes 3 rather difficult to achieve.
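To make the hybrid layout concrete, here is a minimal sketch, assuming `price` is one of the filterable properties. The table name, column names and the `add_item` helper are hypothetical, and SQLite stands in for MySQL so the example is self-contained.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE items (
        id INTEGER PRIMARY KEY,
        subcategory TEXT NOT NULL,
        price REAL,                  -- filterable property, duplicated as a column
        properties_xml TEXT NOT NULL -- every property, serialized as XML
    )
""")


def add_item(subcategory, properties):
    """Store all properties as XML and copy the filterable one into a column."""
    root = ET.Element("item")
    for key, value in properties.items():
        ET.SubElement(root, key).text = str(value)
    xml_blob = ET.tostring(root, encoding="unicode")
    conn.execute(
        "INSERT INTO items (subcategory, price, properties_xml) VALUES (?, ?, ?)",
        (subcategory, properties.get("price"), xml_blob),
    )


add_item("laptops", {"price": 899.0, "ram_gb": 16})
add_item("cables", {"price": 9.5, "length_m": 2})

# Filtering uses the real column; the XML is only parsed for display.
cheap = conn.execute(
    "SELECT subcategory, properties_xml FROM items WHERE price < ?", (100,)
).fetchall()
cheap_props = {child.tag: child.text for child in ET.fromstring(cheap[0][1])}
print(cheap[0][0], cheap_props)
```

Because each item lives in its own row, row-level locking stays possible, while anything not promoted to a column can only be searched by scanning the XML.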
If you’ve made it this far:
I’ll probably have a few weeks to solve it, which should leave enough time for discussion. I’ll be very grateful for any and all thoughts and insights the SO community can provide. Thanks in advance.
Option 2, the classic database solution you outlined, works well for this case.
It takes care of 1, 2 (a bit difficult, but you can overcome most of it by writing a small generic manager) and 3.
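As one illustration of how point 1 might be handled with plain tables, the sketch below claims an item through a conditional update on a lock-owner column. This is only one possible approach, not necessarily the one intended here; the `locked_by` column and both helper functions are assumptions, and SQLite is used purely to keep the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, locked_by TEXT)"
)
conn.execute("INSERT INTO items (id, name) VALUES (1, 'Widget')")


def acquire_lock(item_id, editor):
    """Claim the item only if nobody holds it; return True on success."""
    cur = conn.execute(
        "UPDATE items SET locked_by = ? WHERE id = ? AND locked_by IS NULL",
        (editor, item_id),
    )
    conn.commit()
    # Exactly one row changes when the lock was free, none otherwise.
    return cur.rowcount == 1


def release_lock(item_id, editor):
    """Release the lock, but only if this editor is the one holding it."""
    cur = conn.execute(
        "UPDATE items SET locked_by = NULL WHERE id = ? AND locked_by = ?",
        (item_id, editor),
    )
    conn.commit()
    return cur.rowcount == 1


first = acquire_lock(1, "alice")    # lock is free, so alice gets it
second = acquire_lock(1, "bob")     # already held by alice
released = release_lock(1, "alice")
third = acquire_lock(1, "bob")      # free again after the release
print(first, second, released, third)
```

A real system would probably also record when the lock was taken so that abandoned locks can expire.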
For point 4, I would suggest you explore Apache Solr, which can be integrated easily with an RDBMS, can index the data, and is typically much faster at full-text search than SQL queries.
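For a rough idea of what querying Solr involves, the sketch below builds a request URL for Solr's standard `select` handler, using `q` for the keyword and `fq` for filter criteria. The host, the core name `products`, the field names and the `solr_select_url` helper are assumptions for illustration; the example only constructs the URL and does not contact a server.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, core, keyword, filters=None):
    """Build a Solr select URL: `q` carries the keyword, each `fq` a filter."""
    params = [("q", keyword)]
    for field, value in (filters or {}).items():
        # Simple field:value filters; values with spaces would need quoting.
        params.append(("fq", f"{field}:{value}"))
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


url = solr_select_url(
    "http://localhost:8983",  # assumed local Solr instance
    "products",               # hypothetical core name
    "waterproof",
    {"subcategory": "cables"},
)
print(url)
```

The keyword search covers whatever fields the Solr schema indexes, so the relational database stays the source of truth and Solr acts as a search index over it.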