I am developing a classifieds website similar to Quickr.com.
The main problem is that each category requires a different set of properties. For example, for a mobile phone the attributes might be Manufacturer, Operating System, Is Touch Screen, Is 3G Enabled, etc., whereas for an apartment the attributes are Number of Bedrooms, Is Furnished, Which Floor, Total Area, etc. Since the attributes and their number vary for each category, I am keeping the attributes and their values in separate tables.
My current database structure is:
Table classifieds_ads
This table stores all the ads. One record per ad.
ad_id
ad_title
ad_desc
ad_created_on
cat_id
Sample data
------------------------------------------------------------------------------------------------
| ad_id | ad_title    | ad_desc                                       | ad_created_on | cat_id |
------------------------------------------------------------------------------------------------
| 1     | Nokia Phone | Nokia n97 phone for sale. Excellent condition | <timestamp>   | 2      |
------------------------------------------------------------------------------------------------
Table classifieds_cat
This table stores all the available categories. cat_id in the classifieds_ads table relates to cat_id in this table.
cat_id
category
parent_cid
Sample data
-------------------------------------------
| cat_id | category          | parent_cid |
-------------------------------------------
| 1      | Electronics       | NULL       |
| 2      | Mobile Phone      | 1          |
| 3      | Apartments        | NULL       |
| 4      | Apartments - Sale | 3          |
-------------------------------------------
Table classifieds_attribute
This table contains all the available attributes for a particular category. Relates to classifieds_cat table.
attr_id
cat_id
input_type
attr_label
attr_name
Sample data
----------------------------------------------------------
| attr_id | cat_id | attr_label       | attr_name        |
----------------------------------------------------------
| 1       | 2      | Operating System | Operating_System |
| 2       | 2      | Is Touch Screen  | Touch_Screen     |
| 3       | 2      | Manufacturer     | Manufacturer     |
| 4       | 3      | Bedrooms         | Bedrooms         |
| 5       | 3      | Total Area       | Area             |
| 6       | 3      | Posted By        | Posted_By        |
----------------------------------------------------------
Table classifieds_attr_value
This table stores the attribute value for each ad in classifieds_ads table.
attr_val_id
attr_id
ad_id
attr_val
Sample data
----------------------------------------------
| attr_val_id | attr_id | ad_id | attr_val   |
----------------------------------------------
| 1           | 1       | 1     | Symbian OS |
| 2           | 2       | 1     | 1          |
| 3           | 3       | 1     | Nokia      |
----------------------------------------------
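For concreteness, the four tables above could be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module in place of MySQL so it can run anywhere, the column types and constraints are assumptions (the description above gives only column names), and the elided timestamp and the input_type values are left as NULL because the sample data does not show them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE classifieds_cat (
    cat_id     INTEGER PRIMARY KEY,
    category   TEXT NOT NULL,
    parent_cid INTEGER REFERENCES classifieds_cat(cat_id)
);
CREATE TABLE classifieds_ads (
    ad_id         INTEGER PRIMARY KEY,
    ad_title      TEXT NOT NULL,
    ad_desc       TEXT,
    ad_created_on TEXT,   -- timestamp is elided in the sample data
    cat_id        INTEGER NOT NULL REFERENCES classifieds_cat(cat_id)
);
CREATE TABLE classifieds_attribute (
    attr_id    INTEGER PRIMARY KEY,
    cat_id     INTEGER NOT NULL REFERENCES classifieds_cat(cat_id),
    input_type TEXT,      -- not shown in the sample data
    attr_label TEXT NOT NULL,
    attr_name  TEXT NOT NULL
);
CREATE TABLE classifieds_attr_value (
    attr_val_id INTEGER PRIMARY KEY,
    attr_id     INTEGER NOT NULL REFERENCES classifieds_attribute(attr_id),
    ad_id       INTEGER NOT NULL REFERENCES classifieds_ads(ad_id),
    attr_val    TEXT
);
INSERT INTO classifieds_cat VALUES
    (1, 'Electronics', NULL), (2, 'Mobile Phone', 1),
    (3, 'Apartments', NULL),  (4, 'Apartments - Sale', 3);
INSERT INTO classifieds_ads VALUES
    (1, 'Nokia Phone', 'Nokia n97 phone for sale. Excellent condition', NULL, 2);
INSERT INTO classifieds_attribute (attr_id, cat_id, attr_label, attr_name) VALUES
    (1, 2, 'Operating System', 'Operating_System'),
    (2, 2, 'Is Touch Screen', 'Touch_Screen'),
    (3, 2, 'Manufacturer', 'Manufacturer');
INSERT INTO classifieds_attr_value VALUES
    (1, 1, 1, 'Symbian OS'), (2, 2, 1, '1'), (3, 3, 1, 'Nokia');
""")


def ad_attributes(conn, ad_id):
    """Return {attr_name: attr_val} for one ad by joining values to their definitions."""
    rows = conn.execute(
        """
        SELECT a.attr_name, v.attr_val
        FROM classifieds_attr_value v
        JOIN classifieds_attribute a ON a.attr_id = v.attr_id
        WHERE v.ad_id = ?
        ORDER BY a.attr_id
        """,
        (ad_id,),
    ).fetchall()
    return dict(rows)


attrs = ad_attributes(conn, 1)
print(attrs)
```

Note that every value comes back as text, because attr_val is a single column shared by all attribute types.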
========
- Is this design okay?
- Is it possible to index this data with Solr?
- How can I perform a faceted search on this data?
- Does MySQL support field collapsing like Solr?
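To make the faceting question concrete: against these tables, a facet count amounts to counting ads per attribute value within a category. The sketch below shows that idea with plain SQL grouping, run through Python's sqlite3 module rather than MySQL or Solr. It is not Solr faceting, the trimmed-down table definitions are assumptions, and ads 2 and 3 are hypothetical rows added only so the counts are not all 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE classifieds_ads (ad_id INTEGER PRIMARY KEY, ad_title TEXT, cat_id INTEGER);
CREATE TABLE classifieds_attribute (
    attr_id INTEGER PRIMARY KEY, cat_id INTEGER, attr_label TEXT, attr_name TEXT);
CREATE TABLE classifieds_attr_value (
    attr_val_id INTEGER PRIMARY KEY, attr_id INTEGER, ad_id INTEGER, attr_val TEXT);

-- Ad 1 is from the sample data; ads 2 and 3 are hypothetical.
INSERT INTO classifieds_ads VALUES
    (1, 'Nokia Phone', 2),
    (2, 'Second phone (hypothetical)', 2),
    (3, 'Third phone (hypothetical)', 2);
INSERT INTO classifieds_attribute VALUES
    (1, 2, 'Operating System', 'Operating_System'),
    (3, 2, 'Manufacturer', 'Manufacturer');
INSERT INTO classifieds_attr_value (attr_id, ad_id, attr_val) VALUES
    (1, 1, 'Symbian OS'), (3, 1, 'Nokia'),
    (1, 2, 'Symbian OS'), (3, 2, 'Nokia'),
    (1, 3, 'Android'),    (3, 3, 'Samsung');
""")


def facet_counts(conn, cat_id):
    """Count ads per (attribute, value) pair within one category."""
    rows = conn.execute(
        """
        SELECT a.attr_name, v.attr_val, COUNT(DISTINCT v.ad_id) AS n
        FROM classifieds_ads ad
        JOIN classifieds_attr_value v ON v.ad_id = ad.ad_id
        JOIN classifieds_attribute a ON a.attr_id = v.attr_id
        WHERE ad.cat_id = ?
        GROUP BY a.attr_name, v.attr_val
        ORDER BY a.attr_name, n DESC, v.attr_val
        """,
        (cat_id,),
    ).fetchall()
    facets = {}
    for attr_name, attr_val, n in rows:
        facets.setdefault(attr_name, {})[attr_val] = n
    return facets


facets = facet_counts(conn, 2)
print(facets)
```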
Your design is fine, although I question why you are using hierarchical categories. I understand that you want to organize categories from an end-user standpoint. The hierarchy helps them drill down to the category that they are looking for. However, your schema allows for attribute values at every level. I would suggest that you only need (or possibly want) attributes at the leaf level.
You could certainly come up with attributes that apply at higher levels, but this would drastically complicate your data management: you’d have to spend a lot of time deciding exactly how high up the chain each attribute belongs, and whether some lower level should be an exception to the parent's rule.
It also certainly overcomplicates your retrieval, which is part of the reason for your question, I think.
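To illustrate the retrieval cost: in your sample data the apartment attributes hang off category 3 (Apartments), so finding the attributes for an ad in category 4 (Apartments - Sale) means walking up the parent chain. A rough sketch of that walk, using a recursive CTE through Python's sqlite3 module instead of MySQL, could look like this. The column types are assumptions, and a MySQL version without recursive CTE support would need a loop in application code instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE classifieds_cat (
    cat_id INTEGER PRIMARY KEY, category TEXT, parent_cid INTEGER);
CREATE TABLE classifieds_attribute (
    attr_id INTEGER PRIMARY KEY, cat_id INTEGER, attr_label TEXT, attr_name TEXT);
INSERT INTO classifieds_cat VALUES
    (1, 'Electronics', NULL), (2, 'Mobile Phone', 1),
    (3, 'Apartments', NULL),  (4, 'Apartments - Sale', 3);
INSERT INTO classifieds_attribute VALUES
    (1, 2, 'Operating System', 'Operating_System'),
    (2, 2, 'Is Touch Screen', 'Touch_Screen'),
    (3, 2, 'Manufacturer', 'Manufacturer'),
    (4, 3, 'Bedrooms', 'Bedrooms'),
    (5, 3, 'Total Area', 'Area'),
    (6, 3, 'Posted By', 'Posted_By');
""")


def attributes_for_category(conn, cat_id):
    """Collect attr_names defined on a category or on any of its ancestors."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(cat_id, parent_cid) AS (
            -- start at the requested category
            SELECT cat_id, parent_cid FROM classifieds_cat WHERE cat_id = ?
            UNION ALL
            -- then climb to each parent in turn
            SELECT c.cat_id, c.parent_cid
            FROM classifieds_cat c
            JOIN chain ON c.cat_id = chain.parent_cid
        )
        SELECT a.attr_name
        FROM classifieds_attribute a
        JOIN chain ON chain.cat_id = a.cat_id
        ORDER BY a.attr_id
        """,
        (cat_id,),
    ).fetchall()
    return [name for (name,) in rows]


sale_attrs = attributes_for_category(conn, 4)
phone_attrs = attributes_for_category(conn, 2)
print(sale_attrs, phone_attrs)
```

Every attribute lookup pays for this walk, and the query says nothing yet about a child overriding or excluding a parent's attribute.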
I would suggest creating an additional table that will be used to manage the hierarchy of categories above the leaf level. It would look exactly like your classifieds_cat table, except that its involuted (self-referencing) relationship will obviously be to the new table itself. Then classifieds_cat.parent_cid becomes an FK to the new table rather than an involuted FK to classifieds_cat.
I think this schema change will reduce your application and data management complexity.
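As a rough sketch of what I mean, again using Python's sqlite3 module rather than MySQL: the table name classifieds_cat_group and its columns are hypothetical, the column types are assumptions, and I have moved the apartment attributes from Apartments down to the leaf Apartments - Sale to match the leaf-only rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
PRAGMA foreign_keys = ON;

-- Hypothetical new table: the hierarchy above the leaf level.
CREATE TABLE classifieds_cat_group (
    group_id   INTEGER PRIMARY KEY,
    group_name TEXT NOT NULL,
    parent_gid INTEGER REFERENCES classifieds_cat_group(group_id)
);
-- classifieds_cat now holds leaf categories only;
-- parent_cid points at the group table instead of at itself.
CREATE TABLE classifieds_cat (
    cat_id     INTEGER PRIMARY KEY,
    category   TEXT NOT NULL,
    parent_cid INTEGER REFERENCES classifieds_cat_group(group_id)
);
CREATE TABLE classifieds_attribute (
    attr_id    INTEGER PRIMARY KEY,
    cat_id     INTEGER NOT NULL REFERENCES classifieds_cat(cat_id),
    attr_label TEXT NOT NULL,
    attr_name  TEXT NOT NULL
);

INSERT INTO classifieds_cat_group VALUES
    (1, 'Electronics', NULL), (3, 'Apartments', NULL);
INSERT INTO classifieds_cat VALUES
    (2, 'Mobile Phone', 1), (4, 'Apartments - Sale', 3);
INSERT INTO classifieds_attribute (cat_id, attr_label, attr_name) VALUES
    (2, 'Operating System', 'Operating_System'),
    (2, 'Is Touch Screen', 'Touch_Screen'),
    (2, 'Manufacturer', 'Manufacturer'),
    (4, 'Bedrooms', 'Bedrooms'),
    (4, 'Total Area', 'Area'),
    (4, 'Posted By', 'Posted_By');
""")


def leaf_attributes(conn, cat_id):
    """Attributes for a leaf category: one equality lookup, no hierarchy walk."""
    rows = conn.execute(
        "SELECT attr_name FROM classifieds_attribute WHERE cat_id = ? ORDER BY attr_id",
        (cat_id,),
    ).fetchall()
    return [name for (name,) in rows]


# Attaching an attribute to a group id (3 = Apartments) violates the FK,
# because 3 is no longer a row in classifieds_cat.
try:
    conn.execute(
        "INSERT INTO classifieds_attribute (cat_id, attr_label, attr_name) "
        "VALUES (3, 'Bedrooms', 'Bedrooms')"
    )
    group_attr_rejected = False
except sqlite3.IntegrityError:
    group_attr_rejected = True

print(leaf_attributes(conn, 4), group_attr_rejected)
```

With this split, the database itself enforces that attributes live only on leaf categories, and the drill-down hierarchy can change without touching attribute data.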