I’m creating a Facebook web application which functions in a similar way to a dating website, by getting users to provide information about themselves and information about their preferences in a matching user.
I am creating the database for this and have the following design in mind:
- Members Table: Contains the information about the users using their FB ID as the primary key.
- Preferences Table: Contains the information about the preferences that the user wants.
There are going to be approximately 20 fields which the user can specify a preference in but all of them will be optional. I am unsure about the best way to structure my “preferences” table and I currently have two ideas:
Solution 1: Use the Facebook ID as a foreign key and add a separate column for each field that can be matched on. One problem I can see is that the table will contain a lot of “null” values for fields where no preference has been specified.
- Do “null” values in a database occupy space or cause any other problems?
Solution 2: Use a foreign key of facebook ID again, but in the next two columns use a key-value pair approach – so one column would contain an ID for a user preference and the other would contain a value for it. For each user preference I would have a record with structure: “user ID” – “preference ID” – “value”.
- The problem with this is that the type of the value in the “value” column will depend on the contents of the “preference ID” column.
My question:
- Which approach is better?
- Is there a standard schema solution to this sort of web application?
As you’ve noticed, either way you are looking at a compromise. Which way you go depends on what your production data is really going to look like.
Null values in sparse tables do take up a little space, but not much, as long as your columns use variable-length data types. Ten null varchars take almost no room. In engines that always reserve the full width for fixed-width columns, however, ten null ints are just as long as ten non-null ints.
If you add a third table “PreferableThings” that is keyed by the “preference ID” in your second solution, then what you have is not technically a key-value pair or EAV design, which most people shun. The difficulty, as you’ve noted, is that preferences with different data types have to be stored in a common encoded form (usually varchar). This solves the issue with sparse tables, but it forces you to write some application logic to decode values from the common data type to the proper native data type. You can store the rule for doing this in the “PreferableThings” table.
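To make the three-table layout concrete, here is a minimal sketch in Python using SQLite. The column names (`fb_id`, `value_type`, and so on), the type tags, and the `DECODERS` mapping are all assumptions made for illustration; only the table names follow the discussion above. It stores every value as text and uses the type tag on “PreferableThings” as the decoding rule.

```python
import sqlite3

# Hypothetical decoding rules, keyed by the type tag stored in PreferableThings.
DECODERS = {
    "int": int,
    "float": float,
    "bool": lambda s: s == "1",
    "text": str,
}

def make_db():
    """Create an in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Members (
            fb_id INTEGER PRIMARY KEY,
            name  TEXT NOT NULL
        );
        CREATE TABLE PreferableThings (
            preference_id INTEGER PRIMARY KEY,
            name          TEXT NOT NULL UNIQUE,
            value_type    TEXT NOT NULL   -- the decoding rule
        );
        CREATE TABLE Preferences (
            fb_id         INTEGER NOT NULL REFERENCES Members(fb_id),
            preference_id INTEGER NOT NULL
                          REFERENCES PreferableThings(preference_id),
            value         TEXT NOT NULL,  -- common encoded form
            PRIMARY KEY (fb_id, preference_id)
        );
    """)
    return conn

def load_preferences(conn, fb_id):
    """Return one member's preferences decoded to native Python types."""
    rows = conn.execute(
        """SELECT t.name, t.value_type, p.value
           FROM Preferences p
           JOIN PreferableThings t USING (preference_id)
           WHERE p.fb_id = ?""",
        (fb_id,),
    )
    return {name: DECODERS[vtype](raw) for name, vtype, raw in rows}

conn = make_db()
conn.execute("INSERT INTO Members VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO PreferableThings VALUES (?, ?, ?)",
    [(1, "min_age", "int"), (2, "smoker", "bool"), (3, "city", "text")],
)
# Alice only specifies two of the three preferences; no row means no preference.
conn.executemany(
    "INSERT INTO Preferences VALUES (?, ?, ?)",
    [(1, 1, "25"), (1, 3, "Leeds")],
)
prefs = load_preferences(conn, 1)
```

An unspecified preference is simply an absent row, so there are no nulls to store, and `prefs` comes back with `min_age` as an integer rather than a string.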
Another advantage of the second approach, however, is that you can table-drive the addition of new preference options. With solution 1 you would need a schema change.
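The difference can be sketched as follows, again in Python with SQLite. The `WidePreferences` table, the column names, and the `max_distance_km` option are hypothetical and exist only for this example. Adding an option under solution 2 is an ordinary insert, while solution 1 needs an `ALTER TABLE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Solution 2: lookup table describing the available preferences.
    CREATE TABLE PreferableThings (
        preference_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL UNIQUE,
        value_type    TEXT NOT NULL
    );
    -- Solution 1: one column per preference (hypothetical table name).
    CREATE TABLE WidePreferences (
        fb_id   INTEGER PRIMARY KEY,
        min_age INTEGER,
        city    TEXT
    );
""")

def columns(conn, table):
    """List a table's column names using SQLite's table_info pragma."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

lookup_cols_before = columns(conn, "PreferableThings")

# Solution 2: the new option is just data, so the schema is untouched.
conn.execute(
    "INSERT INTO PreferableThings (name, value_type) VALUES (?, ?)",
    ("max_distance_km", "int"),
)

# Solution 1: the same new option requires a schema change.
conn.execute("ALTER TABLE WidePreferences ADD COLUMN max_distance_km INTEGER")

lookup_cols_after = columns(conn, "PreferableThings")
wide_cols_after = columns(conn, "WidePreferences")
```

In practice the insert could come from an admin screen, whereas the `ALTER TABLE` usually means a migration and a deployment.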